Frontiers in Emerging Artificial Intelligence and Machine Learning


LLM-Driven Voice Agents That Collaboratively Talk to Each Other: Towards Vocal Multi-Agent Systems

Authors

Navneet Dalipkumar Magotra

DOI:

https://doi.org/10.64917/feaiml/Volume02Issue12-05

Keywords:

Multi-Agent Systems, Voice Assistants, Large Language Models, Transparency, Explainability, Human-AI Collaboration

Abstract

Voice assistants such as Alexa, Google Assistant, and Siri have become increasingly sophisticated, integrating Large Language Models (LLMs) to deliver personalized responses. However, these platforms are constrained by single-agent paradigms that limit collaboration, transparency, and complex problem-solving [1,2]. This paper proposes a novel LLM-driven multi-agent voice architecture in which multiple specialized voice-powered agents converse with one another, vocally and intelligibly, to collaboratively resolve user requests. Drawing inspiration from cooperative multi-agent systems [3,4] and human-like conversational transparency [5,6], we demonstrate a prototype in a smart kitchen environment involving culinary, nutrition, and inventory agents. Evaluation suggests improvements in explainability, task success, and user trust, though challenges remain in orchestration, privacy, and cognitive load. This research introduces vocal multi-agent systems as a new frontier in interactive AI, advancing beyond single-agent frameworks towards explainable, collaborative, and socially intelligent voice ecosystems.
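The architecture the abstract describes, specialized agents resolving a request by exchanging intelligible turns on a shared transcript, can be sketched minimally as follows. This is not the authors' implementation: agent names, policies, and the turn-passing scheme are illustrative assumptions, and each `policy` callable stands in for a full LLM plus speech pipeline.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Turn:
    speaker: str
    text: str

class VoiceAgent:
    """Stand-in for an LLM-backed voice agent; `policy` replaces the model call."""
    def __init__(self, name: str, policy: Callable[[List[Turn]], str]):
        self.name = name
        self.policy = policy

    def speak(self, transcript: List[Turn]) -> Turn:
        # Every turn is appended to a shared transcript, so the exchange
        # stays audible and intelligible to the user and the other agents.
        turn = Turn(self.name, self.policy(transcript))
        transcript.append(turn)
        return turn

# Illustrative smart-kitchen agents with fixed responses (hypothetical).
inventory = VoiceAgent("inventory", lambda t: "We have pasta, tomatoes, and basil.")
culinary  = VoiceAgent("culinary",  lambda t: "Then I suggest a tomato-basil pasta.")
nutrition = VoiceAgent("nutrition", lambda t: "That dish fits tonight's calorie goal.")

transcript: List[Turn] = [Turn("user", "What should I cook for dinner?")]
for agent in (inventory, culinary, nutrition):
    agent.speak(transcript)

for turn in transcript:
    print(f"{turn.speaker}: {turn.text}")
```

In a real deployment each policy would call an LLM and route its output through text-to-speech, and an orchestrator would decide speaking order rather than the fixed loop used here.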

References

[1] Hoy, M.B., 2018. Alexa, Siri, Cortana, and more: An introduction to voice assistants. Medical Reference Services Quarterly, 37(1), pp.81–88.

[2] Luger, E. and Sellen, A., 2016. Like having a really bad PA: The gulf between user expectation and experience of conversational agents. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (pp. 5286–5297). ACM.

[3] Wooldridge, M., 2009. An Introduction to MultiAgent Systems. John Wiley & Sons.

[4] Jennings, N.R., Sycara, K. and Wooldridge, M., 1998. A roadmap of agent research and development. Autonomous Agents and Multi-Agent Systems, 1(1), pp.7–38.

[5] Doshi-Velez, F. and Kim, B., 2017. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.

[6] Miller, T., 2019. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267, pp.1–38.

[7] Traum, D.R. and Rickel, J., 2002. Embodied agents for multi-party dialogue in immersive virtual worlds. In Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems (pp. 766–773). ACM.

[8] Bohus, D. and Horvitz, E., 2011. Multiparty turn taking in situated dialog: Study, lessons, and directions. In Proceedings of the SIGDIAL 2011 Conference (pp. 98–109). ACL.

[9] Skantze, G., 2017. Towards a general, continuous model of turn-taking in spoken dialogue using LSTM recurrent neural networks. In Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue (pp. 220–230). ACL.

[10] Clark, H.H. and Brennan, S.E., 1991. Grounding in communication. In Resnick, L.B., Levine, J.M. and Teasley, S.D. (Eds.), Perspectives on Socially Shared Cognition (pp. 127–149). American Psychological Association.

Published

2025-12-17

How to Cite

Navneet Dalipkumar Magotra. (2025). LLM-Driven Voice Agents That Collaboratively Talk to Each Other: Towards Vocal Multi-Agent Systems. Frontiers in Emerging Artificial Intelligence and Machine Learning, 2(12), 45–54. https://doi.org/10.64917/feaiml/Volume02Issue12-05