
The Promise of Multi-Agent AI


05.24.2024 | By: Joanne Chen


Agents have been a cornerstone of human-computer interaction for decades, from the friendly Clippy of Microsoft Office fame to auto-suggestions in Google Docs and NPCs in video games. While these early agents hinted at the potential for personalized, goal-oriented interactions, they were limited in their ability to handle higher-level tasks. It’s only with the recent advent of LLMs that the true potential of agents has begun to be realized.

As LLM-powered agents have moved from research experiments into production, they’ve enabled increasingly sophisticated applications for both consumers and enterprises. But even the most advanced standalone agents still struggle with multi-step tasks that require navigating different contexts and managing dependencies.

This is where multi-agent systems come in. By breaking down complex problems into discrete subtasks that are handled by specialized agents, these systems offer a modular, flexible, and resilient approach to automating tasks that were previously considered beyond software’s reach. Leading multi-agent frameworks like Microsoft’s open-source AutoGen are currently powering a wide range of academic and enterprise use cases, including synthetic data generation, code generation, and pharmaceutical data science.

To better understand multi-agent systems—both their potential and their present-day limitations—I spoke with Chi Wang, a principal researcher at Microsoft and the creator of AutoGen. In this post, I’ll share some of my key learnings from our conversation.

ICYMI: This essay is a part of my new series, AI in the Real World, where I have in-depth conversations with leading AI researchers about how state-of-the-art AI is being applied in enterprises. Check out our previous conversations here.

Why multiple agents are often better than one

Building reliable standalone AI agents is an open challenge. So why introduce more agents into the equation?

To answer this question, it helps to return to the origins of multi-agent cognition in Marvin Minsky's classic 1986 book, The Society of Mind. Minsky proposed that human cognition arises from the interaction of numerous simple "agents": small, specialized processes, each designed to perform a particular function, such as recognizing a shape or processing an emotion. He posited that by combining these agents into networks, or "societies," intelligent behavior could emerge, a phenomenon he termed the "Society of Mind." Minsky's key insight was that thousands of modular minds working in concert could outperform a single monolithic mind.

Today’s multi-agent systems, with their abilities to learn, adapt, and coordinate, are the direct descendants of Minsky’s vision. By training groups of agents to collaborate and compete in pursuit of shared goals, developers can create systems that dramatically exceed the capabilities of any single agent: the same “1 + 1 = 3” effect that Minsky saw as central to human cognition.

As Chi explains, multi-agent systems offer three main benefits:

1. Modularity

Distributing complex tasks across specialized agents makes the overall system more modular. This modularity simplifies development, testing, and maintenance, as capabilities can be added or tweaked without revamping the entire system. Troubleshooting is also streamlined, as issues can often be isolated to individual agents.

2. Specialization

Think of multi-agent systems as teams of experts, each contributing unique knowledge and abilities to collectively tackle difficult problems. Tasks are broken down into components and assigned to the agent best equipped to handle them. As each agent processes its part of the task and passes information to the next, the output is progressively refined and improved. Through such specialization, the resulting systems can achieve results that generalist agents struggle to match.

This approach is conceptually similar to techniques like prompt chaining, where a human user breaks down an intricate task into a series of subtasks and iterates toward a desired outcome through conversation with the model.

Chi offers the example of a multi-agent system tasked with analyzing data and providing insights and recommendations. In this scenario, each agent focuses on a different aspect of the task: some specialize in data retrieval and presentation, others in deep analysis and insight generation, and others in planning and decision-making. This division of labor allows each agent to work on what it does best, leading to faster, more accurate outcomes.
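This division of labor can be sketched in a few lines of plain Python. The agent names and logic below are hypothetical stand-ins; in a real system each stage would be an LLM-backed agent (for example, built with a framework like AutoGen) rather than a hard-coded function.

```python
# A minimal sketch of a specialized-agent pipeline: each "agent" is a
# function that refines a shared task state and hands it to the next.
# Agent names and logic are illustrative only.

def retrieval_agent(state):
    # Fetch the raw data (hard-coded here; a real agent would query a store).
    state["data"] = [120, 135, 128, 150, 149, 162]
    return state

def analysis_agent(state):
    # Derive a simple insight from the retrieved data.
    data = state["data"]
    state["trend"] = "rising" if data[-1] > data[0] else "falling"
    state["average"] = sum(data) / len(data)
    return state

def planning_agent(state):
    # Turn the insight into a recommendation.
    state["recommendation"] = (
        "increase inventory" if state["trend"] == "rising" else "hold inventory"
    )
    return state

def run_pipeline(agents, state=None):
    state = state or {}
    for agent in agents:
        state = agent(state)  # each specialist refines the shared state
    return state

result = run_pipeline([retrieval_agent, analysis_agent, planning_agent])
print(result["trend"], result["recommendation"])  # → rising increase inventory
```

The key design point is that each agent touches only the part of the state it is equipped to handle, so any stage can be swapped out or improved without rewriting the others.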

3. Collaborative learning

In multi-agent systems, the interactions among individual agents can give rise to solutions that exceed what any single agent could achieve in isolation. By allowing agents to work together, critique one another, and share their insights, the system can develop a more comprehensive understanding of the problem at hand. This is especially valuable when dealing with complex, multifaceted issues that no single agent has the breadth of knowledge or skills to fully address.

The beauty of collaborative learning lies in its ability to generate creative solutions that might elude a more homogeneous system. As agents converse and build on each other’s ideas, they can explore a wider range of possibilities and uncover approaches that individual agents might overlook. These synergies are the key to unlocking the full potential of multi-agent systems. As inference techniques improve, such inter-agent exchanges will only become faster and more efficient. 

To illustrate this concept, Chi describes a multi-agent framework with one GPT-4 agent and several GPT-3.5 agents. In this setup, the GPT-4 agent serves as an expert “teacher” or “mentor” to the GPT-3.5 “students.” By engaging with their more advanced peer, the GPT-3.5 agents can quickly master specific tasks without the need for extensive individual training. As each agent improves through this collaborative learning process, the system’s overall capabilities grow.
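The teacher/student pattern Chi describes can be sketched as follows. The agents here are simple rule stubs rather than real LLMs, and the class names are invented for illustration; the point is only the flow of knowledge from a stronger agent to weaker peers.

```python
# Toy sketch of collaborative learning: a stronger "teacher" agent
# reviews weaker "students'" answers and shares corrections, so every
# student improves without separate individual training.

class StudentAgent:
    def __init__(self, name):
        self.name = name
        self.knowledge = {}  # facts learned from the teacher

    def answer(self, question):
        return self.knowledge.get(question, "I don't know")

    def learn(self, question, correction):
        self.knowledge[question] = correction

class TeacherAgent:
    def __init__(self, reference):
        self.reference = reference  # the teacher's superior knowledge

    def review(self, question, answer):
        correct = self.reference[question]
        return None if answer == correct else correct

teacher = TeacherAgent({"capital of France": "Paris"})
students = [StudentAgent("s1"), StudentAgent("s2")]

question = "capital of France"
for student in students:
    correction = teacher.review(question, student.answer(question))
    if correction is not None:
        student.learn(question, correction)  # knowledge propagates

print([s.answer(question) for s in students])  # → ['Paris', 'Paris']
```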

Best practices for building multi-agent systems

How can builders best design applications using multi-agent systems? Chi shares some helpful insights.

1. Match architecture to problem

Choosing the right architecture is critical, as multi-agent systems introduce myriad complexities around coordination, consistency, and coherence that single-agent setups avoid. For straightforward, narrowly defined tasks, a lone agent may be the simpler, more efficient choice. Factors such as response speed, decision-making frequency, inter-agent communication needs, latency, and bandwidth all influence the decision between single and multi-agent architectures.

2. Start simple and iterate

Start simple, then scale. By deploying one or two agents initially and incrementally scaling up, developers can validate the core design and interaction patterns before introducing additional complexity. This approach also streamlines debugging and optimization, as issues can be more easily traced back to individual agents.

3. Define clear roles and responsibilities

In multi-agent systems, specialization breeds strength. Developers should adopt a divide-and-conquer approach, allowing each agent to focus on its area of expertise. This goes beyond simple prompt engineering: agents can be equipped with task-specific resources and tools, such as access to databases and specialized software, along with clearly defined rules and constraints that guide them toward desired outcomes. Effective design involves mapping out the subtasks required to achieve the overall objective, understanding their interdependencies, and assigning agents accordingly based on their specialties and the system’s evolving needs.
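One lightweight way to encode roles, tools, and constraints is shown below. The tool names and permission rule are hypothetical; real systems would attach databases, APIs, or code executors instead of toy functions.

```python
# Illustrative sketch of role definition: each agent gets a role, a
# whitelist of tools it may call, and a constraint check before acting.

def query_database(arg):
    return f"rows matching {arg}"

def send_report(arg):
    return f"report sent: {arg}"

TOOLS = {"query_database": query_database, "send_report": send_report}

class RoleBoundAgent:
    def __init__(self, role, allowed_tools):
        self.role = role
        self.allowed_tools = set(allowed_tools)

    def act(self, tool_name, arg):
        # Constraint: an agent may only use tools assigned to its role.
        if tool_name not in self.allowed_tools:
            raise PermissionError(f"{self.role} may not call {tool_name}")
        return TOOLS[tool_name](arg)

analyst = RoleBoundAgent("analyst", ["query_database"])
print(analyst.act("query_database", "Q3 sales"))  # → rows matching Q3 sales
```

Guardrails like the `PermissionError` check are what turn a loose collection of prompts into a system with clearly defined responsibilities.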

4. Enable flexible inter-agent communication

Seamless communication between agents is crucial, and both static and dynamic topologies have their merits. In static setups, the communication channels linking agents are predefined and unchanging. This approach prioritizes simplicity and predictability, making the system easier to understand, analyze, and debug.

Dynamic topologies, by contrast, allow agents to create and modify communication links on the fly, thus enabling them to adapt to shifting circumstances and requirements. Imagine a disaster response scenario where agents represent different emergency services. Within a dynamic topology, these agents can fluidly connect and coordinate based on real-time data like incident locations and resource needs. This adaptability enables the system to mount a more effective and targeted response to evolving crisis conditions—yet it also makes analyzing and overseeing the system more difficult.
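The contrast between the two topologies can be made concrete with a simple adjacency map. The emergency-service names below echo the scenario above and are purely illustrative.

```python
# Sketch contrasting static and dynamic communication topologies.
# Static: links are fixed up front. Dynamic: agents add links at
# runtime as conditions change.

class Topology:
    def __init__(self, links=None, dynamic=False):
        self.links = {a: set(bs) for a, bs in (links or {}).items()}
        self.dynamic = dynamic

    def connect(self, a, b):
        if not self.dynamic:
            raise RuntimeError("static topology: links are fixed")
        self.links.setdefault(a, set()).add(b)

    def can_send(self, a, b):
        return b in self.links.get(a, set())

# Static: the fire service always reports to dispatch, nothing else.
static = Topology({"fire": {"dispatch"}})

# Dynamic: a new incident lets fire and medical coordinate directly.
dynamic = Topology({"fire": {"dispatch"}}, dynamic=True)
dynamic.connect("fire", "medical")

print(static.can_send("fire", "dispatch"))   # → True
print(dynamic.can_send("fire", "medical"))   # → True
```

The trade-off noted above shows up directly in code: the static graph can be inspected once and fully understood, while the dynamic graph must be monitored continuously because its shape depends on runtime events.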

5. Balance autonomy and control

Striking the right balance between agent autonomy and control is an ongoing challenge. Too little autonomy can result in a rigid, limited system, while too much autonomy may lead to unstable or unexpected behaviors. Adjustable autonomy, which allows for dynamic, context-dependent changes in the level of control exerted over agents, is an active area of research.

6. Design for human-agent interaction

Most multi-agent systems involve human users to some degree—which means that innovative interaction design is essential. Agents need effective mechanisms for conveying relevant information to human stakeholders, soliciting input and direction as needed, and modifying their behaviors in response to feedback.

A primary design consideration is whether to present the multi-agent system to users as a unified, monolithic entity or as a collection of distinct, interacting agents. In the former case, users might interact with the system through a single interface, regardless of the number and diversity of agents operating behind the scenes. In the latter, users would need to communicate with multiple agents individually, potentially using different interfaces and interaction patterns for each.

Emerging HCI paradigms are exploring a range of possibilities for human-agent collaboration. Some envision multi-agent systems as sophisticated but essentially passive tools for executing well-defined tasks under human direction. Others treat agents as proactive collaborators: dynamic, autonomous partners that can engage in creative problem-solving alongside their human users.

7. Continuously evaluate and improve

Because multi-agent systems are modular, their individual components can be isolated, evaluated, and optimized, allowing developers to continuously refine the system’s performance. To support this process, Chi encourages builders to implement mechanisms for monitoring agent performance, identifying issues, and iterating on system design. One approach is to use dedicated agents whose sole purpose is to evaluate and benchmark the performance of other agents in the system. These specialized agents can analyze operational data (like logs), extract relevant evaluation criteria, and automatically score the performance of other agents.
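A minimal version of such an evaluator agent might look like this. The log format is invented for illustration; in practice the evaluator would parse real operational traces.

```python
# Sketch of a dedicated evaluator agent: it scans operational logs and
# scores each worker agent by its success rate, so underperformers can
# be flagged and iterated on.

LOGS = [
    {"agent": "retriever", "ok": True},
    {"agent": "retriever", "ok": True},
    {"agent": "planner", "ok": True},
    {"agent": "planner", "ok": False},
]

def evaluator_agent(logs):
    scores = {}
    for entry in logs:
        total, wins = scores.get(entry["agent"], (0, 0))
        scores[entry["agent"]] = (total + 1, wins + entry["ok"])
    # Success rate per agent.
    return {name: wins / total for name, (total, wins) in scores.items()}

print(evaluator_agent(LOGS))  # → {'retriever': 1.0, 'planner': 0.5}
```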

8. Proactively identify and mitigate risks

Multi-agent systems present distinct safety and security challenges. The high degree of interdependence between agents means that failures or vulnerabilities in one part of the system can quickly cascade.

One common failure mode arises from conflicts between the “world models” of different agents: the core assumptions, beliefs, and representations that each agent relies on to understand its environment and objectives. If these world models fall out of sync, the system can become unstable as agents start to work at cross-purposes. A multi-agent retail forecasting system, for example, could be compromised if one agent assumes rising demand while another expects a decrease, leading to faulty inventory decisions.
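One mitigation is a consistency check that compares agents' key assumptions before they act, rather than letting conflicts surface as bad decisions. The sketch below uses the retail-forecasting example; the data structures are hypothetical.

```python
# Sketch of a world-model consistency check: compare each agent's key
# assumptions and flag conflicts instead of letting agents silently
# work at cross-purposes.

world_models = {
    "forecaster": {"demand_trend": "rising"},
    "inventory":  {"demand_trend": "falling"},
}

def find_conflicts(models):
    conflicts = []
    keys = set().union(*(m.keys() for m in models.values()))
    for key in sorted(keys):
        values = {name: m[key] for name, m in models.items() if key in m}
        if len(set(values.values())) > 1:
            conflicts.append((key, values))
    return conflicts

print(find_conflicts(world_models))
# → [('demand_trend', {'forecaster': 'rising', 'inventory': 'falling'})]
```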

The distributed structure of multi-agent systems also expands the attack surface for malicious actors. Each agent is a potential point of entry that could be breached and exploited to manipulate the broader system, with the system's dense interconnections allowing an attack to propagate rapidly. A hacked agent could be used to feed false data to its peers, skewing their world models and triggering destructive feedback loops. Imagine a swarm of autonomous drones that is suddenly fed contradictory location data by a corrupted agent, causing them to collide in mid-air.

To defend against these threats, multi-agent systems need robust security measures at both the individual agent and network levels. Techniques like multi-factor authentication, end-to-end encryption, and hardware-based trusted execution environments can help harden agents against intrusion. Anomaly detection systems can be used to identify suspicious behavior patterns that might indicate an ongoing attack.

Where do we go from here?

Multi-agent systems hold immense promise for enabling more sophisticated, capable AI applications. As this field continues to advance, researchers are focusing on several key areas to more fully realize the potential of this exciting paradigm.

  • Advanced reasoning, planning, and problem-solving: By equipping agents with higher-level cognitive skills—such as the ability to break down multifaceted problems, explore novel solution spaces, and adapt to changing circumstances—we can expand the range and sophistication of tasks they can tackle. Techniques like chain-of-thought prompting and multi-agent debate are early efforts in this direction.
  • Multimodal interaction: As agents gain the ability to perceive, process, and generate content across multiple modalities, they’ll be able to collaborate in more natural, intuitive, and contextually aware ways. Projects like agent chat with DALLE and GPT-4V, built using AutoGen, are showcasing the potential of this approach.
  • Grounding agents in reality: For multi-agent systems to truly realize their potential, they need to be grounded in the real world rather than operating in isolation. By linking agents to physical tools and sensors, realistic virtual environments, and live data streams, we can anchor their intelligence in the tangible contexts where they’ll be deployed.
  • Automating agent orchestration: As multi-agent systems grow in size and sophistication, manually designing and tuning the roles and interaction patterns of individual agents will quickly become untenable. To address this challenge, researchers like Adam Fourney and the Microsoft Research AI Frontiers team are developing adaptive architectures and learning techniques that use LLMs to automatically configure and optimize agent-based systems. This work addresses the critical need for more robust orchestration methods as these systems become increasingly complex.
  • Safety and alignment: Perhaps the most critical consideration as multi-agent AI advances is ensuring that these increasingly powerful systems remain aligned with human values and priorities. Risks like reward hacking—where agents discover loopholes in their incentive structures—and goal misspecification—where agents pursue objectives at odds with their designers’ true intent—pose serious threats that will only grow larger as multi-agent architectures scale up in capability and complexity.

Fortunately, active research efforts are already making headway in this crucial domain. Techniques like multi-agent debate, which pits agents against each other to stress-test ideas and surface potential flaws, and recursive reward modeling, which refines agent objectives through iterative human feedback, are showing promising results.

Summing up

As researchers like Chi continue to push the boundaries of what’s possible with multi-agent AI, it’s clear that we’re only beginning to scratch the surface of what this technology can achieve. From automating complex tasks to tackling multifaceted problems that have long confounded traditional software approaches, the applications for multi-agent systems are vast.

Stay tuned for upcoming installments of “AI in the Real World,” where we’ll continue to explore the cutting edge of generative AI, including alternative model architectures, inference strategies, and software-hardware co-design.


Written by Foundation Capital
