Patterns for Building Production-Grade Agentic AI Using Multi-Agent Systems: Patterns 1–5

Ali Arsanjani
Jun 3, 2024

Multi-Agent Systems (MAS) are changing how businesses operate in several key ways. The first section examines those changes; the second introduces patterns 1 through 5 of the Pattern Language for Multi-Agent Systems that our team has been mining and applying.

We will explore the architectural decisions needed to design agentic AI systems. This first set of patterns covers the higher-level planning and strategy aspects of Agentic AI. Subsequent patterns will delve into the more technical and architectural aspects of designing and implementing Agentic AI and Multi-Agent Systems or, more precisely, Multi-Agent GenAI.

One of the first things to settle is the distinction between LLMs, MAS, agent-based systems, and Agentic AI systems. I will elaborate on the finer distinctions among these later, but at a high level, MAS and LLMs differ in several key aspects of their design, capabilities, and operational philosophies.

Agent-based systems typically involve a single LLM that acts as an agent: it performs multiple tasks, makes decisions, and interacts with its environment (sense, perceive, act) through ‘tools’, which play the role that actuators play for a robot. The LLM can be seen as a monolithic entity with various capabilities.
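To make the sense, perceive, act cycle concrete, here is a minimal sketch of a single agent that picks a tool for each observation. Everything in it is illustrative: `choose_tool` is a hypothetical stand-in for the LLM call, and the tool names and return strings are assumptions, not part of any real framework.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def choose_tool(observation: str) -> str:
    """Hypothetical stand-in for an LLM decision: map an observation to a tool name."""
    return "restock" if "low stock" in observation else "noop"


@dataclass
class Agent:
    # Tools are plain callables; a real agent would wrap APIs or services here.
    tools: Dict[str, Callable[[str], str]]
    log: List[str] = field(default_factory=list)

    def step(self, observation: str) -> str:
        tool_name = choose_tool(observation)         # perceive and decide
        result = self.tools[tool_name](observation)  # act through a tool
        self.log.append(f"{tool_name}: {result}")    # keep a trace of actions
        return result


agent = Agent(tools={
    "restock": lambda obs: "purchase order created",
    "noop": lambda obs: "nothing to do",
})
print(agent.step("low stock on item 42"))
```

In a multi-agent system, several such agents would each carry their own tools and decision logic and exchange messages rather than share one loop.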

Multi-agent systems evolve from agent-based systems by introducing multiple specialized LLMs. Each LLM acts as an independent agent with a specific role or expertise. These agents collaborate, communicate, and coordinate their actions to solve complex problems that a single agent cannot handle efficiently. They are characterized by:

  • Decentralized decision-making: MAS consist of multiple intelligent software agents working together in a dynamic environment. Instead of relying on a central decision-maker, agents collaborate and negotiate, which in a domain such as procurement leads to more flexible and robust strategies.
  • Adapting to complexity: Procurement, with its numerous suppliers and shifting market conditions, is an example of a domain well suited to the distributed problem-solving power of MAS.

Autonomy is the first distinctive characteristic, or rather, semi-autonomous operation based on a goal-directed behavioral specification. LLMs empower agents to ‘think’: when agents sense their environment, they can make decisions reactively or proactively, using their underlying LLM ‘brain’ in combination with reasoning, planning, and adaptiveness. They can accomplish tasks by collaborating with other agents (see the later patterns) and by reaching out to interact with the real world through their ‘tools.’

Specialization is another key characteristic. MAS are engineered with a focus on specific tasks and exhibit dynamic process optimization capabilities. They are tailored to excel in particular domains, be it complex systems control, supply chain management, or autonomous driving. Each agent within a MAS has a defined role and contributes to the overall goal through specialized functionality. In contrast, LLMs are built for versatility and offer broad coverage of language-related tasks. They can understand and generate human language, perform reasoning, and provide information across a wide range of topics, making them adaptable to diverse use cases.

Real-time adaptation is inherent in MAS, which continuously monitor and adjust their operations based on feedback and changing conditions. This enables them to react swiftly and maintain optimal performance, especially in dynamic environments. LLMs, on the other hand, typically operate based on static training data and may not inherently adapt their behavior in real time unless specifically designed for incremental learning or reinforcement learning paradigms.

The concept of collaboration and distribution sets MAS apart from LLMs. MAS employ structured collaboration and task distribution, where multiple agents work together, each contributing unique capabilities and sharing information to achieve a common goal. This distributed problem-solving approach allows MAS to tackle complex, large-scale challenges. Conversely, LLMs largely function as individual, monolithic systems, processing information in isolation without inherent mechanisms for structured collaboration or task distribution.

MAS also excel in proactive decision-making through the use of predictive analytics and machine learning techniques. By analyzing patterns and trends, MAS can anticipate potential issues or opportunities and take proactive measures. This capability enhances their responsiveness and adaptability, enabling them to stay ahead of potential disruptions. While LLMs can provide valuable insights, their responses are often based on historical data and patterns, resulting in reactive rather than proactive actions.

The differences between MAS and LLMs lie in their design philosophies and intended capabilities. MAS are purpose-built systems that emphasize specialization, autonomy, real-time adaptation, collaboration, and proactive decision-making. They are designed to optimize specific processes and exhibit dynamic behavior.

In contrast, LLMs offer general language understanding and generation capabilities, providing versatile solutions across various domains. While LLMs may not inherently possess the same adaptive and collaborative features as MAS, they serve as powerful tools for language-related tasks, leveraging vast amounts of data to deliver valuable insights and responses.

Image generated by Author

How Multi-Agent Systems Can Power Business Transformation

MAS can power business transformation and business outcomes through the following capabilities.

Enhanced Efficiency is one of the main gains businesses can achieve through Multi-Agent Systems (MAS). These systems can automate complex workflows by handling intricate processes that require coordination among multiple entities, thus reducing human error and increasing productivity. For example, in a manufacturing setting, MAS can manage production lines, schedule maintenance, and dynamically adjust operations to meet changing demands [1][2]. Additionally, MAS can optimize resource allocation by dynamically distributing resources based on real-time needs and constraints, leading to better asset utilization, reduced waste, and lowered operational costs. This is particularly beneficial in cloud computing environments where agents manage server workloads to ensure efficient resource use, maintaining performance and cost-effectiveness [3][4].

Data-Driven Decisions are enhanced by MAS through real-time data analysis and insights. Agents within MAS can continuously monitor and analyze data streams from various sources, providing businesses with up-to-the-minute insights for quick decision-making and responsiveness to changing conditions. For instance, in financial markets, agents can analyze market trends and execute trades based on current data [5][6]. Furthermore, MAS leverage predictive analytics to forecast future trends and outcomes, allowing businesses to take proactive measures. By anticipating potential issues or opportunities, companies can mitigate risks or capitalize on new prospects, such as in supply chain management where agents predict disruptions and adjust logistics plans to maintain smooth operations [7][8].

Personalized Customer Service is greatly enhanced through the deployment of Multi-Agent Systems (MAS). Intelligent agents within MAS can provide 24/7 customer support by handling inquiries, resolving issues, and guiding customers through various processes. This continuous support improves customer satisfaction and loyalty by offering immediate assistance. For instance, in e-commerce, agents can help customers find products, track orders, and provide support without human intervention [9][10]. Additionally, MAS can tailor interactions and recommendations by analyzing customer data, thus personalizing the customer experience. This approach increases engagement and sales by suggesting products based on a customer’s browsing history and preferences, making interactions more relevant and enjoyable [11][12].

Resilient Supply Chains benefit from the adaptability and optimization capabilities of MAS. These systems can make supply chains more resilient by enabling quick adaptation to disruptions. Agents can monitor supply chain activities, detect anomalies, and implement contingency plans to minimize the impact of disruptions, which is crucial during events like natural disasters or geopolitical issues [13]. Furthermore, MAS can optimize inventory levels and logistics operations by forecasting demand and coordinating deliveries. This reduces excess inventory costs and ensures timely deliveries, thereby enhancing overall supply chain efficiency. For example, in retail, agents can predict sales trends and adjust stock levels to meet customer demand without overstocking.

Collaborative Innovation is significantly accelerated by MAS through the harnessing of collective intelligence. These systems facilitate collaborative innovation by bringing together diverse expertise and perspectives. Agents can coordinate efforts, share knowledge, and contribute to problem-solving in a cohesive manner, thus accelerating the development of new ideas and solutions. In research and development, agents from different departments can collaborate on projects, pooling their knowledge to achieve breakthroughs more quickly. Additionally, by distributing tasks among multiple agents, MAS can tackle complex problems more efficiently. Each agent can focus on a specific aspect of the problem, working in parallel to find solutions faster than a single entity could. For instance, in healthcare, agents can analyze patient data, research medical literature, and suggest treatment options, speeding up diagnosis and care delivery.

Key Characteristics of Agents

To summarize, here are the key features that agents exhibit.

  1. Autonomy: Operates independently, makes decisions.
  2. Goal-directed Behavior: Ability to focus on the achievement of a goal through perception, reasoning and action.
  3. Social Ability: Communicates and interacts with others.
  4. Reactivity: Perceives environment, responds to changes.
  5. Proactivity: Takes initiative, exhibits goal-directed behavior.
  6. Adaptability: Learns from experience to modify behavior.
  7. Perception: Gathers information from the environment through sensors.
  8. Action: Performs actions that affect the environment using ‘tools.’
  9. Learning: Improves performance over time through experience.
  10. Reasoning: Processes information, makes decisions based on logic using a reasoning loop.
  11. Planning: Breaks down a complex task into smaller ones and outlines a plan to execute them, typically adapting the plan and acting within a reasoning loop.
  12. Communication: Exchanges information effectively with the environment and with other agents.

Patterns for Agentic AI in Multi-Agent Systems

Strategy Patterns

  1. Task Automation for Efficiency: MAS automates complex workflows, improving efficiency and reducing errors.
  2. Data-Driven Decision-Making: MAS analyzes large volumes of data in real-time, providing decision support.

Technical Patterns

  1. Collaborative Task Decomposition: Divides complex tasks among specialized agents, ensuring alignment with overall goals.
  2. Iterative Debate for Robust Reasoning: Enhances intermediate results through collaborative reasoning among agents.
  3. Layered Context Management: Agents manage and integrate multi-layered context information for informed decision-making.

Strategy Patterns

Pattern 1: Task Automation for Efficiency

Pattern Category: Strategic

Pattern Usage: High level corporate or project strategy.

Background/Context: Organizations often face complex workflows involving numerous tasks, leading to inefficiencies and potential errors when done manually.

Forces/Tradeoffs: Manual task execution can be slow, error-prone, and resource-intensive. However, full automation may require significant investment and may not be flexible enough for dynamic environments.

Solution Overview: A Multi-Agent System can be designed where different agents specialize in specific tasks. These agents can work autonomously or collaboratively to complete complex workflows.

Solution Details:

  1. Identify tasks within the workflow: Start by mapping out the workflow and breaking it down into individual tasks that can be automated [1][2].
  2. Design agent roles and capabilities for each task: Each agent should have a specific role and capabilities tailored to perform its designated tasks effectively [3][4].
  3. Develop communication protocols for agent interaction: Ensure agents can communicate and coordinate with each other to maintain workflow continuity [5].
  4. Implement a task allocation mechanism (e.g., centralized or distributed): Use either a centralized system where a master agent allocates tasks or a distributed system where agents self-organize [6].
  5. Monitor agent performance and adjust task allocation as needed: Continuously track performance metrics and reallocate tasks to optimize efficiency [4][5].
  6. Ensure fault tolerance to handle agent failures: Develop mechanisms to detect and recover from agent failures without disrupting the entire workflow [5].
  7. Incorporate a mechanism for handling exceptions or unexpected situations: Equip the system with the ability to handle exceptions and adapt to unforeseen circumstances [4].
  8. Continuously evaluate and optimize the system for efficiency: Regularly review system performance and make adjustments to improve efficiency [4][5].
  9. Integrate the MAS with existing systems and processes: Ensure seamless integration with current business processes and IT infrastructure [4].
  10. Provide user interfaces for monitoring and control: Develop intuitive interfaces for users to monitor and manage the system [6].
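Steps 4 and 6 above (task allocation and fault tolerance) can be sketched together. The following is a minimal illustration of the centralized option, assuming a coordinator that routes each task by skill and falls back to another capable agent when one fails. The class names, skills, and the simulated failure are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class WorkerAgent:
    name: str
    skill: str
    handler: Callable[[str], str]


class Coordinator:
    """Centralized allocator: routes each task to an agent with the matching skill."""

    def __init__(self, agents: List[WorkerAgent]):
        self.agents = agents

    def run(self, skill: str, payload: str) -> str:
        candidates = [a for a in self.agents if a.skill == skill]
        errors = []
        for agent in candidates:
            try:
                return agent.handler(payload)
            except Exception as exc:
                # Fault tolerance: record the failure and try the next capable agent.
                errors.append(f"{agent.name}: {exc}")
        raise RuntimeError(f"no agent completed '{skill}' task; errors={errors}")


def flaky_extract(doc: str) -> str:
    # Simulated agent failure, for illustration only.
    raise TimeoutError("upstream service unavailable")


coordinator = Coordinator([
    WorkerAgent("extractor-a", "extract", flaky_extract),
    WorkerAgent("extractor-b", "extract", lambda doc: doc.upper()),
    WorkerAgent("validator", "validate", lambda doc: "ok" if doc else "empty"),
])

result = coordinator.run("extract", "invoice 17")
print(result)
```

A distributed variant would drop the coordinator and let agents claim tasks themselves, trading the single point of control for more negotiation between agents.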

Resulting Consequences: Implementing a MAS for task automation results in improved efficiency, reduced errors, faster task completion, and better resource utilization.

Related Patterns: Workflow Automation, Resource Optimization, Distributed Systems.

LLMs Contribution:

  • Natural Language Understanding: Understand and interpret complex instructions given in natural language, making it easier to design and adjust workflows without needing extensive programming knowledge [3][4].
  • Code Generation: Generate code snippets and scripts for automating various tasks, thus speeding up the development and deployment of automation solutions [1].
  • Contextual Assistance: Provide real-time assistance and suggestions during the workflow design and adjustment phases, enhancing the efficiency and accuracy of these processes [3].

Agentic AI Contribution:

  • Autonomous Task Execution: Agents in an MAS can autonomously execute tasks assigned to them, interacting with other agents and systems as needed to complete their objectives [5][6].
  • Coordination and Communication: Efficiently coordinate with one another using predefined protocols, ensuring that complex workflows are executed smoothly and without conflicts [6].
  • Dynamic Adaptation: Agentic AI systems can adapt to changes in the environment or workflow dynamically, reallocating resources and adjusting processes in real-time to maintain optimal performance [5].
  • Fault Tolerance: These systems are designed to handle failures gracefully, with agents capable of taking over tasks from failed agents, ensuring continuity and robustness in operations [5].

Pattern 2: Data-Driven Decision-Making

Pattern Category: Strategic

Pattern Usage: High level corporate or project strategy.

Background/Context: Businesses need to make informed decisions based on accurate and timely data. However, the volume and complexity of data can overwhelm traditional analysis methods.

Forces/Tradeoffs: Manual data analysis is time-consuming and prone to human bias. Advanced analytics tools can be expensive and require specialized skills.

Solution Overview: A MAS can be used to collect, process, and analyze large volumes of data in real-time. Agents can employ machine learning to identify patterns, predict trends, and provide decision support.

Solution Details:

  1. Identify relevant data sources (internal and external): Determine which data sources are necessary for comprehensive analysis [6].
  2. Design agents to collect, clean, and preprocess data: Develop agents with capabilities to gather and prepare data for analysis [1][2].
  3. Implement data storage and management mechanisms: Ensure robust data storage solutions that allow easy access and management [3].
  4. Develop algorithms for data analysis and pattern recognition: Use machine learning and statistical techniques to analyze data [3].
  5. Train machine learning models on historical data: Use past data to train models, improving their accuracy and reliability [5].
  6. Deploy agents for real-time data monitoring and analysis: Utilize agents to continuously monitor and analyze incoming data [3].
  7. Generate reports and visualizations to present insights: Create tools for visualizing data insights, making them accessible to decision-makers [1][2].
  8. Provide alerts or recommendations based on analysis results: Develop systems to notify users of important findings and suggested actions [3].
  9. Continuously update and refine models based on new data: Regularly update models to ensure they remain accurate and relevant [5].
  10. Integrate decision support tools with existing systems: Ensure seamless integration with current decision-making frameworks [3].
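The flow in the steps above (collect, clean, analyze, alert) can be sketched as a chain of small agent functions. This is a toy illustration under stated assumptions: the readings are made-up numbers, and a simple z-score check stands in for whatever machine learning model a real deployment would train.

```python
from statistics import mean, stdev
from typing import List, Optional, Tuple


def collect() -> List[Optional[float]]:
    """Collection agent: hypothetical daily order counts; None marks a missing reading."""
    return [100.0, 102.0, None, 98.0, 101.0, 99.0, 180.0]


def clean(raw: List[Optional[float]]) -> List[float]:
    """Preprocessing agent: drop missing readings."""
    return [x for x in raw if x is not None]


def analyze(history: List[float], latest: float,
            threshold: float = 3.0) -> Tuple[float, bool]:
    """Analysis agent: flag the latest value if it lies far from the historical mean."""
    mu, sigma = mean(history), stdev(history)
    z = (latest - mu) / sigma  # assumes the history is not constant (sigma > 0)
    return z, abs(z) > threshold


def recommend(z: float, is_anomaly: bool) -> str:
    """Alerting agent: turn the analysis result into a message for decision-makers."""
    if is_anomaly:
        return f"ALERT: latest value is {z:.1f} standard deviations from the mean"
    return "OK: latest value is within the normal range"


readings = clean(collect())
history, latest = readings[:-1], readings[-1]
z_score, is_anomaly = analyze(history, latest)
message = recommend(z_score, is_anomaly)
print(message)
```

Keeping each stage as a separate function mirrors the pattern's separation of agent roles, so any stage can be replaced (for example, swapping the z-score for a trained model) without touching the others.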

Resulting Consequences: Implementing a MAS for data-driven decision-making leads to better-informed decisions, improved accuracy, faster response times, and identification of new opportunities or risks.

Related Patterns: Data Mining, Machine Learning, Predictive Analytics.

LLMs Contribution:

  • Natural Language Processing: Process large volumes of unstructured data, such as text from reports, emails, and social media, to extract relevant information for decision-making [3][4].
  • Contextual Understanding: Understand the context of the data and provide deeper insights by correlating information from different sources [5].
  • Interactive Reports: Generate interactive reports and visualizations that make it easier for decision-makers to understand complex data sets [1][2].

Agentic AI Contribution:

  • Autonomous Data Collection and Preprocessing: Autonomously collect, clean, and preprocess data from various sources, ensuring that the data is ready for analysis [1][2].
  • Real-Time Analysis: Continuously monitor and analyze data in real-time, providing up-to-date insights and recommendations [3].
  • Machine Learning Integration: Implement and manage machine learning models to identify patterns, predict trends, and provide decision support [3].
  • Dynamic Adaptation: Agentic AI systems can adapt to new data and update models in real-time, ensuring that the decision-making process remains accurate and relevant [5].

Note: The patterns in the next section are more technical, covering architecture and design, whereas the previous set was strategic.

Technical Patterns

Pattern 3: Collaborative Task Decomposition

Pattern Category: Planning

Pattern Usage: When a complex task requires the expertise of multiple agents with specialized skills.

Context of the Problem: Dividing a complex task into subtasks that can be effectively handled by individual agents while ensuring alignment with the overall goal.

Forces/Tradeoffs: Balancing the workload among agents, considering their capabilities and potential communication overhead.

Impact/Facilitation for Multi-Agent Systems: Enables efficient task execution by leveraging the strengths of individual agents and promoting collaboration.

Problem or Challenge Being Solved: Decomposing a complex task into manageable subtasks for individual agents in a multi-agent system.

Solution Overview: Utilize a global planning mechanism to analyze the overall task and decompose it into subtasks based on the expertise of available agents. Employ communication protocols to facilitate information exchange and coordination among agents during task execution.
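As a rough sketch of this solution, the planner below decomposes a task into subtasks tagged with a required skill and assigns each one to the least-loaded agent that has that skill. The hard-coded plan in `decompose` is a hypothetical stand-in for an LLM-driven planner, and the agent names and skills are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Subtask:
    name: str
    skill: str


@dataclass
class Specialist:
    name: str
    skills: Set[str]
    assigned: List[str] = field(default_factory=list)


def decompose(task: str) -> List[Subtask]:
    """Hypothetical global planner; a real system might ask an LLM for this plan."""
    return [
        Subtask("gather sources", "research"),
        Subtask("draft summary", "writing"),
        Subtask("check figures", "analysis"),
        Subtask("polish wording", "writing"),
    ]


def assign(subtasks: List[Subtask], agents: List[Specialist]) -> Dict[str, str]:
    """Match each subtask to a capable agent, preferring the least-loaded one."""
    plan: Dict[str, str] = {}
    for subtask in subtasks:
        capable = [a for a in agents if subtask.skill in a.skills]
        if not capable:
            raise ValueError(f"no agent has skill '{subtask.skill}'")
        chosen = min(capable, key=lambda a: len(a.assigned))
        chosen.assigned.append(subtask.name)
        plan[subtask.name] = chosen.name
    return plan


agents = [
    Specialist("researcher", {"research", "analysis"}),
    Specialist("writer-1", {"writing"}),
    Specialist("writer-2", {"writing"}),
]
plan = assign(decompose("market report"), agents)
for subtask_name, agent_name in plan.items():
    print(f"{subtask_name} -> {agent_name}")
```

The least-loaded rule is one simple way to address the workload-balancing tradeoff named above; it ignores communication cost, which a fuller planner would also weigh.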

Discussion on Impact on Multi-Agent System Use Case: Improves task completion efficiency and quality by leveraging the diverse capabilities of multiple agents.

Resulting Consequences: Requires careful planning and coordination to avoid communication bottlenecks and ensure consistent progress towards the overall goal.

Pattern 4: Iterative Debate for Robust Reasoning

Pattern Category: Planning

Pattern Usage: When intermediate results require refinement through discussion and debate among agents.

Background/Context of the Problem: Enhancing the quality of intermediate results by leveraging the collective reasoning capabilities of multiple agents.

Forces/Tradeoffs: Balancing the benefits of improved reasoning with the potential for increased communication overhead and prolonged decision-making.

Impact/Facilitation for Multi-Agent Systems: Enables agents to challenge and refine each other’s reasoning, leading to more robust and reliable outcomes.

Problem or Challenge Being Solved: Improving the quality of intermediate results in multi-agent systems through collaborative reasoning.

Solution Description: Designate specific agents or stages within the workflow for iterative debate and discussion. Allow agents to present their reasoning, challenge assumptions, and propose alternative solutions. Utilize consensus mechanisms to reach agreement on refined intermediate results.
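The debate loop can be illustrated with a deliberately simplified numerical stand-in: each agent holds a numeric estimate, and in every round each agent moves partway toward the group median until the estimates agree within a tolerance. In an LLM-based system the update step would be an exchange of arguments rather than arithmetic; the function name, parameters, and update rule here are assumptions for illustration.

```python
from statistics import median
from typing import List, Tuple


def debate(estimates: List[float], stubbornness: float = 0.5,
           tolerance: float = 1.0, max_rounds: int = 10) -> Tuple[float, int]:
    """Run debate rounds until estimates agree; return (consensus, rounds used)."""
    current = list(estimates)
    for round_no in range(1, max_rounds + 1):
        # Consensus check: stop once all estimates lie within the tolerance.
        if max(current) - min(current) <= tolerance:
            return median(current), round_no - 1
        # Each agent keeps part of its own view and moves partway to the median.
        target = median(current)
        current = [stubbornness * x + (1 - stubbornness) * target for x in current]
    # Round budget exhausted: return the best available estimate.
    return median(current), max_rounds


consensus, rounds_used = debate([10.0, 20.0, 40.0])
print(consensus, rounds_used)
```

The `max_rounds` cap is one way to bound the communication overhead and prolonged decision-making that this pattern trades against better reasoning.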

Impact on Multi-Agent Systems: Enhances the accuracy and reliability of intermediate results, leading to better overall outcomes.

Consequences: May increase communication overhead and prolong decision-making processes.

Pattern 5: Layered Context Management

Pattern Category: Planning

Pattern Usage: When agents need to consider multiple layers of context, including overall task goals, individual agent tasks, and information from other agents.

Background/Context of the Problem: Ensuring that agents effectively utilize and integrate various contextual factors into their decision-making processes.

Forces/Tradeoffs: Balancing the need for comprehensive context awareness with the potential for information overload and computational complexity.

Impact/Facilitation for Multi-Agent Systems: Enables agents to make informed decisions that align with both individual tasks and overall system goals.

Problem or Challenge Being Solved: Managing and integrating complex, multi-layered context information in multi-agent systems.

Solution Overview: Implement mechanisms for agents to access and process different layers of context. Develop context-sharing protocols to facilitate information exchange among agents. Design agent reasoning processes to consider and integrate various contextual factors.
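One lightweight way to sketch layered context is Python's standard `collections.ChainMap`, which looks a key up in the first layer that defines it. In the illustration below, the agent's own task context takes precedence over context shared by other agents, which in turn takes precedence over the global task goals. The layer contents and the `build_context` helper are hypothetical.

```python
from collections import ChainMap
from typing import Dict, List

# Three hypothetical context layers, from most general to most specific.
global_ctx = {"goal": "reduce delivery delays", "deadline": "Q3", "currency": "USD"}
shared_ctx = {"currency": "EUR", "carrier_status": "strike at port"}
agent_ctx = {"task": "reroute shipments", "deadline": "this week"}


def build_context(agent: Dict[str, str], shared: Dict[str, str],
                  overall: Dict[str, str], keys: List[str]) -> Dict[str, str]:
    """Merge the layers and return only the requested keys.

    Restricting the view to the requested keys is a simple guard
    against handing an agent more context than it needs.
    """
    layered = ChainMap(agent, shared, overall)  # earlier layers win on conflicts
    return {k: layered[k] for k in keys if k in layered}


view = build_context(agent_ctx, shared_ctx, global_ctx,
                     ["goal", "task", "deadline", "currency", "weather"])
print(view)
```

The precedence order is a design choice: putting the global layer first instead would let system-wide goals override local details, which may suit settings where alignment matters more than local flexibility.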

Impact on Multi-Agent Systems: Improves decision-making quality and ensures alignment with overall system goals.

Resulting Consequences: Requires careful design and implementation to avoid information overload and computational bottlenecks.

Concluding Remarks

Multi-agent systems offer a flexible and powerful approach to addressing complex challenges, improving efficiency, and driving innovation across various business functions. By leveraging the capabilities of MAS, businesses can enhance their operations, better serve their customers, and stay competitive in an increasingly dynamic market.

Tools and implementations

You can use Google Cloud’s Agent-builder to start your journey into Agentic AI.

References

  1. “AgentCoder: Multiagent-Code Generation with Iterative Testing and Optimisation” by Zhang et al., 2023.
  2. “ProAgent: Building Proactive Cooperative Agents with Large Language Models” by Chen et al., 2023.
  3. “Large Language Model based Multi-Agents: A Survey of Progress and Challenges” by Guo et al., 2024.
  4. “Large Language Models Empowered Agent-based Modeling and Simulation: A Survey and Perspectives” by Silva et al., 2023.
  5. “AutoAgents: A Framework for Automatic Agent Generation” by Chen et al., 2023.
  6. “Agentverse: Facilitating multi-agent collaboration and exploring emergent behaviors in agents” by Chen et al., 2023.
  7. “A Survey on Context-Aware Multi-Agent Systems: Techniques, Challenges and Future Directions” by Hung et al., 2024.
  8. “Evaluating multi-agent coordination abilities in large language models” by Agashe et al., 2023.
  9. “Multi-agent consensus seeking via large language models” by Chen et al., 2023.
  10. “Playing repeated games with large language models” by Akata et al., 2023.


Ali Arsanjani

Director Google, AI | EX: WW Tech Leader, Chief Principal AI/ML Solution Architect, AWS | IBM Distinguished Engineer and CTO Analytics & ML