Multi-Agent Systems

Multi-Agent Systems (MAS) represent a paradigm where multiple autonomous AI agents collaborate, compete, or coexist to solve complex problems that are beyond the capabilities of individual agents. These systems enable emergent intelligence through interaction, specialization, and collective decision-making.

Understanding Multi-Agent Systems

Multi-agent systems consist of multiple intelligent agents that interact within an environment to achieve individual or collective goals. Each agent has its own capabilities, knowledge, and objectives.

Multi-Agent System Architecture

Environment ↔ Agent 1 ↔ Communication Layer ↔ Agent 2 ↔ ... ↔ Agent N

↓ Coordination ↓ Negotiation ↓ Competition ↓ Collaboration

Core Characteristics

  • Autonomy: Agents operate independently without direct human control
  • Social Ability: Agents interact through communication protocols
  • Reactivity: Agents perceive and respond to their environment
  • Proactivity: Agents take initiative to achieve goals
  • Adaptability: Agents learn and adjust behaviors over time
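
As a rough illustration of how these characteristics map onto code, the sketch below implements a perceive-decide-act loop for a single agent. The ThermostatAgent class, its thresholds, and the dictionary-based environment are hypothetical and chosen only to keep the example small; social ability and adaptability are left out.

```python
from dataclasses import dataclass, field


@dataclass
class ThermostatAgent:
    """Hypothetical agent that keeps a room near a target temperature."""
    target: float
    log: list = field(default_factory=list)

    def perceive(self, environment: dict) -> float:
        # Reactivity: read the current state of the environment
        return environment["temperature"]

    def decide(self, temperature: float) -> str:
        # Proactivity: pick the action that moves toward the goal
        if temperature < self.target - 0.5:
            return "heat"
        if temperature > self.target + 0.5:
            return "cool"
        return "idle"

    def act(self, environment: dict, action: str) -> None:
        # Autonomy: the agent changes the environment without outside control
        delta = {"heat": 1.0, "cool": -1.0, "idle": 0.0}[action]
        environment["temperature"] += delta
        self.log.append(action)

    def step(self, environment: dict) -> None:
        self.act(environment, self.decide(self.perceive(environment)))


def run_thermostat(start: float, target: float, steps: int):
    """Run one agent for a fixed number of steps and return the outcome."""
    environment = {"temperature": start}
    agent = ThermostatAgent(target=target)
    for _ in range(steps):
        agent.step(environment)
    return environment["temperature"], agent.log


if __name__ == "__main__":
    print(run_thermostat(18.0, 21.0, 5))
```

A multi-agent system would run many such loops against a shared environment, which is where the communication and coordination mechanisms described later come in.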

Types of Multi-Agent Systems

1. Cooperative Systems

Agents work together toward shared goals with aligned interests.

  • Distributed Problem Solving: Breaking complex tasks into subtasks
  • Swarm Intelligence: Simple agents creating complex behaviors
  • Consensus Formation: Reaching agreement through negotiation

Examples: Robot swarms, distributed computing, collaborative filtering
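
One simple way cooperative agents can reach agreement is iterative averaging: each agent repeatedly replaces its value with the mean of its own value and its neighbors' values. The sketch below is a minimal, synchronous version under assumed names (average_consensus, a dictionary of neighbor lists). It reaches the exact global average only when every agent has the same number of neighbors, as in the ring used here; on irregular graphs it still converges, but to a weighted average.

```python
def average_consensus(values, neighbors, rounds=50):
    """Synchronous averaging: every agent updates from the previous round.

    values:    initial value held by each agent (list indexed by agent)
    neighbors: dict mapping agent index -> list of neighbor indices
    """
    current = list(values)
    for _ in range(rounds):
        updated = []
        for i, own_value in enumerate(current):
            # Each agent only sees itself and its direct neighbors
            local = [own_value] + [current[j] for j in neighbors[i]]
            updated.append(sum(local) / len(local))
        current = updated
    return current


if __name__ == "__main__":
    # Four agents arranged in a ring: 0 - 1 - 2 - 3 - 0
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(average_consensus([10.0, 20.0, 30.0, 40.0], ring))
```

No agent ever sees the whole system, yet all of them end up holding (approximately) the same value.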

2. Competitive Systems

Agents compete for resources or conflicting objectives.

  • Game Theory: Strategic decision-making in competitive scenarios
  • Auction Mechanisms: Resource allocation through bidding
  • Market Simulations: Economic modeling with competing agents

Examples: Trading bots, game AI, resource allocation
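
To illustrate the game-theoretic view, the sketch below searches a two-player payoff table for pure-strategy Nash equilibria, that is, action pairs where neither agent gains by changing its action alone. The function name and the resource-contention payoffs ("claim" versus "yield") are hypothetical values picked for illustration.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game.

    payoffs maps (row_action, col_action) -> (row_payoff, col_payoff).
    """
    row_actions = sorted({a for a, _ in payoffs})
    col_actions = sorted({b for _, b in payoffs})
    equilibria = []
    for a, b in product(row_actions, col_actions):
        row_payoff, col_payoff = payoffs[(a, b)]
        # Neither player may improve by deviating unilaterally
        row_ok = all(payoffs[(alt, b)][0] <= row_payoff for alt in row_actions)
        col_ok = all(payoffs[(a, alt)][1] <= col_payoff for alt in col_actions)
        if row_ok and col_ok:
            equilibria.append((a, b))
    return equilibria


# Two agents want the same resource: claiming it together is costly
CONTENTION = {
    ("claim", "claim"): (-1, -1),
    ("claim", "yield"): (2, 0),
    ("yield", "claim"): (0, 2),
    ("yield", "yield"): (1, 1),
}

if __name__ == "__main__":
    print(pure_nash_equilibria(CONTENTION))
```

In this game the stable outcomes are the two asymmetric ones, where exactly one agent claims the resource.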

3. Mixed Systems

Combination of cooperation and competition based on context.

  • Coalition Formation: Temporary alliances for mutual benefit
  • Negotiation: Balancing individual and collective interests
  • Social Dilemmas: Managing cooperation vs self-interest

Examples: Supply chain management, traffic control, social networks
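
The tension between cooperation and self-interest is often studied with the iterated prisoner's dilemma. The sketch below plays two strategies against each other; the payoff values are the conventional textbook ones, and the function names are assumptions made for this example.

```python
# (my_move, other_move) -> (my_payoff, other_payoff); C = cooperate, D = defect
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own_history, other_history):
    # Cooperate first, then copy the opponent's previous move
    return other_history[-1] if other_history else "C"


def always_defect(own_history, other_history):
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play repeated rounds and return the total score of each agent."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        payoff_a, payoff_b = PAYOFFS[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print(play(tit_for_tat, tit_for_tat))
    print(play(tit_for_tat, always_defect))
```

Two reciprocating agents each earn more over ten rounds than either agent does in a match involving a pure defector, which is the core of the social dilemma.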

Agent Communication Protocols

  • Blackboard: Shared knowledge repository
  • Message Passing: Direct agent-to-agent messages
  • Publish-Subscribe: Event-driven communication
  • Contract Net: Task allocation via bidding
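
As a minimal sketch of the publish-subscribe style, the in-process message bus below lets agents register handlers for a topic and delivers each published event to every subscriber. The MessageBus class and the topic names are hypothetical; a real deployment would typically put a message broker behind the same interface.

```python
from collections import defaultdict


class MessageBus:
    """In-memory publish-subscribe bus (illustrative, single process)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        # handler is any callable that accepts the published payload
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        # Deliver to every subscriber of the topic; return the delivery count
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)


if __name__ == "__main__":
    bus = MessageBus()
    scheduler_inbox, logger_inbox = [], []
    bus.subscribe("task.created", scheduler_inbox.append)
    bus.subscribe("task.created", logger_inbox.append)
    bus.publish("task.created", {"id": 1})
    print(scheduler_inbox, logger_inbox)
```

The publisher does not need to know who is listening, which keeps agents loosely coupled compared with direct message passing.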

FIPA-ACL (Agent Communication Language)

Python - Agent Communication
from spade.agent import Agent
from spade.behaviour import CyclicBehaviour
from spade.message import Message
import json

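class NegotiatingAgent(Agent):
    class NegotiateBehaviour(CyclicBehaviour):
        async def run(self):
            # Wait up to 10 seconds for the next message
            msg = await self.receive(timeout=10)
            if msg is None:
                return  # Nothing arrived; the cyclic behaviour runs again

            # Dispatch on the performative stored in the message metadata.
            # FIPA-ACL names these cfp, propose, accept-proposal and
            # reject-proposal; this example uses shorthand labels.
            performative = msg.get_metadata("performative")
            if performative == "CFP":  # Call for Proposal
                await self.handle_cfp(msg)
            elif performative == "PROPOSE":
                await self.handle_proposal(msg)
            elif performative == "ACCEPT":
                await self.handle_acceptance(msg)
            elif performative == "REJECT":
                await self.handle_rejection(msg)

        async def handle_cfp(self, msg):
            # Parse the task described in the call for proposal
            task = json.loads(msg.body)

            # Generate proposal based on capabilities
            proposal = self.generate_proposal(task)

            # Send the proposal back to the sender of the CFP
            reply = Message(
                to=str(msg.sender),
                body=json.dumps(proposal),
                metadata={"performative": "PROPOSE"}
            )
            await self.send(reply)

        async def handle_proposal(self, msg):
            # Placeholder: an initiator would compare incoming proposals here
            print(f"Proposal from {msg.sender}: {msg.body}")

        async def handle_acceptance(self, msg):
            # Placeholder: start working on the awarded task
            print(f"Proposal accepted by {msg.sender}")

        async def handle_rejection(self, msg):
            # Placeholder: release anything reserved for the task
            print(f"Proposal rejected by {msg.sender}")

        def generate_proposal(self, task):
            # Calculate cost and time based on agent capabilities
            return {
                "agent_id": str(self.agent.jid),  # JID objects are not JSON serializable
                "cost": self.estimate_cost(task),
                "time": self.estimate_time(task),
                "confidence": self.calculate_confidence(task)
            }

        # Placeholder estimators; replace with real capability models.
        # The "effort" field is an assumed task attribute.
        def estimate_cost(self, task):
            return 10.0 * task.get("effort", 1)

        def estimate_time(self, task):
            return 2.0 * task.get("effort", 1)

        def calculate_confidence(self, task):
            return 0.8

    async def setup(self):
        self.add_behaviour(self.NegotiateBehaviour())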

Coordination Mechanisms

1. Centralized Coordination

A central coordinator manages agent interactions and task allocation.

Python - Central Coordinator
class CentralCoordinator:
    def __init__(self):
        self.agents = {}
        self.tasks = []
        self.assignments = {}
    
    def register_agent(self, agent_id, capabilities):
        self.agents[agent_id] = {
            'capabilities': capabilities,
            'status': 'idle',
            'current_task': None
        }
    
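    def allocate_task(self, task):
        # Find the idle agent with the highest match score
        best_agent = None
        best_score = float('-inf')

        for agent_id, agent_info in self.agents.items():
            if agent_info['status'] == 'idle':
                score = self.calculate_match_score(
                    task,
                    agent_info['capabilities']
                )
                if score > best_score:
                    best_score = score
                    best_agent = agent_id

        # Compare against None so falsy IDs such as 0 are still assigned
        if best_agent is not None:
            self.assign_task(best_agent, task)
            return best_agent
        return None

    def calculate_match_score(self, task, capabilities):
        # Placeholder scoring: fraction of required skills the agent has.
        # The 'required_skills' key is an assumed task field.
        required = task.get('required_skills', [])
        if not required:
            return 0.0
        return sum(skill in capabilities for skill in required) / len(required)

    def assign_task(self, agent_id, task):
        self.agents[agent_id]['status'] = 'busy'
        self.agents[agent_id]['current_task'] = task
        self.assignments[task['id']] = agent_id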

2. Distributed Coordination

Agents coordinate through local interactions without central control.

Python - Distributed Consensus
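class ConsensusAgent:
    # Sketch only: the network helpers used below (broadcast_proposal,
    # has_majority, broadcast_accept, is_proposer, get_proposal_number,
    # send_prepare, select_value, send_accept, handle_messages) are not
    # defined here and must be supplied by a subclass or transport layer.
    def __init__(self, agent_id, neighbors):
        self.id = agent_id
        self.neighbors = neighbors
        self.state = None
        self.proposals = {}
        # State read by propose_value() and paxos_consensus()
        self.current_round = 0
        self.consensus_reached = False
        self.consensus_value = None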
    
    async def propose_value(self, value):
        # Phase 1: Propose
        proposal = {
            'proposer': self.id,
            'value': value,
            'round': self.current_round
        }
        
        # Send to all neighbors
        responses = await self.broadcast_proposal(proposal)
        
        # Phase 2: Accept
        if self.has_majority(responses):
            await self.broadcast_accept(value)
            return True
        return False
    
    async def paxos_consensus(self):
        # Simplified Paxos implementation
        while not self.consensus_reached:
            if self.is_proposer():
                # Prepare phase
                proposal_num = self.get_proposal_number()
                promises = await self.send_prepare(proposal_num)
                
                if len(promises) > len(self.neighbors) / 2:
                    # Accept phase
                    value = self.select_value(promises)
                    accepts = await self.send_accept(proposal_num, value)
                    
                    if len(accepts) > len(self.neighbors) / 2:
                        self.consensus_value = value
                        self.consensus_reached = True
            
            await self.handle_messages()

3. Market-Based Coordination

Using economic principles for resource allocation and task distribution.
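
A minimal sketch of this idea is a sequential reverse auction: each task is announced in turn, every capable agent bids its cost, and the lowest bid wins. In the hypothetical run_auction function below, an agent's bid rises with the number of tasks it has already won, so prices spread the load without a central scheduler inspecting agent state. The agent names, task kinds, and cost figures are invented for illustration.

```python
def run_auction(tasks, base_costs):
    """Allocate tasks one by one to the lowest bidder.

    tasks:      list of (task_id, kind) tuples
    base_costs: dict agent -> {kind: cost for an idle agent}
    """
    load = {agent: 0 for agent in base_costs}
    allocation = {}
    for task_id, kind in tasks:
        # Each capable agent bids; busier agents quote higher prices
        bids = {
            agent: costs[kind] * (1 + load[agent])
            for agent, costs in base_costs.items()
            if kind in costs
        }
        if not bids:
            allocation[task_id] = None  # nobody can perform this task
            continue
        # Lowest bid wins; ties are broken alphabetically by agent name
        winner = min(bids, key=lambda agent: (bids[agent], agent))
        allocation[task_id] = winner
        load[winner] += 1
    return allocation


if __name__ == "__main__":
    costs = {
        "drone": {"deliver": 4.0, "scan": 2.0},
        "rover": {"deliver": 3.0},
    }
    jobs = [("t1", "deliver"), ("t2", "deliver"), ("t3", "scan"), ("t4", "dig")]
    print(run_auction(jobs, costs))
```

The rover wins the first delivery because it is cheaper, but its higher load-adjusted bid hands the second delivery to the drone.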

Popular Multi-Agent Frameworks

AutoGen (Microsoft)

Framework for building LLM-based multi-agent applications.

Python - AutoGen Example
from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager

# Create specialized agents
planner = AssistantAgent(
    name="Planner",
    system_message="You are a planning expert. Break down tasks into steps.",
    llm_config={"model": "gpt-4"}
)

coder = AssistantAgent(
    name="Coder",
    system_message="You are a Python expert. Write clean, efficient code.",
    llm_config={"model": "gpt-4"}
)

reviewer = AssistantAgent(
    name="Reviewer",
    system_message="You review code for bugs and improvements.",
    llm_config={"model": "gpt-4"}
)

user_proxy = UserProxyAgent(
    name="User",
    human_input_mode="TERMINATE",
    code_execution_config={"work_dir": "coding"}
)

# Create group chat
groupchat = GroupChat(
    agents=[user_proxy, planner, coder, reviewer],
    messages=[],
    max_round=10
)

# The manager needs its own LLM config to choose the next speaker
manager = GroupChatManager(groupchat=groupchat, llm_config={"model": "gpt-4"})

# Start conversation
user_proxy.initiate_chat(
    manager,
    message="Create a web scraper for news articles"
)

CrewAI

Framework for orchestrating role-playing, autonomous AI agents.

Python - CrewAI Implementation
from crewai import Agent, Task, Crew

# Define agents with roles
researcher = Agent(
    role='Senior Research Analyst',
    goal='Uncover cutting-edge developments in AI',
    backstory="You're an expert researcher with keen insight.",
    verbose=True,
    allow_delegation=False
)

writer = Agent(
    role='Tech Content Writer',
    goal='Create engaging content about AI developments',
    backstory="You're a skilled writer who makes complex topics accessible.",
    verbose=True,
    allow_delegation=True
)

# Define tasks
research_task = Task(
    description="Research the latest AI breakthroughs in 2024",
    agent=researcher,
    expected_output="A comprehensive report on AI developments"
)

write_task = Task(
    description="Write a blog post about the research findings",
    agent=writer,
    expected_output="A 1000-word blog post"
)

# Create and run crew
crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, write_task],
    verbose=True
)

result = crew.kickoff()

LangGraph

Build stateful, multi-agent applications with LLMs.

JADE (Java Agent DEvelopment Framework)

FIPA-compliant framework for developing multi-agent systems in Java.

SPADE (Python)

Smart Python Agent Development Environment with XMPP communication.

Real-World Applications

Domain | Application | Agent Types | Key Benefits
Software Development | Automated coding teams | Architect, Developer, Tester, Reviewer | Faster development, comprehensive testing
Supply Chain | Logistics optimization | Supplier, Manufacturer, Distributor, Retailer | Reduced costs, improved efficiency
Smart Cities | Traffic management | Vehicle, Traffic Light, Route Planner | Reduced congestion, lower emissions
Finance | Portfolio management | Analyst, Trader, Risk Manager | Diversified strategies, risk mitigation
Gaming | NPC behavior | Combat, Dialog, Quest, Merchant | Dynamic gameplay, emergent narratives
Research | Scientific discovery | Hypothesis Generator, Experimenter, Analyzer | Accelerated research, novel insights

Building a Multi-Agent System

Complete Example: Customer Service System

Python - Multi-Agent Customer Service
import asyncio
from enum import Enum
from typing import List, Dict, Any
import random

class AgentRole(Enum):
    ROUTER = "router"
    TECHNICAL = "technical_support"
    BILLING = "billing_support"
    SALES = "sales"
    ESCALATION = "escalation_manager"

class ServiceAgent:
    def __init__(self, agent_id: str, role: AgentRole, expertise: List[str]):
        self.id = agent_id
        self.role = role
        self.expertise = expertise
        self.workload = 0
        self.max_workload = 3
        self.active_tickets = []
    
    async def can_handle(self, ticket: Dict[str, Any]) -> float:
        """Calculate confidence score for handling a ticket"""
        if self.workload >= self.max_workload:
            return 0.0
        
        # Calculate expertise match
        ticket_keywords = ticket.get('keywords', [])
        match_score = sum(
            1 for keyword in ticket_keywords 
            if keyword in self.expertise
        ) / max(len(ticket_keywords), 1)
        
        # Adjust for current workload
        availability_score = 1 - (self.workload / self.max_workload)
        
        return match_score * 0.7 + availability_score * 0.3
    
    async def handle_ticket(self, ticket: Dict[str, Any]):
        """Process a customer ticket"""
        self.workload += 1
        self.active_tickets.append(ticket['id'])

        print(f"{self.id} handling ticket {ticket['id']}")

        try:
            # Simulate processing time
            processing_time = random.uniform(2, 5)
            await asyncio.sleep(processing_time)

            # Determine if escalation needed
            if random.random() < 0.1:  # 10% escalation rate
                return {"status": "escalate", "reason": "Complex issue"}

            return {"status": "resolved", "resolution": f"Handled by {self.id}"}
        finally:
            # Release capacity on every exit path, including escalation
            self.workload -= 1
            self.active_tickets.remove(ticket['id'])

class MultiAgentCustomerService:
    def __init__(self):
        self.agents = self.initialize_agents()
        self.ticket_queue = asyncio.Queue()
        self.metrics = {
            'total_tickets': 0,
            'resolved_tickets': 0,
            'escalated_tickets': 0,
            'avg_resolution_time': 0
        }
    
    def initialize_agents(self) -> List[ServiceAgent]:
        agents = [
            ServiceAgent("ROUTER-1", AgentRole.ROUTER, 
                        ["routing", "classification", "priority"]),
            ServiceAgent("TECH-1", AgentRole.TECHNICAL, 
                        ["software", "hardware", "network", "troubleshooting"]),
            ServiceAgent("TECH-2", AgentRole.TECHNICAL, 
                        ["database", "api", "integration", "performance"]),
            ServiceAgent("BILL-1", AgentRole.BILLING, 
                        ["payment", "invoice", "subscription", "refund"]),
            ServiceAgent("SALES-1", AgentRole.SALES, 
                        ["pricing", "features", "upgrade", "demo"]),
            ServiceAgent("ESC-1", AgentRole.ESCALATION, 
                        ["complaint", "urgent", "vip", "complex"])
        ]
        return agents
    
    async def route_ticket(self, ticket: Dict[str, Any]):
        """Route ticket to best available agent"""
        # Get confidence scores from all agents
        scores = {}
        for agent in self.agents:
            if agent.role != AgentRole.ROUTER:
                score = await agent.can_handle(ticket)
                scores[agent] = score
        
        # Select best agent
        if scores:
            best_agent = max(scores.items(), key=lambda x: x[1])
            if best_agent[1] > 0.3:  # Minimum confidence threshold
                return best_agent[0]
        
        # Fallback to escalation
        return next(
            agent for agent in self.agents 
            if agent.role == AgentRole.ESCALATION
        )
    
    async def process_ticket(self, ticket: Dict[str, Any]):
        """Main ticket processing workflow"""
        self.metrics['total_tickets'] += 1
        
        # Route ticket
        assigned_agent = await self.route_ticket(ticket)
        
        # Handle ticket
        result = await assigned_agent.handle_ticket(ticket)
        
        # Process result
        if result['status'] == 'resolved':
            self.metrics['resolved_tickets'] += 1
        elif result['status'] == 'escalate':
            self.metrics['escalated_tickets'] += 1
            # Re-route to escalation agent
            esc_agent = next(
                agent for agent in self.agents 
                if agent.role == AgentRole.ESCALATION
            )
            await esc_agent.handle_ticket(ticket)
        
        return result
    
    async def monitor_performance(self):
        """Monitor system performance and rebalance if needed"""
        while True:
            await asyncio.sleep(10)
            
            # Check agent workloads
            overloaded = [
                agent for agent in self.agents 
                if agent.workload > agent.max_workload * 0.8
            ]
            
            if overloaded:
                print(f"Warning: {len(overloaded)} agents overloaded")
                # Implement rebalancing logic here
    
    async def run(self):
        """Run the multi-agent system"""
        # Start monitoring
        monitor_task = asyncio.create_task(self.monitor_performance())
        
        # Simulate incoming tickets (IDs are assigned per ticket below)
        ticket_types = [
            {"keywords": ["software", "bug"], "priority": "high"},
            {"keywords": ["payment", "failed"], "priority": "urgent"},
            {"keywords": ["pricing", "quote"], "priority": "normal"},
        ]

        tasks = []
        for i in range(10):
            # Copy the template so concurrent tickets do not share one dict
            ticket = {**random.choice(ticket_types), "id": f"T{i:03d}"}
            task = asyncio.create_task(self.process_ticket(ticket))
            tasks.append(task)
            await asyncio.sleep(1)  # Stagger ticket arrival
        
        # Wait for all tickets to be processed
        await asyncio.gather(*tasks)

        # Stop the background monitor now that the work is done
        monitor_task.cancel()
        
        # Print metrics
        print("\n=== System Metrics ===")
        print(f"Total Tickets: {self.metrics['total_tickets']}")
        print(f"Resolved: {self.metrics['resolved_tickets']}")
        print(f"Escalated: {self.metrics['escalated_tickets']}")

# Run the system
if __name__ == "__main__":
    system = MultiAgentCustomerService()
    asyncio.run(system.run())

Emergent Behaviors in Multi-Agent Systems

Swarm Intelligence

Simple rules at the individual level leading to complex collective behavior.

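Ant Colony Optimization (ACO):

  1. Ants explore randomly
  2. Ants deposit pheromones on the paths they take
  3. Other ants preferentially follow stronger pheromone trails
  4. Shorter paths accumulate more pheromone because they are completed faster, while evaporation weakens the trails on longer paths
  5. The colony converges on a short path, which is often but not always the optimal one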
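
The feedback loop in these steps can be sketched with a two-path "double bridge" simulation. This is a simplified illustration, not a full ACO implementation: the function name, the evaporation rate, and the rule of depositing 1/length pheromone per ant are assumptions chosen to keep the code short.

```python
import random


def double_bridge(lengths, n_ants=100, iterations=30, evaporation=0.5, seed=0):
    """Simulate ants choosing between alternative paths of given lengths.

    Returns each path's share of the total pheromone at the end.
    """
    rng = random.Random(seed)
    pheromone = [1.0 for _ in lengths]  # equal trails, so early choices are random
    for _ in range(iterations):
        # Each ant picks a path with probability proportional to its pheromone
        choices = rng.choices(range(len(lengths)), weights=pheromone, k=n_ants)
        # Evaporation weakens every trail a little each iteration
        pheromone = [(1 - evaporation) * level for level in pheromone]
        # Each ant deposits pheromone; shorter paths earn more per trip
        for path in choices:
            pheromone[path] += 1.0 / lengths[path]
    total = sum(pheromone)
    return [level / total for level in pheromone]


if __name__ == "__main__":
    shares = double_bridge([1.0, 2.0])
    print(f"short path share: {shares[0]:.3f}, long path share: {shares[1]:.3f}")
```

Because the short path receives twice the pheromone per ant, a small early advantage is amplified each iteration until nearly all ants use it.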

Emergence Examples

  • Flocking: Birds following simple rules create complex formations
  • Market Dynamics: Individual trades creating price patterns
  • Traffic Flow: Individual driving decisions affecting system flow
  • Social Networks: Local connections forming global structures
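
The traffic example can be reproduced with a very small model. In the sketch below, each car follows one local rule on a circular single-lane road: move forward one cell if the cell ahead is empty, otherwise wait. This is the cellular automaton known as Rule 184; the function name and the road layout are chosen for illustration.

```python
def traffic_step(road):
    """Advance every car by one time step on a circular road.

    road is a list of 0 (empty cell) and 1 (car). All cars move
    simultaneously, based on the road as it was before the step.
    """
    n = len(road)
    updated = [0] * n
    for i, cell in enumerate(road):
        if cell:
            ahead = (i + 1) % n
            if road[ahead]:
                updated[i] = 1      # blocked: the car waits
            else:
                updated[ahead] = 1  # free: the car moves forward
    return updated


if __name__ == "__main__":
    road = [1, 1, 1, 0, 0, 0, 0, 0]  # three cars queued at the left
    for _ in range(4):
        print("".join("#" if cell else "." for cell in road))
        road = traffic_step(road)
```

No car knows about the queue as a whole, yet the jam visibly dissolves from the front while cars at the back keep waiting, a system-level pattern that no individual rule describes.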

Challenges and Solutions

⚠️ Common Challenges

  • Scalability: Performance degradation with many agents
  • Coordination Overhead: Communication costs can grow quadratically with the number of agents when every agent talks to every other agent
  • Deadlocks: Agents waiting for each other indefinitely
  • Convergence: System may not reach stable state
  • Security: Malicious agents disrupting the system
  • Debugging: Complex interactions hard to trace
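
To make the coordination overhead concrete, the sketch below counts communication links in a fully connected system and, for comparison, in a two-level hierarchy where members talk only to a team leader and leaders talk to each other. Both function names and the team layout are assumptions made for this example.

```python
import math


def full_mesh_links(n_agents):
    # Every pair of agents shares one link: n * (n - 1) / 2
    return n_agents * (n_agents - 1) // 2


def hierarchical_links(n_agents, team_size):
    """Links when members talk only to their leader and leaders form a mesh."""
    teams = math.ceil(n_agents / team_size)
    member_links = n_agents - teams        # each non-leader links to one leader
    leader_links = full_mesh_links(teams)  # leaders are fully connected
    return member_links + leader_links


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, full_mesh_links(n), hierarchical_links(n, team_size=10))
```

For 100 agents the full mesh needs 4,950 links, while teams of ten need 135, at the cost of routing cross-team traffic through leaders.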

Solution Strategies

  • Hierarchical Organization: Reduce communication complexity
  • Local Interactions: Limit agent communication radius
  • Timeout Mechanisms: Prevent indefinite waiting
  • Reputation Systems: Track agent reliability
  • Monitoring Tools: Visualize agent interactions
  • Simulation Testing: Test at scale before deployment
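
The timeout strategy maps directly onto asyncio. The sketch below wraps a request to another agent in asyncio.wait_for so the caller gets a fallback value instead of waiting forever. The helper name request_with_timeout and the two simulated agents are hypothetical.

```python
import asyncio


async def request_with_timeout(request, timeout, fallback):
    """Await a request to another agent, giving up after `timeout` seconds."""
    try:
        return await asyncio.wait_for(request, timeout=timeout)
    except asyncio.TimeoutError:
        # The pending request is cancelled by wait_for before we get here
        return fallback


async def fast_agent():
    await asyncio.sleep(0.01)
    return "done"


async def stuck_agent():
    await asyncio.sleep(10)  # stands in for an agent that never answers
    return "done"


def demo():
    async def main():
        first = await request_with_timeout(fast_agent(), 0.5, "timed out")
        second = await request_with_timeout(stuck_agent(), 0.05, "timed out")
        return first, second

    return asyncio.run(main())


if __name__ == "__main__":
    print(demo())
```

Returning a fallback keeps the caller responsive; whether to retry, reroute, or escalate after a timeout is a separate policy decision.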

Best Practices

✅ Design Guidelines

  • Start Simple: Begin with few agents, add complexity gradually
  • Define Clear Protocols: Establish communication standards early
  • Design for Failure: Assume agents will fail and plan recovery
  • Monitor Everything: Track agent states and interactions
  • Test at Scale: Simulate with many more agents than production
  • Version Compatibility: Handle different agent versions
  • Resource Management: Implement quotas and limits
  • Documentation: Document agent roles and interactions clearly
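
As one possible reading of "Design for Failure", the sketch below tries a list of agents in order and falls back to the next one when an agent fails. The AgentFailure exception, the dispatch_with_failover function, and the two example handlers are hypothetical.

```python
class AgentFailure(Exception):
    """Raised by an agent handler that cannot complete a task."""


def dispatch_with_failover(task, agents):
    """Try each (name, handler) pair in order until one succeeds.

    Returns (agent_name, result, errors) where errors lists earlier failures.
    """
    errors = []
    for name, handler in agents:
        try:
            return name, handler(task), errors
        except AgentFailure as exc:
            # Record the failure and move on to the next candidate
            errors.append((name, str(exc)))
    raise AgentFailure(f"all agents failed for {task!r}: {errors}")


def flaky_agent(task):
    raise AgentFailure("connection lost")


def backup_agent(task):
    return f"processed {task}"


if __name__ == "__main__":
    print(dispatch_with_failover("T001", [("primary", flaky_agent),
                                          ("backup", backup_agent)]))
```

Keeping the list of earlier errors alongside the result also supports the "Monitor Everything" guideline, since silent failovers are hard to debug.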

Future Directions

Emerging Trends

  • LLM-Based Agents: Natural language communication between agents
  • Quantum Multi-Agent Systems: Quantum computing for agent coordination
  • Blockchain Integration: Decentralized trust and consensus
  • Self-Organizing Systems: Agents that restructure autonomously
  • Human-Agent Teams: Seamless collaboration with humans
  • Explainable MAS: Understanding emergent behaviors
  • Ethical Multi-Agent Systems: Fairness and accountability

Research Areas

  • Scalability: Handling millions of agents efficiently
  • Learning: Agents that improve through interaction
  • Robustness: Resilience to attacks and failures
  • Verification: Formal proofs of system properties
  • Heterogeneity: Integrating diverse agent types

Tools and Resources

Development Platforms

  • NetLogo: Multi-agent programmable modeling environment
  • MASON: Multi-agent simulation toolkit
  • Repast: Agent-based modeling and simulation
  • AgentScript: JavaScript framework for agent modeling
  • Mesa: Python agent-based modeling framework

Learning Resources

  • "An Introduction to MultiAgent Systems" by Michael Wooldridge: Comprehensive textbook
  • AAMAS Conference: International conference on autonomous agents
  • AgentLink: European network for agent-based computing
  • Complexity Explorer: Online courses on complex systems