Wiring the Organization
Agent cards as identity documents, the principal-broker dispatch model, bridge scripts for inter-service communication, and the directive lifecycle from pending through acknowledged, in-progress, to completed.
A fleet of well-designed specialist agents, each with deep expertise and proper memory infrastructure, is still a collection of isolated processes without wiring. Wiring is what turns a collection into an organization: a shared identity system, a message broker that routes work correctly, bridge scripts that connect heterogeneous services, and a directive lifecycle that gives every piece of work a traceable state.
This lesson builds the wiring.
Agent Cards: Identity in the Org
Every agent in the org has a card — a structured document that the broker uses to make routing and authority decisions. The card is the agent's identity document.
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class AgentCard:
    # Identity
    id: str                        # "coding-agent-01"
    name: str                      # "Coding Agent"
    version: str                   # "2.1.0"
    description: str

    # Capabilities (what the agent can do)
    domains: list[str]             # ["coding", "testing", "code-review"]
    capabilities: list[str]        # ["write_code", "run_tests", "review_pr"]
    skills: list[str]              # ["feature-team", "quality-team"]

    # Authority (what the agent is allowed to do)
    authority_tier: int            # 1=read-only, 2=write, 3=deploy, 4=production
    max_blast_radius: str          # "single-repo", "org-wide", "infrastructure"
    requires_approval: list[str]   # actions that need human sign-off

    # Communication
    endpoint: str                  # "http://localhost:9001"
    message_schema: str            # version of the message protocol
    timeout_seconds: int           # max time to wait for response

    # State (runtime, not stored in card file)
    status: str = "offline"        # offline|ready|busy|error
    current_directive: Optional[str] = None
    last_seen: Optional[datetime] = None
Agent Cards are stored in a registry file and loaded by the broker on startup:
# agents/registry.yaml
agents:
  - id: coding-agent-01
    name: Coding Agent
    version: 2.1.0
    domains: [coding, testing, code-review]
    capabilities: [write_code, run_tests, review_pr, open_pr]
    authority_tier: 2
    max_blast_radius: single-repo
    requires_approval: [deploy_to_production, delete_branch]
    endpoint: http://localhost:9001
    message_schema: v2

  - id: trading-agent-01
    name: Trading Agent
    version: 1.4.0
    domains: [trading, risk, portfolio]
    capabilities: [analyze_market, check_positions, alert_risk]
    authority_tier: 2
    max_blast_radius: single-portfolio
    requires_approval: [modify_strategy_params, increase_position_size]
    endpoint: http://localhost:9002
    message_schema: v2

  - id: content-agent-01
    name: Content Agent
    version: 1.2.0
    domains: [content, social, publishing]
    capabilities: [write_post, schedule_content, publish_article]
    authority_tier: 2
    max_blast_radius: content-pipeline
    requires_approval: [publish_to_production, modify_brand_voice]
    endpoint: http://localhost:9003
    message_schema: v2
The Agent Card registry is the org chart of the agent fleet. When a new directive arrives, the broker consults the registry to find an agent with the right domain and a ready status. No message goes to any agent until the routing decision has been made.
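The lesson uses an AgentRegistry with get, by_domain, and by_capability lookups but does not show it. The following is a minimal in-memory sketch of what such a registry might look like. The trimmed Card class and the list-of-dicts input are assumptions made to keep the example self-contained; a real registry would presumably build full AgentCard objects from the parsed YAML file.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Card:
    """Trimmed stand-in for AgentCard: only the fields the lookups need."""
    id: str
    domains: list
    capabilities: list
    status: str = "offline"  # runtime state, never read from the file


class AgentRegistry:
    """In-memory index of cards, keyed by agent id (hypothetical sketch)."""

    def __init__(self, entries: list):
        self._cards = {}
        for entry in entries:
            if entry["id"] in self._cards:
                raise ValueError(f"duplicate agent id: {entry['id']}")
            self._cards[entry["id"]] = Card(
                id=entry["id"],
                domains=list(entry.get("domains", [])),
                capabilities=list(entry.get("capabilities", [])),
            )

    def get(self, agent_id: str) -> Optional[Card]:
        return self._cards.get(agent_id)

    def by_domain(self, domain: str) -> list:
        return [c for c in self._cards.values() if domain in c.domains]

    def by_capability(self, capability: str) -> list:
        return [c for c in self._cards.values() if capability in c.capabilities]


# Entries mirror the shape of agents/registry.yaml after parsing.
registry = AgentRegistry([
    {"id": "coding-agent-01", "domains": ["coding", "testing"],
     "capabilities": ["write_code", "run_tests"]},
    {"id": "content-agent-01", "domains": ["content"],
     "capabilities": ["write_post"]},
])
testing_agents = [card.id for card in registry.by_domain("testing")]
```

Rejecting duplicate ids at load time is a design choice in this sketch: because every routing and authority decision keys on the id, a silent overwrite would be hard to diagnose later.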
The Principal-Broker Pattern
The broker is the central nervous system. Every message in the org goes through it. No agent sends directly to another agent without broker mediation.
Why? Because direct agent-to-agent communication bypasses:
- Authority checks (can this agent send this type of message to that agent?)
- Audit logging (was this message sent? was it received? what happened?)
- Fan-out coordination (this signal needs to reach multiple agents simultaneously)
- Offline handling (the target agent is down; where does the message go?)
class PrincipalBroker:
    def __init__(self, registry: AgentRegistry):
        self.registry = registry
        self.audit_log = AuditLog()
        self.offline_queues: dict[str, list[Message]] = {}
        self.routing_rules = RoutingRules()

    async def route(self, message: Message) -> RouteResult:
        # 1. Validate message schema
        self.validate_schema(message)

        # 2. Authenticate sender
        sender = self.registry.get(message.sender_id)
        if not sender:
            raise UnknownSender(message.sender_id)

        # 3. Authority check: can this sender send this message type?
        if not self.check_authority(sender, message.type):
            raise AuthorityViolation(
                f"{sender.id} cannot send {message.type}"
            )

        # 4. Deterministic routing: 9 rules, no LLM
        decision = self.routing_rules.resolve(message, self.registry)

        # 5. Log routing decision
        # resolve() returns a RouteDecision, so the agent id is decision.agent.id
        self.audit_log.record(
            event="route_decided",
            message_id=message.id,
            from_agent=message.sender_id,
            to_agent=decision.agent.id,
            rule_applied=decision.rule_used,
        )

        # 6. Deliver or queue
        if decision.agent.status == "offline":
            self.enqueue_offline(decision.agent.id, message)
            return RouteResult(status="queued", target=decision.agent.id)

        await self.deliver(decision.agent, message)
        return RouteResult(status="delivered", target=decision.agent.id)
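The broker queues messages for offline agents, but the lesson does not show enqueue_offline or how queued messages are replayed. The sketch below is one possible shape for that piece, assuming first-in, first-out replay once an agent reports ready. The OfflineQueues class and its method names are hypothetical, and messages are plain dicts to keep the example self-contained.

```python
from collections import defaultdict, deque


class OfflineQueues:
    """Per-agent FIFO buffers for messages that could not be delivered."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def enqueue(self, agent_id: str, message: dict) -> None:
        # Append to the tail so delivery order matches arrival order.
        self._queues[agent_id].append(message)

    def drain(self, agent_id: str) -> list:
        """Return and clear everything queued for the agent, oldest first."""
        return list(self._queues.pop(agent_id, deque()))

    def depth(self, agent_id: str) -> int:
        # .get() avoids creating an empty queue as a side effect.
        return len(self._queues.get(agent_id, ()))


queues = OfflineQueues()
queues.enqueue("trading-agent-01", {"id": "m1"})
queues.enqueue("trading-agent-01", {"id": "m2"})

# When the agent reports ready, the broker would replay in order.
replayed = queues.drain("trading-agent-01")
```

An in-memory queue like this loses its contents if the broker restarts, so a production version would likely need durable storage.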
The routing rules are deterministic — nine explicit rules, evaluated in priority order, with no LLM reasoning:
class RoutingRules:
    """
    9 deterministic routing rules.
    Rules are evaluated in order; first match wins.
    No LLM calls. No ambiguity.
    """

    def resolve(self, message: Message, registry: AgentRegistry) -> RouteDecision:
        # Rule 1: Explicit target; honor it if authority permits
        if message.target_id:
            target = registry.get(message.target_id)
            if target and self.can_reach(message.sender_id, target):
                return RouteDecision(agent=target, rule_used="explicit-target")

        # Rule 2: Domain match; find a ready agent with a matching domain
        domain_agents = registry.by_domain(message.domain)
        ready = [a for a in domain_agents if a.status == "ready"]
        if ready:
            return RouteDecision(agent=ready[0], rule_used="domain-match")

        # Rule 3: Capability match; find a ready agent with the required capability
        if message.required_capability:
            capable = registry.by_capability(message.required_capability)
            ready = [a for a in capable if a.status == "ready"]
            if ready:
                return RouteDecision(agent=ready[0], rule_used="capability-match")

        # Rule 4: Authority tier; route up the hierarchy for escalations
        if message.type == "escalation":
            manager = registry.get_manager_for(message.sender_id)
            if manager:
                return RouteDecision(agent=manager, rule_used="escalation-up")

        # Rules 5-9: fan-out, offline-resilient delivery, fallback, etc.
        # ...

        raise NoRouteFound(f"No route for message {message.id}")
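The "first match wins" behavior can also be expressed as an ordered list of rule functions, which makes the priority order explicit data. The sketch below is an illustration of that idea, not the lesson's implementation: messages are plain dicts, the domain table is a hypothetical stand-in for the registry lookup, and only two of the nine rules are modeled.

```python
from typing import Optional


def explicit_target(message: dict) -> Optional[str]:
    # Pass (return None) when the sender named no target.
    return message.get("target_id")


def domain_match(message: dict) -> Optional[str]:
    # Hypothetical static table; the lesson's broker consults the registry.
    table = {"coding": "coding-agent-01", "content": "content-agent-01"}
    return table.get(message.get("domain"))


# Order is priority: earlier rules shadow later ones.
RULES = [
    ("explicit-target", explicit_target),
    ("domain-match", domain_match),
]


def resolve(message: dict) -> tuple:
    """Return (agent_id, rule_used) from the first rule that matches."""
    for name, rule in RULES:
        agent_id = rule(message)
        if agent_id is not None:
            return agent_id, name
    raise LookupError(f"no route for message {message.get('id')}")


# Both rules could match here; the explicit target wins because it is first.
decision = resolve(
    {"id": "m1", "target_id": "trading-agent-01", "domain": "coding"}
)
```

Because the same message always yields the same decision and the same rule name, the rule_used value recorded in the audit log is enough to explain any routing outcome after the fact.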
Bridge Scripts: Connecting Heterogeneous Services
Not every service in the org speaks the broker's message protocol. Legacy services use HTTP. Some use Redis pub/sub. Some write to files. Some trigger cron jobs. Bridge scripts translate between these protocols and the broker's message schema.
A bridge script has one rule: it is stateless. It translates and forwards. It never stores state.
# bridges/discord_bridge.py
# Translates Discord messages into broker directives
from datetime import datetime, timezone

class DiscordBridge:
    def __init__(self, broker: PrincipalBroker, discord_client: DiscordClient):
        self.broker = broker
        self.discord = discord_client

    async def on_discord_message(self, msg: DiscordMessage) -> None:
        # Translate Discord message to broker directive
        if not msg.content.startswith("!agent"):
            return  # Not a command, ignore

        directive = Directive(
            id=generate_id(),
            source="discord",
            sender_id="human-operator",  # Discord messages = human authority
            type="task",
            domain=self.parse_domain(msg.content),
            description=self.parse_description(msg.content),
            priority="normal",
            # datetime.utcnow() is deprecated since Python 3.12
            created_at=datetime.now(timezone.utc),
        )

        # Forward to broker; do not store, do not modify
        result = await self.broker.route(directive)

        # Report back to Discord
        await self.discord.reply(msg, f"Directive {directive.id}: {result.status}")
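The Discord bridge relies on parse_domain and parse_description, which the lesson does not define. A pure parsing function is one way to keep the bridge stateless. The sketch below assumes a command format of "!agent <domain> <description>"; that format and the parse_command name are illustrative assumptions, not something the lesson specifies.

```python
from typing import Optional


def parse_command(content: str) -> Optional[tuple]:
    """Split '!agent <domain> <description>' into (domain, description).

    Returns None for anything that is not a well-formed command.
    """
    # maxsplit=2 keeps the description intact, spaces included.
    parts = content.strip().split(maxsplit=2)
    if len(parts) < 3 or parts[0] != "!agent":
        return None
    _, domain, description = parts
    return domain.lower(), description


parsed = parse_command("!agent Coding fix the flaky login test")
```

Matching the first token exactly is slightly stricter than a startswith("!agent") check, which would also accept a message such as "!agentx ...".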
# bridges/cron_bridge.py
# Translates scheduled cron triggers into broker directives
from datetime import datetime, timezone

class CronBridge:
    def __init__(self, broker: PrincipalBroker):
        self.broker = broker
        self.schedule = load_schedule("cron/schedule.yaml")

    async def trigger(self, job_name: str) -> None:
        job = self.schedule.get(job_name)
        if not job:
            raise UnknownJob(job_name)

        directive = Directive(
            id=generate_id(),
            source="cron",
            sender_id="cron-scheduler",
            type=job.directive_type,
            domain=job.domain,
            description=job.description,
            priority=job.priority,
            # datetime.utcnow() is deprecated since Python 3.12
            created_at=datetime.now(timezone.utc),
        )

        await self.broker.route(directive)
        # No state stored; the directive is in the broker now
The Directive Lifecycle
Every piece of work in the org is a directive with a defined lifecycle. The lifecycle is the audit trail.
pending → acknowledged → in_progress → completed
   ↘                          ↘
    rejected                   failed
                                  ↘
                                   escalated
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# DirectiveStatus and StatusTransition are assumed to be defined elsewhere.

@dataclass
class Directive:
    id: str
    type: str                         # "task", "escalation", "query", "event"
    domain: str                       # "coding", "trading", "content"
    description: str
    sender_id: str
    source: str                       # "discord", "cron"; set by the bridges
    priority: str                     # "low", "normal", "high", "critical"
    created_at: datetime
    # Defaulted so the bridges can omit it; None = broker routes automatically
    target_id: Optional[str] = None

    # Lifecycle state
    status: DirectiveStatus = DirectiveStatus.PENDING
    acknowledged_at: Optional[datetime] = None
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None
    result: Optional[dict] = None
    error: Optional[str] = None

    # Audit trail
    transitions: list[StatusTransition] = field(default_factory=list)

    def transition(self, new_status: DirectiveStatus, actor: str, note: str = ""):
        # Record the change before applying it so from_status is accurate
        self.transitions.append(StatusTransition(
            from_status=self.status,
            to_status=new_status,
            actor=actor,
            timestamp=datetime.now(timezone.utc),
            note=note,
        ))
        self.status = new_status
The lifecycle transitions:
pending → acknowledged: The target agent received the directive and accepted it into its queue. If the agent is busy, the directive waits in the broker queue without transitioning.
acknowledged → in_progress: The agent started working on the directive. This is the moment the timer starts — if the agent spends too long in in_progress, the health monitor flags it.
in_progress → completed: The agent finished and returned a result. The audit log records the result.
in_progress → failed: The agent encountered an unrecoverable error. The broker may trigger a retry or escalation depending on the directive type and error classification.
in_progress → escalated: The agent determined the directive requires human review or a higher-authority agent. The broker routes the escalation up the hierarchy.
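The transitions above can be encoded as a lookup table so that illegal moves, such as completed back to in_progress, are rejected rather than silently recorded. This is a sketch under stated assumptions: the lesson does not show DirectiveStatus, so the enum below is reconstructed from the state names; the source does not say where rejected branches from, so rejected-from-pending is a guess; and treating completed, rejected, failed, and escalated as terminal is also an assumption.

```python
from enum import Enum


class DirectiveStatus(Enum):
    PENDING = "pending"
    ACKNOWLEDGED = "acknowledged"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    REJECTED = "rejected"
    FAILED = "failed"
    ESCALATED = "escalated"


S = DirectiveStatus

# Map each state to the states it may move to. States missing from the
# table are treated as terminal (an assumption of this sketch).
ALLOWED = {
    S.PENDING: {S.ACKNOWLEDGED, S.REJECTED},  # REJECTED here is a guess
    S.ACKNOWLEDGED: {S.IN_PROGRESS},
    S.IN_PROGRESS: {S.COMPLETED, S.FAILED, S.ESCALATED},
}


def can_transition(current: DirectiveStatus, new: DirectiveStatus) -> bool:
    """True if the lifecycle permits moving from current to new."""
    return new in ALLOWED.get(current, set())


ok = can_transition(S.IN_PROGRESS, S.ESCALATED)
skipped = can_transition(S.PENDING, S.IN_PROGRESS)  # skips acknowledged
```

A check like this could run at the top of Directive.transition so that the audit trail only ever contains legal moves.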
# Agent-side lifecycle management
async def process_directive(self, directive: Directive) -> None:
    # Acknowledge
    directive.transition(
        DirectiveStatus.ACKNOWLEDGED,
        actor=self.agent_id,
        note="Added to processing queue"
    )
    await self.broker.update_directive(directive)

    # Start
    directive.transition(
        DirectiveStatus.IN_PROGRESS,
        actor=self.agent_id,
    )
    await self.broker.update_directive(directive)

    try:
        result = await self.execute(directive)
        directive.result = result
        directive.transition(
            DirectiveStatus.COMPLETED,
            actor=self.agent_id,
        )
    except EscalationRequired as e:
        directive.transition(
            DirectiveStatus.ESCALATED,
            actor=self.agent_id,
            note=str(e)
        )
        await self.broker.escalate(directive, reason=str(e))
    except Exception as e:
        # Keep the error on the directive as well as in the transition note
        directive.error = str(e)
        directive.transition(
            DirectiveStatus.FAILED,
            actor=self.agent_id,
            note=str(e)
        )

    # Persist whichever final state was reached
    await self.broker.update_directive(directive)
The directive lifecycle is the organization's audit trail. At any moment, you can query all directives in in_progress state, sorted by age, and immediately know what every agent is working on and how long it has been working. This is the operational visibility that distinguishes a platform from a collection of scripts.
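The "in_progress, sorted by age" query described above might look like the sketch below. It assumes directives are available as an in-memory list of dicts with status and started_at keys; the in_progress_by_age name is hypothetical, and a real platform would more likely run the equivalent query against the broker's directive store.

```python
from datetime import datetime, timedelta, timezone


def in_progress_by_age(directives: list, now: datetime) -> list:
    """Return (directive_id, age) pairs for running work, oldest first."""
    running = [d for d in directives if d["status"] == "in_progress"]
    # The earliest start time is the longest-running directive.
    running.sort(key=lambda d: d["started_at"])
    return [(d["id"], now - d["started_at"]) for d in running]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
directives = [
    {"id": "d1", "status": "in_progress",
     "started_at": now - timedelta(minutes=5)},
    {"id": "d2", "status": "completed",
     "started_at": now - timedelta(hours=2)},
    {"id": "d3", "status": "in_progress",
     "started_at": now - timedelta(minutes=45)},
]

report = in_progress_by_age(directives, now)
# d3 comes first (45 minutes), then d1 (5 minutes); d2 is excluded.
```

Passing now in as a parameter, rather than reading the clock inside the function, keeps the report reproducible and easy to test.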
Putting the Wiring Together
The wiring layer of an agent operations platform has four components:
- Agent Cards — identity documents in a registry; the broker uses them for all routing and authority decisions
- Principal Broker — central message router using deterministic rules; maintains audit log and offline queues
- Bridge Scripts — stateless translators connecting heterogeneous protocols to the broker message schema
- Directive Lifecycle — every piece of work has a traceable state from pending to completed
Together they create a system where every message is auditable, every agent is accountable, and every piece of work has a traceable history from origin to outcome.
Summary
- Agent Cards encode identity, capabilities, authority tier, and endpoint — the broker uses them for all decisions
- The principal broker routes all messages through deterministic rules, not LLM reasoning
- Bridge scripts translate external protocols (Discord, cron, webhooks) to the broker schema — always stateless
- The directive lifecycle (pending → acknowledged → in_progress → completed) provides a full audit trail
- Offline resilience requires explicit queuing — the broker holds directives for offline agents
What's Next
With the org wired, the next lesson covers the decision layer that sits at the top: the CEO agent that triages incoming work, applies authority rules, and decides what to resolve automatically versus what to escalate to the human.