CoLLAS 2026

Multi-Agent Empowerment and Emergence
of Complex Behavior in Groups

Anonymous Authors

125 agents driven purely by intrinsic motivation. Egoistic empowerment spontaneously organizes the flock into two opposing directional bands.

Abstract

Intrinsic motivations, i.e. behavioral incentives that are not engineered but emerge from the interaction of an agent with its surroundings, are receiving increasing attention. In this work we study the behaviors that emerge from one such incentive, empowerment, in settings with more than one agent. We formulate a principled extension of empowerment to the multi-agent setting and demonstrate its efficient calculation. We observe that this intrinsic motivation gives rise to characteristic modes of group organization in two qualitatively distinct environments: a pair of agents coupled by a tendon, and a controllable Vicsek flock. This demonstrates the potential of intrinsic motivations such as empowerment to drive not only the behavior of individual agents but also higher levels of behavioral organization at scale.

Method

We extend single-agent empowerment to multi-agent systems by modeling coupled dynamics as a multi-user interference channel. Each agent treats the actions of others as structured noise and solves for its optimal strategy via iterative water-filling (IWF), reaching a Nash equilibrium without explicit coordination.
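To make the water-filling idea concrete, here is a minimal sketch for a simplified setting: each agent has a few independent scalar subchannels, and interference from other agents adds to the noise floor. The names `water_fill`, `iterative_water_filling`, `capacity`, the `gain[i][j][k]` layout, and the toy numbers are all assumptions made for illustration; the paper's formulation works with full covariance matrices and is not reproduced here.

```python
import math

def water_fill(noise, power):
    """Single-user water-filling over parallel channels.

    Maximizes sum_k 0.5*log(1 + p_k/noise_k) subject to sum_k p_k = power, p_k >= 0.
    """
    order = sorted(noise)
    mu = 0.0
    # Try using the m quietest channels; accept the first consistent water level.
    for m in range(len(order), 0, -1):
        mu = (power + sum(order[:m])) / m
        if mu > order[m - 1]:
            break
    return [max(0.0, mu - n) for n in noise]

def iterative_water_filling(gain, power, sigma2=1.0, iters=100, tol=1e-10):
    """gain[i][j][k]: power gain from agent j's action to agent i's state on subchannel k."""
    n_agents, n_ch = len(gain), len(gain[0][0])
    p = [[power / n_ch] * n_ch for _ in range(n_agents)]
    for _ in range(iters):
        change = 0.0
        for i in range(n_agents):
            # Other agents' current allocations are treated as extra noise.
            noise = []
            for k in range(n_ch):
                interference = sum(gain[i][j][k] * p[j][k]
                                   for j in range(n_agents) if j != i)
                noise.append((sigma2 + interference) / gain[i][i][k])
            new_p = water_fill(noise, power)
            change = max(change, max(abs(a - b) for a, b in zip(new_p, p[i])))
            p[i] = new_p
        if change < tol:  # no agent wants to deviate: a fixed point of best responses
            break
    return p

def capacity(gain, p, i, sigma2=1.0):
    """Rate of agent i (in nats) when interference is treated as noise."""
    total = 0.0
    for k in range(len(p[i])):
        interference = sum(gain[i][j][k] * p[j][k]
                           for j in range(len(p)) if j != i)
        total += 0.5 * math.log(1.0 + gain[i][i][k] * p[i][k] / (sigma2 + interference))
    return total

# Hypothetical two-agent, two-subchannel example with weak cross-coupling.
gain = [
    [[1.0, 0.5], [0.1, 0.1]],  # agent 0: own gains, then gains from agent 1
    [[0.1, 0.1], [0.5, 1.0]],  # agent 1: gains from agent 0, then own gains
]
allocation = iterative_water_filling(gain, power=1.0)
```

In this toy setup each agent ends up concentrating its power on its own stronger subchannel, so the two agents stop interfering with each other. Convergence of this kind of best-response iteration is not guaranteed in general; the example uses weak cross gains, where it settles after a couple of sweeps.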

Figure 1. Multi-agent system as an interference channel with N=3 agents. Each agent attempts to transmit its action sequence through the noisy channel to communicate with its future state. The interference generated by other agents affects each agent's channel capacity.

01

Linearize Dynamics

Compute the block-Jacobian sensitivity matrix F_t along the autonomous trajectory for all agent pairs.

02

Iterative Water-Filling

Each agent treats others as interference noise and updates its probing covariance iteratively until the Nash equilibrium is reached.

03

Empowerment-Driven Control

Agents select actions that maximize their own empowerment (egoistic) or another agent's empowerment (altruistic) at every timestep.
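The linearization step can be illustrated with a small sketch. It estimates, by central finite differences, how each agent's final state responds to each agent's action sequence around the zero-action (autonomous) rollout. The toy dynamics (`step`, two point masses joined by a spring), the helper names, and the use of finite differences are assumptions chosen for illustration; the source does not state how its sensitivity matrix is computed.

```python
def step(x, u, dt=0.1, k=1.0):
    """Hypothetical toy dynamics: two unit masses joined by a linear spring.

    x = [pos0, vel0, pos1, vel1], u = [force0, force1]. Explicit Euler update.
    """
    p0, v0, p1, v1 = x
    f = k * (p1 - p0)  # spring force acting on mass 0 (mass 1 feels -f)
    return [p0 + dt * v0, v0 + dt * (f + u[0]),
            p1 + dt * v1, v1 + dt * (-f + u[1])]

def rollout(x0, actions):
    """Apply a sequence of joint actions and return the final joint state."""
    x = list(x0)
    for u in actions:
        x = step(x, u)
    return x

def sensitivity_blocks(x0, horizon, n_agents=2, state_dim=2, eps=1e-5):
    """F[i][j][s][t] ~ d(final state component s of agent i) / d(action of agent j at time t),
    evaluated around the autonomous (zero-action) trajectory."""
    zero = [[0.0] * n_agents for _ in range(horizon)]
    F = [[[[0.0] * horizon for _ in range(state_dim)]
          for _ in range(n_agents)] for _ in range(n_agents)]
    for j in range(n_agents):
        for t in range(horizon):
            plus = [row[:] for row in zero]
            minus = [row[:] for row in zero]
            plus[t][j] = eps
            minus[t][j] = -eps
            xp, xm = rollout(x0, plus), rollout(x0, minus)
            for i in range(n_agents):
                for s in range(state_dim):
                    idx = i * state_dim + s
                    F[i][j][s][t] = (xp[idx] - xm[idx]) / (2 * eps)
    return F

# Sensitivities over a 3-step horizon, starting with the spring stretched.
F = sensitivity_blocks([0.0, 0.0, 1.0, 0.0], horizon=3)
```

The diagonal blocks `F[i][i]` describe how an agent's own actions reach its own future state, while the off-diagonal blocks `F[i][j]` capture the coupling that shows up as interference in the channel picture. For real systems an analytic or automatic-differentiation Jacobian would usually be preferred over finite differences.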

Results: Vicsek Flock

A flock of N = 125 agents, each controlling its angular acceleration. Under passive Vicsek dynamics the flock converges to a single heading. Under egoistic empowerment the flock self-organizes into two opposing directional bands.

[Figure: flock snapshots and heading-angle distributions at t = 1 s, 50 s, 100 s, and 200 s.]

Temporal evolution. Top row: spatial configuration (dot = agent, trail = recent path, color = heading angle). Bottom row: heading-angle distribution relative to flock average. The population transitions from random initial headings through filamentary formations into a stable bimodal band configuration.

Average flock empowerment over time. Egoistic agents maintain high empowerment; baseline dynamics cause a steady decline.

Flock order parameter over time. Baseline dynamics drive full alignment; egoistic empowerment suppresses consensus.

Key finding: Low order under the egoistic policy does not imply disorder. Agents self-organize into two opposing directional bands, a non-trivial global structure that emerges without any externally prescribed coordination.
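A short numerical sketch shows why two opposing bands register as low order. It assumes the order parameter is the standard Vicsek polar order, i.e. the magnitude of the mean heading vector; the source does not spell out its definition. The nematic order (mean of doubled angles) is added here purely for contrast and is not a quantity reported in the source.

```python
import cmath
import math

def polar_order(headings):
    """Magnitude of the mean unit heading vector: 1 = fully aligned, 0 = headings cancel."""
    return abs(sum(cmath.exp(1j * th) for th in headings)) / len(headings)

def nematic_order(headings):
    """Same statistic on doubled angles, so opposite headings count as aligned."""
    return abs(sum(cmath.exp(2j * th) for th in headings)) / len(headings)

n = 100
aligned = [0.3] * n                                           # one shared heading
two_bands = [0.3] * (n // 2) + [0.3 + math.pi] * (n // 2)     # two opposing bands
scattered = [2 * math.pi * k / n for k in range(n)]           # evenly spread headings

for name, headings in [("aligned", aligned), ("two bands", two_bands), ("scattered", scattered)]:
    print(f"{name:10s} polar={polar_order(headings):.3f} nematic={nematic_order(headings):.3f}")
```

Under these assumptions the polar order is near zero for both the two-band and the scattered configurations, while the nematic order separates them: it is near one for two opposing bands and near zero for scattered headings.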

Results: Linked Pendulums

Two pendulums are coupled by an elastic tendon. Each agent controls only its own hinge torque. The empowerment-driven policy produces distinct behavioral regimes depending on the relative strength of each agent.

[Figure: example outcomes under the egoistic policy (collaboration) and the altruistic policy (assisted agent). Panels: Left Up, Right Up, Both Up, Neither Up.]

Four possible outcomes. When one agent is stronger it dominates (Left Up / Right Up). Comparable strengths can yield cooperation (Both Up) or neither reaching upright.

Egoistic Policy — Each agent maximizes its own empowerment

[Figure: egoistic heatmap.]

Altruistic Policy — Right agent maximizes left agent's empowerment

[Figure: altruistic heatmap.]

Key finding: Switching the optimization target from egoistic to altruistic qualitatively reshapes the reachable outcome landscape, enabling weaker agents to achieve high-empowerment states that are otherwise inaccessible.

Citation

@inproceedings{anonymous2026multiagentempowerment,
  title     = {Multi-Agent Empowerment and Emergence of Complex Behavior in Groups},
  author    = {Anonymous Authors},
  booktitle = {Conference on Lifelong Learning Agents (CoLLAS)},
  year      = {2026},
}