Intrinsic motivations, i.e. behavioral incentives that are not engineered but emerge from the interaction of an agent with its surroundings, are receiving increasing attention. In this work we study the emergence of behaviors driven by one such incentive, empowerment, specifically in the context of more than one agent. We formulate a principled extension of empowerment to the multi-agent setting and demonstrate its efficient calculation. We observe that this intrinsic motivation gives rise to characteristic modes of group organization in two qualitatively distinct environments: a pair of agents coupled by a tendon, and a controllable Vicsek flock. This demonstrates the potential of intrinsic motivations such as empowerment to drive not only the behavior of individual agents but also higher levels of behavioral organization at scale.
We extend single-agent empowerment to multi-agent systems by modeling coupled dynamics as a multi-user interference channel. Each agent treats the actions of others as structured noise and solves for its optimal strategy via iterative water-filling (IWF), reaching a Nash equilibrium without explicit coordination.
Figure 1. Multi-agent system as an interference channel with N=3 agents. Each agent attempts to transmit its action sequence through the noisy channel to communicate with its future state. The interference generated by other agents affects each agent's channel capacity.
Compute the block-Jacobian sensitivity matrix F_t along the autonomous trajectory for all agent pairs.
Each agent treats the others as interference noise and iteratively updates its probing covariance until a Nash equilibrium is reached.
Agents select actions that maximize their own empowerment (egoistic) or another agent's empowerment (altruistic) at every timestep.
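The water-filling step above can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the text: a linear-Gaussian channel in which agent i's future state is y_i = sum_j F[i][j] u_j + noise, white noise of variance noise_var, and a trace (power) budget on each probing covariance. The names water_fill, best_response and iterative_water_filling, the nested-list layout of F, and all numbers are hypothetical. Convergence of the loop is not guaranteed in general, so the iteration count is capped.

```python
import numpy as np

def water_fill(gains, power):
    """Water-filling over parallel channels: p_k = max(0, mu - 1/g_k), sum(p_k) = power."""
    gains = np.asarray(gains, dtype=float)
    p = np.zeros_like(gains)
    active = gains > 1e-12                      # zero-gain channels get no power
    if power <= 0 or not active.any():
        return p
    floors = np.sort(1.0 / gains[active])       # inverse gains, best channel first
    for k in range(len(floors), 0, -1):         # try using the k best channels
        mu = (power + floors[:k].sum()) / k     # water level if exactly k are active
        if mu > floors[k - 1]:
            break
    p[active] = np.maximum(0.0, mu - 1.0 / gains[active])
    return p

def best_response(F, covs, i, power, noise_var):
    """Agent i's optimal probing covariance when the other agents count as noise."""
    R = noise_var * np.eye(F[i][i].shape[0])    # noise plus interference covariance
    for j in range(len(covs)):
        if j != i:
            R += F[i][j] @ covs[j] @ F[i][j].T
    L = np.linalg.cholesky(R)
    H = np.linalg.solve(L, F[i][i])             # whitened own channel, L^{-1} F_ii
    g, V = np.linalg.eigh(H.T @ H)              # channel gains and input directions
    g = np.clip(g, 0.0, None)
    p = water_fill(g, power)
    cov = (V * p) @ V.T                         # V diag(p) V^T
    empowerment = 0.5 * np.sum(np.log1p(g * p)) # Gaussian channel capacity, in nats
    return cov, empowerment

def iterative_water_filling(F, power=1.0, noise_var=1.0, max_iters=200, tol=1e-9):
    """Cycle through agents, each best-responding, until the covariances stop changing."""
    n = len(F)
    covs = [np.eye(F[i][i].shape[1]) * power / F[i][i].shape[1] for i in range(n)]
    emps = np.zeros(n)
    for _ in range(max_iters):
        change = 0.0
        for i in range(n):
            new_cov, emps[i] = best_response(F, covs, i, power, noise_var)
            change = max(change, np.abs(new_cov - covs[i]).max())
            covs[i] = new_cov
        if change < tol:                        # fixed point: no agent wants to deviate
            break
    return covs, emps

# Toy usage: N = 3 agents, 2-D actions and states, weak cross-coupling (made-up numbers).
rng = np.random.default_rng(0)
F = [[np.eye(2) * (i == j) + 0.1 * rng.standard_normal((2, 2)) for j in range(3)]
     for i in range(3)]
covs, emps = iterative_water_filling(F, power=1.0, noise_var=0.5)
```

A fixed point of this loop is a Nash equilibrium of the game in which each agent maximizes its own channel capacity. An egoistic agent would then pick the action leading to the state with the largest entry of emps for itself, while an altruistic agent would score candidate actions by another agent's entry.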
A flock of N = 125 agents, each controlling its angular acceleration. Under passive Vicsek dynamics the flock converges to a single heading. Under egoistic empowerment the flock self-organizes into two opposing directional bands.
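As a rough illustration of why the passive baseline aligns, the textbook first-order Vicsek update is sketched here: each agent adopts the mean heading of its neighbors within a radius, optionally perturbed by noise. This is an assumption made for illustration only. The flock described here is second-order (agents control angular acceleration), and the function name vicsek_step, the periodic box, and all parameter values are hypothetical.

```python
import numpy as np

def vicsek_step(pos, theta, speed=0.03, radius=1.0, box=5.0, noise=0.0, rng=None):
    """One textbook Vicsek update in a periodic square box of side `box`."""
    rng = np.random.default_rng() if rng is None else rng
    # Pairwise displacements with periodic wrap-around.
    d = pos[:, None, :] - pos[None, :, :]
    d -= box * np.round(d / box)
    neighbors = ((d ** 2).sum(-1) <= radius ** 2).astype(float)  # includes self
    # Average headings via summed unit vectors, which avoids angle wrap-around issues.
    new_theta = np.arctan2(neighbors @ np.sin(theta), neighbors @ np.cos(theta))
    new_theta = new_theta + noise * rng.uniform(-np.pi, np.pi, size=theta.shape)
    step = speed * np.stack([np.cos(new_theta), np.sin(new_theta)], axis=1)
    return (pos + step) % box, new_theta

# Toy usage with N = 125 agents and random initial headings (made-up box size).
rng = np.random.default_rng(1)
pos = rng.uniform(0.0, 5.0, size=(125, 2))
theta = rng.uniform(-np.pi, np.pi, size=125)
for _ in range(50):
    pos, theta = vicsek_step(pos, theta, rng=rng)
```

With zero noise, repeated application pulls neighboring headings together, which is the consensus behavior that the empowerment-driven flock departs from.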
Temporal evolution. Top row: spatial configuration (dot = agent, trail = recent path, color = heading angle). Bottom row: heading-angle distribution relative to flock average. The population transitions from random initial headings through filamentary formations into a stable bimodal band configuration.
Average flock empowerment over time. Egoistic agents maintain high empowerment; baseline dynamics cause a steady decline.
Flock order parameter over time. Baseline dynamics drive full alignment; egoistic empowerment suppresses consensus.
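A minimal sketch of an order parameter, assuming the standard Vicsek polarization (the magnitude of the mean unit heading vector); the exact definition used for the plot is not given here, and the function name polar_order is hypothetical.

```python
import numpy as np

def polar_order(headings):
    """Magnitude of the mean unit heading vector: 1 for full alignment, near 0 for no net direction."""
    headings = np.asarray(headings, dtype=float)
    return float(np.abs(np.exp(1j * headings).mean()))

aligned = np.zeros(125)                                       # every agent shares one heading
bands = np.concatenate([np.zeros(60), np.full(60, np.pi)])    # two equal, opposing bands
order_aligned = polar_order(aligned)
order_bands = polar_order(bands)
```

Under this definition a fully aligned flock scores 1, while two equally sized opposing bands cancel and score close to 0, which is consistent with egoistic empowerment suppressing consensus.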
Two pendulums are coupled by an elastic tendon. Each agent controls only its own hinge torque. The empowerment-driven policy produces distinct behavioral regimes depending on the relative strength of each agent.
Four possible outcomes. When one agent is stronger, it dominates (Left Up / Right Up). When strengths are comparable, the agents can either cooperate (Both Up) or both fail to reach upright.
@inproceedings{anonymous2026multiagentempowerment,
title = {Multi-Agent Empowerment and Emergence of Complex Behavior in Groups},
author = {Anonymous Authors},
booktitle = {Conference on Lifelong Learning Agents (CoLLAS)},
year = {2026},
}