AGENTS · Research Report

Emergent Social Collaboration in Multi-Agent LLM Systems: A Mars Colony Simulation Study

A two-layer architecture for studying emergent collaborative behavior in multi-agent LLM systems.

Authors
Yevheniy Chuba · ARESA
Institution
YoreAI
Date
2026-04
Status
Pre-Zenodo
Strong Friendships: 3.5 ± 2.8
Construction Sites Done: 3 / 3
Unique Workers / Site: 7.3
Cost Ceiling / Session: $0.007

Abstract

We present a two-layer architecture for studying emergent collaborative behavior in multi-agent LLM systems. The lower layer is a deterministic rule substrate governing agent needs, pairwise bond dynamics, action selection, and a hard cost ceiling. The upper layer is an LLM-powered dialogue and decision module (GPT-4o-mini / Claude Haiku) that produces natural speech but does not drive the emergent social or task statistics.

Reproducible claims in this paper come from the lower layer, which ships as experiments/run.py and runs in ~3 seconds with stdlib Python.

Keywords: Multi-Agent LLMs, Emergent Behavior, GPT-4o-mini, Claude Haiku, Autonomous Agents, Reproducible Rigs


Motivation

Recent work (Park et al., 2023; Wang et al., 2024) shows that LLM-powered agents can develop emergent collaborative patterns when given personalities, contextual awareness, and freedom to act autonomously. The open question is not whether such patterns emerge, but which are artifacts of the specific LLM and which are properties of the agent architecture itself. If every number in a paper depends on which model version was called, results are unverifiable.

Research Questions

  1. Under a shared rule substrate, does a random initial role mix produce reproducible strong-friendship and tension counts?
  2. Does self-organized construction emerge without explicit task assignment — and how many unique workers does a single site attract?
  3. Can the system hold all agents' needs above breach thresholds for the entire session?
  4. What does the work-vs-social action distribution look like under a need-biased role-weighted policy?

System Architecture

Each of the 10 colonists is an autonomous agent with:

  • Role + personality — one of commander / scientist / builder / engineer / miner / medic; traits used by the LLM layer but not by the rule substrate.
  • Needs — energy, social, and purpose, each in [0, 100], with decay rates of 0.020, 0.015, and 0.010 per tick respectively.
  • Bonds — symmetric pair score. Gain is scaled by a per-pair affinity multiplier derived from role compatibility plus random jitter; pairs with affinity above a pivot of 0.9 gain bond, and pairs below it lose bond.
  • Action — one of working / socializing / resting / building / walking, chosen by a need-biased role-weighted selector.
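
The agent state listed above can be sketched in a few lines of stdlib Python. This is a minimal illustration, not the code in experiments/run.py: the decay rates, the pivot, and the action names come from the text, while the `Agent` class, the starting need value of 100, the linear bond update `base_gain * (affinity - PIVOT)`, and the need-bias terms in `choose_action` are all assumptions made for the example.

```python
import random
from dataclasses import dataclass, field

# Values taken from the text.
DECAY = {"energy": 0.020, "social": 0.015, "purpose": 0.010}  # per tick
PIVOT = 0.9
ACTIONS = ("working", "socializing", "resting", "building", "walking")


@dataclass
class Agent:
    name: str
    role: str
    # Assumption: needs start full; the text only gives the [0, 100] range.
    needs: dict = field(default_factory=lambda: {k: 100.0 for k in DECAY})


def decay_needs(agent):
    """Apply one tick of decay, clamping each need to [0, 100]."""
    for need, rate in DECAY.items():
        agent.needs[need] = min(100.0, max(0.0, agent.needs[need] - rate))


def update_bond(bonds, a, b, affinity, base_gain=1.0):
    """Assumed rule: the bond moves by base_gain * (affinity - PIVOT)."""
    key = tuple(sorted((a.name, b.name)))  # same key for (a, b) and (b, a)
    bonds[key] = bonds.get(key, 0.0) + base_gain * (affinity - PIVOT)
    return bonds[key]


def choose_action(agent, role_weights, rng):
    """Need-biased, role-weighted pick (the bias terms are assumptions)."""
    weights = {action: role_weights.get(action, 1.0) for action in ACTIONS}
    # The lower a need, the more weight its restoring action receives.
    weights["resting"] += (100.0 - agent.needs["energy"]) / 10.0
    weights["socializing"] += (100.0 - agent.needs["social"]) / 10.0
    weights["working"] += (100.0 - agent.needs["purpose"]) / 10.0
    return rng.choices(ACTIONS, weights=[weights[a] for a in ACTIONS], k=1)[0]


rng = random.Random(0)  # seeded generator keeps the run deterministic
builder = Agent("ava", "builder")
miner = Agent("bo", "miner")
bonds = {}
decay_needs(builder)
update_bond(bonds, builder, miner, affinity=1.1)  # above the pivot: bond rises
update_bond(bonds, miner, builder, affinity=0.5)  # below the pivot: bond falls
action = choose_action(builder, {"building": 3.0, "working": 2.0}, rng)
```

Keying the bond table on the sorted pair of names is one simple way to keep the score symmetric, since both argument orders update the same entry.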

Experimental Setup

  • Platform: Apple M2 Pro, 32 GB RAM, Python 3.13, stdlib only
  • Configuration: 10 agents × 10,000 ticks (≈30 simulated days) × 10 seeds
  • Wall-clock: ≈3 seconds per full run
  • No network calls, no NumPy, no PyTorch
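
A quick back-of-the-envelope check shows why holding needs above threshold over a full session is not trivial. The sketch below uses the decay rates and tick count stated in this report; the starting value of 100 for each need is an assumption, since only the [0, 100] range is given.

```python
# Values from the text.
DECAY = {"energy": 0.020, "social": 0.015, "purpose": 0.010}  # per tick
TICKS = 10_000
SIM_DAYS = 30  # approximate

START = 100.0  # assumption: every need starts at the top of its range

ticks_per_day = TICKS / SIM_DAYS
# Total points each need would lose over a session with no replenishment.
total_decay = {need: rate * TICKS for need, rate in DECAY.items()}
# Ticks until a need starting at START would reach zero with no replenishment.
ticks_to_empty = {need: START / rate for need, rate in DECAY.items()}

for need in DECAY:
    print(f"{need}: loses {total_decay[need]:.0f} points per session, "
          f"empty after {ticks_to_empty[need]:.0f} ticks")
print(f"about {ticks_per_day:.0f} ticks per simulated day")
```

Under these assumptions, energy would lose 200 points over a session and reach zero after 5,000 ticks, and social would reach zero after roughly 6,667 ticks, so needs can only stay in range if the action policy replenishes them.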

Key Findings

Social dynamics (mean ± std across 10 seeds)

| Category | Value per session |
|---|---|
| Strong friendships (bond > 50) | 3.5 ± 2.8 |
| Working relationships (15 < bond ≤ 50) | 14.6 ± 2.3 |
| Neutral relationships | 23.3 ± 2.7 |
| Tensions (bond < −15) | 3.6 ± 1.6 |
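
The four category means sum to 45.0, which matches the 45 unordered pairs among 10 agents. The following sketch shows one way such per-session counts could be produced and aggregated. The thresholds for strong, working, and tension come from the categories above; treating everything else as neutral, using the sample standard deviation, and the function names are assumptions, and the bond scores are toy values, not results.

```python
from statistics import mean, stdev

CATEGORIES = ("strong", "working", "neutral", "tension")


def classify(bond):
    """Map a bond score to a category using the thresholds from the text."""
    if bond > 50:
        return "strong"
    if bond > 15:
        return "working"
    if bond < -15:
        return "tension"
    return "neutral"  # assumption: the remaining range, -15 <= bond <= 15


def count_categories(bond_scores):
    """Count how many pair scores fall into each category for one session."""
    counts = {category: 0 for category in CATEGORIES}
    for score in bond_scores:
        counts[classify(score)] += 1
    return counts


def summarize(per_seed_counts):
    """Mean and sample standard deviation of each category across seeds."""
    summary = {}
    for category in CATEGORIES:
        values = [counts[category] for counts in per_seed_counts]
        summary[category] = (mean(values), stdev(values))
    return summary


# Toy bond scores for two hypothetical seeds (illustrative only).
seed_a = count_categories([62.0, 30.0, 16.0, 0.0, -20.0])
seed_b = count_categories([55.0, 51.0, 20.0, -3.0, -40.0])
summary = summarize([seed_a, seed_b])
```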

Task coordination

| Metric | Value |
|---|---|
| Construction sites completed | 3 / 3 every session |
| Unique workers per site (mean) | 7.3 |
| Stable-needs fraction | 100% |
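
As a rough illustration of how these two metrics might be computed from a session log, consider the sketch below. The log shapes, the function names, and the definition of the stable-needs fraction (the share of sampled need states in which every need is above the breach threshold) are assumptions; the report does not state the threshold value, so it is left as a parameter.

```python
def unique_workers_per_site(build_log):
    """Count distinct agents per site.

    build_log is assumed to be an iterable of (tick, agent_id, site_id).
    """
    workers = {}
    for _tick, agent_id, site_id in build_log:
        workers.setdefault(site_id, set()).add(agent_id)
    return {site: len(agents) for site, agents in workers.items()}


def stable_needs_fraction(need_samples, threshold):
    """Fraction of samples whose lowest need is above the threshold."""
    samples = list(need_samples)
    if not samples:
        raise ValueError("need_samples must not be empty")
    stable = sum(1 for needs in samples if min(needs.values()) > threshold)
    return stable / len(samples)


# Toy data (illustrative only).
log = [(1, "a1", "s1"), (2, "a2", "s1"), (3, "a1", "s1"), (4, "a3", "s2")]
per_site = unique_workers_per_site(log)

samples = [
    {"energy": 80.0, "social": 60.0, "purpose": 90.0},
    {"energy": 5.0, "social": 70.0, "purpose": 85.0},
]
fraction = stable_needs_fraction(samples, threshold=10.0)
```

Using a set per site means an agent who returns to the same site several times is still counted once.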

Action distribution

| Action | Share of agent-ticks |
|---|---|
| Working | 49.2% |
| Socializing | 20.0% |
| Resting | 16.7% |
| Walking | 10.7% |
| Building | 3.5% |
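
With 10 agents and 10,000 ticks, a session contains 100,000 agent-ticks, so the 3.5% building share corresponds to roughly 3,500 agent-ticks. The reported shares add up to 100.1%, presumably because each is rounded to one decimal place. A minimal sketch of the tally follows, assuming a flat log with one action label per agent-tick; the function name and log shape are hypothetical.

```python
from collections import Counter


def action_shares(action_log):
    """Percentage of agent-ticks spent on each action."""
    counts = Counter(action_log)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {action: 100.0 * n / total for action, n in counts.items()}


# Shares reported in the table above.
reported = {
    "working": 49.2,
    "socializing": 20.0,
    "resting": 16.7,
    "walking": 10.7,
    "building": 3.5,
}
reported_total = sum(reported.values())  # about 100.1 because of rounding

agent_ticks = 10 * 10_000  # agents x ticks per session, from the setup
building_ticks = reported["building"] / 100.0 * agent_ticks

# Toy log (illustrative only).
toy_shares = action_shares(["working", "working", "resting", "building"])
```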

Cost envelope

Dialogue events are rate-limited to 100 per session. At a GPT-4o-mini average cost of $7 × 10⁻⁵ per call, the analytic ceiling is 100 × $7 × 10⁻⁵ = $0.007 per session, regardless of prompt content.
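
The ceiling follows from capping the number of calls rather than metering tokens. A minimal sketch of such a cap is shown below, using the 100-event limit and the $7 × 10⁻⁵ average cost from the text; the `DialogueBudget` class and its method names are hypothetical and are not taken from experiments/run.py.

```python
MAX_DIALOGUE_EVENTS = 100   # per session, from the text
COST_PER_CALL_USD = 7e-5    # GPT-4o-mini average, from the text


class DialogueBudget:
    """Hard cap on the number of dialogue calls in one session."""

    def __init__(self, max_events=MAX_DIALOGUE_EVENTS,
                 cost_per_call=COST_PER_CALL_USD):
        self.max_events = max_events
        self.cost_per_call = cost_per_call
        self.used = 0

    def try_spend(self):
        """Return True and record a call if budget remains, else False."""
        if self.used >= self.max_events:
            return False
        self.used += 1
        return True

    @property
    def spent_usd(self):
        """Estimated spend so far at the average per-call cost."""
        return self.used * self.cost_per_call


budget = DialogueBudget()
# Attempt more calls than the cap allows; the extras are refused.
granted = sum(1 for _ in range(150) if budget.try_spend())
ceiling_usd = MAX_DIALOGUE_EVENTS * COST_PER_CALL_USD
```

Because the per-call figure is an average, the dollar ceiling is an estimate at that average; the cap on the number of calls is what holds unconditionally.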

Reproduce

cd genass/publications/quarto/mars_colony_collaboration
uv run python experiments/run.py

Outputs land in experiments/results/ as flat JSON; the paper pulls its numbers from data/simulation_results.json.

Live Demo

The full LLM-enabled system runs in the browser at /future/gaming — watch emergent collaboration play out in 3D.

How to cite

Pre-Zenodo · no DOI yet
@techreport{chuba2026emergent,
  title        = {Emergent Social Collaboration in Multi-Agent LLM Systems: A Mars Colony Simulation Study},
  author       = {Chuba, Yevheniy and {ARESA}},
  year         = {2026},
  month        = {04},
  institution  = {YoreAI},
  url          = {https://aresalab.com/publications/mars-colony-collaboration},
  note         = {Accessed via Aresalab}
}
