
Scenario Engine

Turn "11 decisions and done" into 2000+ decision epics that make people screenshot the dashboard and share it on Twitter.

Status: Design
Authors: OpenSpawn Team
Last Updated: 2026-02-11


  1. Overview & Philosophy
  2. Architecture
  3. SCENARIO.md File Format
  4. Core Concepts
  5. Decision Math
  6. Industry Scenarios
  7. Integration with Deterministic Engine
  8. Dashboard Visualization
  9. Implementation Phases

The current deterministic engine (tools/sandbox/src/deterministic.ts) processes a single order through ~11 decision steps:

COO receives order → parse 3 tasks → hire 3 leads (3 decisions) →
delegate 3 tasks (3 decisions) → workers progress → complete

It's a straight line. Real organizations are a tangle. We need to model that tangle.

BikiniBottom is SimCity for agent organizations. SimCity isn't fun because a single house gets built. It's fun because a thousand things happen simultaneously (traffic jams, power outages, budget crises, zoning disputes) and you watch the city respond. The Scenario Engine is what generates that emergent complexity.

A scenario should feel like watching a real organization work. Not every decision is dramatic; most are routine. But routines compound into patterns, patterns create bottlenecks, bottlenecks force hard choices, and hard choices are where the drama lives.

  1. Scenarios are data, not code. A SCENARIO.md file defines what happens. The engine interprets it. Non-programmers should be able to write scenarios.

  2. Deterministic core, stochastic texture. The engine is a state machine. Random events use seeded PRNGs: same seed, same run. Replay is sacred.

  3. Friction is the feature. Dependencies, contention, review loops, interrupts: these aren't bugs. They're what makes organizations interesting. The engine should generate realistic friction, not artificial delays.

  4. Visual density matters. Every decision should produce visible activity on the dashboard: a node lights up, a message flies across the org chart, a task moves on the board, a metric ticks. Dead air is death.

  5. 15–30 minutes, not 15 seconds. Scenarios unfold at a pace that rewards watching. Like a timelapse of a city being built, you should be able to sit back and watch the organization work through a complex problem.

  6. Replayable with variation. Same scenario, different seed = different path. Same structure, different emergent behavior. People should want to run it again to see what happens.
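
Principles 2 and 6 both rest on a seeded PRNG. The following is a minimal sketch of that idea, assuming a mulberry32-style generator; the design does not name an algorithm, and `createPrng` is a hypothetical helper rather than part of the existing engine.

```typescript
// Hypothetical helper: a small seeded PRNG (mulberry32 variant).
// Assumption: the design only requires "same seed, same run"; any seeded
// generator with that property would serve equally well.
function createPrng(seed: number): () => number {
  let state = seed >>> 0; // keep the state as an unsigned 32-bit integer
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Two generators built from the same seed replay the same sequence.
const firstRun = createPrng(42);
const secondRun = createPrng(42);
console.log(firstRun() === secondRun()); // prints true
```

Because the generator is the only source of randomness, storing the seed alongside a run would be enough to replay it.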


┌─────────────────────────────────────────────────────────┐
│                       SCENARIO.md                       │
│   (template: phases, epics, events, decision weights)   │
└────────────────────────┬────────────────────────────────┘
                         │ parse
                         ▼
┌─────────────────────────────────────────────────────────┐
│                  Scenario Engine (NEW)                  │
│                                                         │
│   ┌──────────┐  ┌───────────┐  ┌──────────────────┐     │
│   │ Phase    │  │ Task      │  │ Event            │     │
│   │ Manager  │  │ Generator │  │ Scheduler        │     │
│   └──────────┘  └───────────┘  └──────────────────┘     │
│   ┌──────────┐  ┌───────────┐  ┌──────────────────┐     │
│   │ DAG      │  │ Resource  │  │ Decision         │     │
│   │ Resolver │  │ Allocator │  │ Evaluator        │     │
│   └──────────┘  └───────────┘  └──────────────────┘     │
│   ┌───────────────────────────────────────────────┐     │
│   │ Narrative Engine (branching + flavor text)    │     │
│   └───────────────────────────────────────────────┘     │
└────────────────────────┬────────────────────────────────┘
                         │ feeds
                         ▼
┌─────────────────────────────────────────────────────────┐
│           Deterministic Simulation (EXISTING)           │
│  DeterministicSimulation.runTick(): processes agents    │
│  per tick, emits SandboxEvents and ACPMessages          │
└────────────────────────┬────────────────────────────────┘
                         │ emits
                         ▼
┌─────────────────────────────────────────────────────────┐
│                  Dashboard (EXISTING)                   │
│  Org chart · Task board · Message stream · Metrics      │
└─────────────────────────────────────────────────────────┘

| Component | Responsibility |
| --- | --- |
| Phase Manager | Tracks scenario phase, evaluates phase transitions, unlocks new work |
| Task Generator | Expands epic templates into concrete tasks/subtasks on-demand |
| Event Scheduler | Fires random and scripted events at appropriate times |
| DAG Resolver | Tracks dependencies between tasks, blocks/unblocks as predecessors complete |
| Resource Allocator | Models agent availability, handles contention, forces prioritization |
| Decision Evaluator | Applies weighted random outcomes to decision points (reviews, approvals, resource choices) |
| Narrative Engine | Generates flavor text, tracks branching story arcs, names events for dashboard display |

A SCENARIO.md defines a reusable scenario template. Combined with an ORG.md (which defines who works here), it defines what work they do.

# Scenario Name
## Meta
## Phases
## Epics
## Events
## Resources
## Scoring

Scenario identity and configuration parameters.

## Meta
- **Industry:** AI Dev Agency
- **Duration:** 20 minutes
- **Target decisions:** 1500
- **Tick interval:** 800ms
- **Seed:** random
- **Difficulty:** normal
- **Description:** A fast-growing AI agency takes on three client projects
simultaneously while fighting fires, shipping features, and trying not
to burn out the team.

| Field | Type | Description |
| --- | --- | --- |
| Industry | string | Category tag for filtering/grouping |
| Duration | duration | Target wall-clock runtime (engine adjusts tick pacing) |
| Target decisions | number | Approximate decision count (engine calibrates generation) |
| Tick interval | duration | Base time between ticks (can be overridden by phase) |
| Seed | number \| "random" | PRNG seed for reproducibility |
| Difficulty | easy \| normal \| hard \| chaos | Adjusts event frequency, review rejection rates, resource scarcity |
| Description | text | Shown on scenario select screen |

Difficulty presets:

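| Difficulty | Event frequency | Review rejection % | Resource scarcity | Block chance |
| --- | --- | --- | --- | --- |
| easy | 1 per 30 ticks | 5% | none | 5% |
| normal | 1 per 15 ticks | 15% | light | 10% |
| hard | 1 per 8 ticks | 25% | heavy | 20% |
| chaos | 1 per 4 ticks | 35% | extreme | 30% |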

Phases are the macro-structure of a scenario. Each phase unlocks new epics, changes event probabilities, and may alter the simulation's tick speed.

## Phases
### Phase 1: Setup (ticks 1–50)
The team assembles. Leads are hired, initial tasks assigned.
Client kickoff meetings happen. Requirements are gathered.
- **Tick range:** 1–50
- **Unlocks epics:** Client Onboarding, Infrastructure Setup
- **Events enabled:** team-sick, requirement-change
- **Tick interval override:** 600ms (fast; setup should feel snappy)
- **Transition:** all "Setup" epics at 80%+ completion
### Phase 2: Sprint 1 (ticks 51–200)
First real work sprint. Multiple workstreams running in parallel.
Dependencies start to bite. First blockers emerge.
- **Tick range:** 51–200
- **Unlocks epics:** API Development, Frontend Build, Security Audit
- **Events enabled:** p0-bug, client-escalation, scope-creep, team-sick
- **Transition:** 3+ epics at "done" status
### Phase 3: Crunch (ticks 201–350)
Deadline approaching. Resource contention peaks. Hard trade-offs.
- **Tick range:** 201–350
- **Unlocks epics:** Launch Prep, Performance Optimization, Documentation
- **Events enabled:** all
- **Difficulty modifier:** 1.5 (events fire 50% more often)
- **Transition:** "Launch Prep" epic at 100% OR tick 350
### Phase 4: Launch (ticks 351–400)
Ship it. Final reviews, deploy, monitor, celebrate (or patch).
- **Tick range:** 351–400
- **Unlocks epics:** Deployment, Post-Launch Monitoring
- **Events enabled:** deploy-failure, production-bug, client-feedback
- **Transition:** scenario complete

Phase transitions can be:

  • Tick-based: phase starts at tick N regardless
  • Completion-based: phase starts when conditions are met (e.g., "3 epics done")
  • Event-based: phase starts when a specific event fires (e.g., "battle-begins" event triggers Phase 3)
  • Hybrid: earliest of tick threshold OR completion condition

Epics are templates for large bodies of work. Each epic expands into tasks and subtasks at generation time.

## Epics
### Client Onboarding
- **Phase:** Setup
- **Domain:** operations, engineering
- **Priority:** critical
- **Generates:** 3–5 tasks, 2–4 subtasks each
- **Dependencies:** none (entry point)
- **Description:** New client needs: contract review, environment provisioning,
requirements doc, kickoff meeting, access setup.
#### Task Templates
1. **Contract Review** [finance]
- Review terms → Negotiate changes → Final sign-off
- Duration: 4–6 ticks per subtask
- Review required: yes (L7+)
2. **Environment Provisioning** [engineering]
- Create repos → Set up CI/CD → Configure staging
- Duration: 3–5 ticks per subtask
- Dependencies: Contract Review (approved)
3. **Requirements Document** [engineering]
- Draft requirements → Client review → Revisions → Sign-off
- Duration: 5–8 ticks per subtask
- Review loop: 1–3 iterations (weighted: 60% pass first time, 30% one revision, 10% two revisions)
4. **Kickoff Meeting** [operations]
- Prepare agenda → Schedule → Run meeting → Distribute notes
- Duration: 2–3 ticks per subtask
- Cross-dept trigger: on completion → unlock "API Development" epic
5. **Access Setup** [security]
- Create accounts → Set permissions → Audit access → Sign-off
- Duration: 2–4 ticks per subtask
- Dependencies: Contract Review (approved)

Epic template fields:

| Field | Description |
| --- | --- |
| Phase | Which phase unlocks this epic |
| Domain | Which department(s) own this work |
| Priority | Base priority (can be elevated by events) |
| Generates | Task/subtask count ranges (randomized within range) |
| Dependencies | Other epics or tasks that must complete first |
| Description | Context for flavor text generation |

Task template fields:

| Field | Description |
| --- | --- |
| [domain] | Domain tag in brackets; routes to correct department |
| Subtask list | Named subtasks in order |
| Duration | Tick range per subtask (randomized) |
| Review required | Whether this task needs review before "done" |
| Review loop | How many iterations of review are expected (weighted distribution) |
| Dependencies | Tasks that must complete before this one starts |
| Cross-dept trigger | Events to fire on completion |

Events inject chaos, drama, and realism. They interrupt normal flow and force the organization to respond.

## Events
### p0-bug
- **Type:** interrupt
- **Probability:** 0.08 per tick (during enabled phases)
- **Cooldown:** 20 ticks (can't fire again within)
- **Priority elevation:** critical
- **Effect:**
- Creates 1 critical task: "Fix [random system] outage"
- Pulls senior engineer off current work (preempt)
- Generates 4–6 subtasks: investigate, reproduce, fix, test, deploy, postmortem
- Cross-dept: notify support ("we're aware, ETA incoming")
- Cross-dept: notify client if client-facing
- **Dashboard flavor:** 🚨 flashing alert, org chart highlights affected agents in red
- **Narrative:** "[Agent] discovered a critical bug in [system]. All hands on deck."
### client-escalation
- **Type:** interrupt
- **Probability:** 0.05 per tick
- **Cooldown:** 30 ticks
- **Effect:**
- Elevates 1 random in-progress task to critical
- Creates task: "Client sync call" [operations, 3 ticks]
- COO gets escalation message
- If during Phase 3+: also creates "Scope negotiation" task
- **Narrative:** "Client [name] is unhappy with progress on [task]. Emergency meeting called."
### team-sick
- **Type:** disruption
- **Probability:** 0.03 per tick
- **Cooldown:** 40 ticks
- **Duration:** 15–25 ticks
- **Effect:**
- 1 random non-lead agent becomes unavailable
- Their in-progress tasks go to "blocked" (reason: "assignee unavailable")
- Manager must reassign or wait
- If the sick agent is a bottleneck (only person in domain), generates escalation chain
- **Narrative:** "[Agent] is out sick. [Manager] is redistributing their workload."
### scope-creep
- **Type:** expansion
- **Probability:** 0.04 per tick
- **Cooldown:** 25 ticks
- **Effect:**
- Adds 2–4 new tasks to a random in-progress epic
- New tasks have dependencies on existing work
- If during Phase 3+: triggers deadline-pressure event
- **Narrative:** "New requirements just came in: [generated requirement]. Adding to the backlog."
### deadline-pressure
- **Type:** modifier
- **Probability:** 0.0 (triggered by other events or phase transitions)
- **Effect:**
- Reduces tick duration per subtask by 30% (agents work faster but quality drops)
- Review rejection rate increases by 10%
- Morale metric decreases
- After 30 ticks: either deadline met (celebrate) or missed (consequences)
- **Narrative:** "Two weeks until launch. [COO] is cutting scope."
### surprise-opportunity
- **Type:** expansion
- **Probability:** 0.02 per tick
- **Cooldown:** 60 ticks
- **Effect:**
- New client appears with a small engagement
- Creates 1 new epic with 3–4 tasks
- Competes for resources with existing work
- Successful completion awards bonus credits
- **Narrative:** "Inbound lead: [Company] wants a quick prototype. Big potential if we impress them."

Event types:

| Type | Description | Dashboard effect |
| --- | --- | --- |
| interrupt | Demands immediate attention, preempts current work | 🚨 Flash alert, red highlights |
| disruption | Removes or modifies resources | ⚠️ Agent goes grey, tasks redistribute |
| expansion | Adds new work to the scenario | 📋 New tasks appear on board |
| modifier | Changes simulation parameters | 🔧 Metric shifts visible |
| narrative | Pure story beat, no mechanical effect | 💬 Story event in timeline |
| opportunity | Optional beneficial event, but costs resources | ✨ Gold highlight, optional accept |

Resources model scarcity and contention: things agents need but can't always get.

## Resources
### Senior Engineering Time
- **Type:** agent-hours
- **Pool:** 2 agents × 1 task-slot each = 2 concurrent
- **Contention rule:** FIFO with priority override (critical tasks preempt)
- **Starvation alert:** if any task waits > 10 ticks for this resource
### QA Capacity
- **Type:** agent-hours
- **Pool:** 1 agent × 2 task-slots = 2 concurrent reviews
- **Bottleneck effect:** when queue > 4, review duration increases 50%
### Client Meeting Slots
- **Type:** calendar
- **Pool:** 2 per phase (client only meets twice per phase)
- **Effect:** tasks requiring client sign-off must wait for a slot
### Budget
- **Type:** credits
- **Pool:** 5000 credits for scenario
- **Burn rate:** ~15 credits/tick during active work
- **Alert:** at 20% remaining, triggers "budget-crunch" event
- **Depleted:** non-critical tasks pause, critical only
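
The Budget resource implies two milestones that can be worked out ahead of time. A rough sketch, assuming the burn rate stays constant at the quoted ~15 credits/tick and no credits are earned during the run; `budgetMilestones` and its field names are hypothetical.

```typescript
interface BudgetMilestones {
  alertTick: number;    // first tick at which the remaining pool is at or below the alert level
  depletedTick: number; // first tick at which the pool is empty
}

// Assumes a constant burn rate, which the scenario only approximates
// ("~15 credits/tick during active work").
function budgetMilestones(
  pool: number,
  burnPerTick: number,
  alertFraction: number,
): BudgetMilestones {
  const alertThreshold = pool * alertFraction; // credits left when the alert fires
  return {
    alertTick: Math.ceil((pool - alertThreshold) / burnPerTick),
    depletedTick: Math.ceil(pool / burnPerTick),
  };
}

const sampleBudget = budgetMilestones(5000, 15, 0.2);
console.log(sampleBudget); // { alertTick: 267, depletedTick: 334 }
```

Under these assumptions the pool would run out before a 400-tick scenario ends, so either the average burn is lower than 15 credits/tick (idle ticks) or credits earned during the run have to make up the difference.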

How well did the organization perform?

## Scoring
### Dimensions (each 0–100)
- **Velocity:** tasks completed per tick, weighted by priority
- **Quality:** (1 - review rejection rate) × 100
- **Efficiency:** credits earned / credits spent ratio
- **Resilience:** how quickly org recovered from events (ticks-to-recover)
- **Morale:** f(overwork, idle-time, escalation-frequency, blocked-duration)
- **Deadline:** % of deadline-sensitive tasks completed on time
### Overall Score
weighted_average(velocity=20, quality=25, efficiency=15, resilience=20, morale=10, deadline=10)
### Grades
- **S:** 90–100 - "Legendary org. Screenshot this."
- **A:** 80–89 - "Well-oiled machine."
- **B:** 70–79 - "Solid. Room to optimize."
- **C:** 60–69 - "Growing pains. Needs restructuring."
- **D:** 50–59 - "Dysfunction junction."
- **F:** <50 โ€” "Total organizational collapse."
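
The overall score and grade can be sketched as follows. The weights and grade bands come from the Scoring section; rounding to the nearest integer before grading is an assumption (the bands leave gaps such as 89.5), and the function names are hypothetical.

```typescript
type Dimension =
  | "velocity" | "quality" | "efficiency"
  | "resilience" | "morale" | "deadline";

// Weights as listed under "Overall Score"; they sum to 100.
const SCORE_WEIGHTS: Record<Dimension, number> = {
  velocity: 20, quality: 25, efficiency: 15,
  resilience: 20, morale: 10, deadline: 10,
};

function overallScore(scores: Record<Dimension, number>): number {
  let weighted = 0;
  let totalWeight = 0;
  for (const dim of Object.keys(SCORE_WEIGHTS) as Dimension[]) {
    weighted += scores[dim] * SCORE_WEIGHTS[dim];
    totalWeight += SCORE_WEIGHTS[dim];
  }
  return weighted / totalWeight;
}

// Assumption: the score is rounded before it is matched against the bands.
function gradeFor(score: number): string {
  const rounded = Math.round(score);
  if (rounded >= 90) return "S";
  if (rounded >= 80) return "A";
  if (rounded >= 70) return "B";
  if (rounded >= 60) return "C";
  if (rounded >= 50) return "D";
  return "F";
}

// Dimension values taken from the sample final score card in this document.
const cardScore = overallScore({
  velocity: 82, quality: 91, efficiency: 73,
  resilience: 85, morale: 78, deadline: 88,
});
console.log(Math.round(cardScore), gradeFor(cardScore)); // 84 A
```

This reproduces the "A (84/100)" shown on the sample score card: the weighted sum is 8,370 over a total weight of 100, or 83.7.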

Three levels of work decomposition, each with different decision characteristics:

Epic: "Build Payment System" [1 decision: create + delegate to lead]
├── Task: "Design API Schema" [3 decisions: create + assign + ack]
│   ├── Subtask: "Research payment providers" [7 decisions: assign, ack, progress×2, review, revise, complete]
│   ├── Subtask: "Draft OpenAPI spec" [6 decisions: assign, ack, progress, review, complete, trigger]
│   └── Subtask: "Security review of spec" [5 decisions: assign, ack, review, feedback, complete]
├── Task: "Implement Backend" [3 decisions: create + assign + ack]
│   ├── Subtask: "Stripe integration" [7 decisions]
│   ├── Subtask: "Webhook handlers" [6 decisions]
│   ├── Subtask: "Transaction ledger" [7 decisions]
│   └── Subtask: "Unit tests" [5 decisions]
└── Task: "Frontend Integration" [3 decisions: blocked until API done → unblock = 1 decision]
    ├── Subtask: "Payment form component" [6 decisions]
    ├── Subtask: "Checkout flow" [7 decisions]
    └── Subtask: "E2E tests" [5 decisions]

Decision accounting per subtask:

| Step | Decisions | Messages generated |
| --- | --- | --- |
| Lead creates subtask | 1 | none |
| Lead assigns to worker | 1 | delegation ACP |
| Worker acknowledges | 1 | ack ACP |
| Worker progresses (1–3 updates) | 1–3 | progress ACP(s) |
| Worker submits for review | 1 | progress ACP (pct=80) |
| Reviewer evaluates | 1 | none |
| If revision needed: feedback + rework + resubmit | 3 | escalation ACP + progress ACP |
| Reviewer approves | 1 | completion ACP |
| Worker marks complete | 1 | completion ACP |
| Cross-dept trigger fires (if applicable) | 1 | delegation ACP |
| Total per subtask | 8–12 typical (up to 14 with a revision loop and a trigger) | 6–11 messages |
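
The per-subtask range can be checked by summing the rows of the table. A small sketch, treating the revision loop and the cross-department trigger as optional steps and assuming at most one revision loop; the type and constant names are hypothetical.

```typescript
interface DecisionStep {
  step: string;
  min: number; // decisions when the step is skipped or minimal
  max: number; // decisions when the step occurs at full size
}

const SUBTASK_STEPS: DecisionStep[] = [
  { step: "lead creates subtask", min: 1, max: 1 },
  { step: "lead assigns to worker", min: 1, max: 1 },
  { step: "worker acknowledges", min: 1, max: 1 },
  { step: "worker progresses (1-3 updates)", min: 1, max: 3 },
  { step: "worker submits for review", min: 1, max: 1 },
  { step: "reviewer evaluates", min: 1, max: 1 },
  { step: "revision loop (optional)", min: 0, max: 3 },
  { step: "reviewer approves", min: 1, max: 1 },
  { step: "worker marks complete", min: 1, max: 1 },
  { step: "cross-dept trigger (optional)", min: 0, max: 1 },
];

function decisionRange(steps: DecisionStep[]): { min: number; max: number } {
  return steps.reduce(
    (acc, s) => ({ min: acc.min + s.min, max: acc.max + s.max }),
    { min: 0, max: 0 },
  );
}

console.log(decisionRange(SUBTASK_STEPS)); // { min: 8, max: 14 }
```

The floor of 8 matches the lower end of the 8–12 range used for D in the decision math; the ceiling of 14 is only reached when a subtask has three progress updates, a revision loop, and a trigger.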

Tasks form a directed acyclic graph. Dependencies create realistic workflow friction.

               ┌──────────────┐
               │ Requirements │
               │   Document   │
               └──────┬───────┘
                      │
          ┌───────────┼───────────┐
          ▼           ▼           ▼
     ┌──────────┐ ┌────────┐ ┌─────────┐
     │ API Spec │ │   UX   │ │  Infra  │
     │  Design  │ │ Mocks  │ │  Setup  │
     └────┬─────┘ └───┬────┘ └────┬────┘
          │           │           │
          ▼           │           │
     ┌─────────┐      │           │
     │ Backend │◄─────┘           │
     │  Build  │◄─────────────────┘
     └────┬────┘
          │
     ┌────┴──────┐
     ▼           ▼
┌─────────┐ ┌──────────┐
│Frontend │ │ Security │
│  Build  │ │  Audit   │
└────┬────┘ └────┬─────┘
     │           │
     ▼           ▼
  ┌──────────────────┐
  │   Integration    │
  │     Testing      │
  └────────┬─────────┘
           │
           ▼
  ┌──────────────────┐
  │      Deploy      │
  └──────────────────┘

DAG mechanics in the engine:

interface TaskDependency {
  taskId: string;         // the dependent task
  dependsOn: string[];    // predecessor task IDs
  type:
    | "finish-to-start"   // predecessor must be done
    | "start-to-start"    // predecessor must have started
    | "partial";          // predecessor at 50%+ triggers start
  blockedSince?: number;  // tick when this dependency started blocking
}
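
One way the DAG Resolver might evaluate these dependency types is sketched below. It assumes each task's progress is tracked as a fraction from 0 to 1 and that "started" means any progress above 0; both are simplifications, and `isUnblocked` is a hypothetical name. The interface is repeated so the sketch stands alone.

```typescript
interface TaskDependency {
  taskId: string;
  dependsOn: string[];
  type: "finish-to-start" | "start-to-start" | "partial";
  blockedSince?: number;
}

// progress: task ID -> completion fraction (0 = not started, 1 = done).
// Unknown predecessors are treated as not started.
function isUnblocked(dep: TaskDependency, progress: Map<string, number>): boolean {
  return dep.dependsOn.every((id) => {
    const pct = progress.get(id) ?? 0;
    switch (dep.type) {
      case "finish-to-start":
        return pct >= 1;   // predecessor must be done
      case "start-to-start":
        return pct > 0;    // predecessor must have started
      case "partial":
        return pct >= 0.5; // predecessor at 50%+
    }
  });
}

const progressNow = new Map<string, number>([
  ["api-spec", 1],
  ["ux-mocks", 0.5],
  ["infra-setup", 0],
]);
const backendBuild: TaskDependency = {
  taskId: "backend-build",
  dependsOn: ["api-spec", "ux-mocks", "infra-setup"],
  type: "finish-to-start",
};
console.log(isUnblocked(backendBuild, progressNow)); // false: two predecessors are unfinished
```

A task with several predecessors stays blocked until every one of them satisfies the rule, which is what makes the Backend Build node in the diagram a natural bottleneck.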

Dashboard effect: Dependencies show as connecting lines between task cards. Blocked tasks pulse softly. When a predecessor completes, the line turns green and the dependent task lights up: a visible "unlock" animation.

Not all reviews pass. Not all plans survive contact with reality.

### Decision: Code Review
- **Approve (pass):** 70% - task advances to done
- **Request changes (minor):** 20% - task returns to in_progress, 2–3 tick rework
- **Reject (major issues):** 8% - task returns to in_progress, 5–8 tick rework
- **Escalate (out of scope):** 2% - task escalated to manager, possible reassignment
### Decision: Client Sign-Off
- **Approve:** 60%
- **Approve with conditions:** 25% - creates 1–2 new subtasks
- **Request major revisions:** 12% - epic adds 1 new task
- **Reject direction:** 3% - epic resets current phase, 30% work lost
### Decision: Resource Contention
- **First-come-first-served:** 50% - whoever asked first gets the resource
- **Priority override:** 30% - higher priority task preempts
- **Manager intervention:** 15% - manager manually assigns
- **Deadlock:** 5% - both tasks blocked, escalation to COO
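
A weighted decision like these can be resolved from a single PRNG roll. The sketch below is one possible approach, not the Decision Evaluator's actual code; `pickOutcome` and the outcome labels are hypothetical, and the weights are the Code Review percentages listed above.

```typescript
interface WeightedOutcome {
  outcome: string;
  weight: number; // relative weight; percentages work but are not required
}

// roll is a float in [0, 1), typically drawn from the scenario's seeded PRNG.
function pickOutcome(outcomes: WeightedOutcome[], roll: number): string {
  const total = outcomes.reduce((sum, o) => sum + o.weight, 0);
  let remaining = roll * total;
  let last = "";
  for (const o of outcomes) {
    last = o.outcome;
    remaining -= o.weight;
    if (remaining < 0) return o.outcome;
  }
  return last; // only reached through floating-point rounding
}

const codeReview: WeightedOutcome[] = [
  { outcome: "approve", weight: 70 },
  { outcome: "request-changes", weight: 20 },
  { outcome: "reject", weight: 8 },
  { outcome: "escalate", weight: 2 },
];

console.log(pickOutcome(codeReview, 0.5));  // approve
console.log(pickOutcome(codeReview, 0.75)); // request-changes
console.log(pickOutcome(codeReview, 0.95)); // reject
```

Normalizing by the total weight means a difficulty preset could scale individual weights without having to keep them summing to 100.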

Difficulty scaling:

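| Decision | Easy | Normal | Hard | Chaos |
| --- | --- | --- | --- | --- |
| Review pass rate | 85% | 70% | 55% | 40% |
| Client approval | 80% | 60% | 40% | 25% |
| Block chance | 5% | 10% | 20% | 30% |
| Event frequency | low | medium | high | extreme |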

When work in one department creates work in another: the organizational multiplier.

triggers:
  - when: "API Spec Design" completes
    then:
      - create_task: "Build Frontend Components" in frontend
      - create_task: "Write API Documentation" in marketing
      - create_task: "Security Review: API Surface" in security
    message: "API spec approved. Frontend, docs, and security work auto-created."
  - when: any task in "security" rejects review
    then:
      - block: the reviewed task
      - create_task: "Security Remediation" in engineering (critical)
      - notify: COO
    message: "Security found issues. Engineering must remediate before proceeding."
  - when: "Client Onboarding" epic completes
    then:
      - phase_transition: "Sprint 1"
      - create_epic: generate from "Sprint Work" template
      - event: "celebration" (narrative)
    message: "Client is onboarded! Sprint 1 begins."

The multiplier effect: A single task completion can cascade into 3–5 new tasks across departments. This is how scenarios naturally generate 1000+ decisions: not by having 1000 pre-defined tasks, but by having ~50 tasks that generate more tasks through triggers.

Multiple workstreams competing for scarce resources create organic drama.

     Sprint 1                              Sprint 2
┌─────────────────┐                  ┌─────────────────┐
│ Client A:       │                  │ Client B:       │
│ Payment System  │                  │ Analytics Dash  │
│                 │                  │                 │
│ Needs: Sandy    │──── CONFLICT ────│ Needs: Sandy    │
│ (3 ticks)       │                  │ (5 ticks)       │
└─────────────────┘                  └─────────────────┘

Resolution options:
1. Sandy works Client A first (B waits 3 ticks)
2. Sandy works Client B first (A waits 5 ticks)
3. Sandy splits time (both take 60% longer)
4. Hire another senior engineer (costs credits, takes 5 ticks to onboard)
5. Escalate to COO for priority call

Contention creates visible drama on the dashboard: You see two task cards pulsing, both wanting the same agent. The agent's node on the org chart flashes between two colors. Eventually, a manager makes a call, and one task turns red (blocked) while the other turns green (proceeding). Real organizational drama, played out visually.

Events are the heartbeat of scenario drama. Without events, work flows smoothly, and smoothly is boring.

Event scheduling algorithm:

for each tick:
  for each enabled event in current phase:
    if event.cooldown has elapsed:
      roll = prng.next()
      adjusted_probability = event.probability × difficulty_modifier × phase_modifier
      if roll < adjusted_probability:
        fire_event(event)
        set cooldown
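
The scheduling loop above could look roughly like this in TypeScript for a single tick. This is a sketch under assumptions: the `ScheduledEvent` shape, the `lastFired` map, and the function name are hypothetical, and the cooldown is interpreted as "at least `cooldown` ticks since the last firing".

```typescript
// Hypothetical event shape; only id, probability and cooldown are modeled here.
interface ScheduledEvent {
  id: string;
  probability: number; // base chance per tick, e.g. 0.08 for p0-bug
  cooldown: number;    // minimum ticks between two firings
}

function eventsToFire(
  tick: number,
  enabled: ScheduledEvent[],
  lastFired: Map<string, number>, // event id -> tick of last firing
  difficultyModifier: number,
  phaseModifier: number,
  nextRoll: () => number,         // seeded PRNG returning a float in [0, 1)
): string[] {
  const fired: string[] = [];
  for (const ev of enabled) {
    const last = lastFired.get(ev.id);
    if (last !== undefined && tick - last < ev.cooldown) continue; // still cooling down
    const roll = nextRoll();
    // Clamping to 1 is an added safeguard, not something the algorithm above specifies.
    const adjusted = Math.min(1, ev.probability * difficultyModifier * phaseModifier);
    if (roll < adjusted) {
      fired.push(ev.id);
      lastFired.set(ev.id, tick); // start the cooldown
    }
  }
  return fired;
}

const p0Bug: ScheduledEvent = { id: "p0-bug", probability: 0.08, cooldown: 20 };
const history = new Map<string, number>();
const alwaysLow = () => 0.01; // stand-in for a PRNG, forces a hit
console.log(eventsToFire(1, [p0Bug], history, 1, 1, alwaysLow));  // [ 'p0-bug' ]
console.log(eventsToFire(10, [p0Bug], history, 1, 1, alwaysLow)); // [] (cooldown)
```

As in the pseudocode, a roll is only drawn for events that are off cooldown, so the number of PRNG calls per tick depends on state; replays stay deterministic as long as the event order is stable.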

Event chaining: Events can trigger other events. A p0-bug during deadline-pressure triggers emergency-triage. A team-sick event when only 1 engineer remains triggers critical-understaffing. This creates emergent narrative arcs that differ each run.

Deadlines aren't just numbers. They create organizational pressure that changes behavior.

### Deadline: Client Demo (Tick 250)
**Required completions:**
- API endpoints (all critical paths)
- Frontend demo flow (happy path)
- Sample data loaded
**As deadline approaches:**
- Tick 200 (50 remaining): status check, scope assessment
- Tick 220 (30 remaining): if behind, trigger "scope-cut" event
- COO must choose which features to drop
- Dropped features create "deferred" tasks (debt for later)
- Tick 240 (10 remaining): crunch mode
- All non-critical tasks paused
- Available agents reassigned to deadline work
- Review standards relaxed (faster approvals, higher risk)
- Tick 250: evaluation
- If met: celebration event, client satisfaction +20, unlock new epic
- If missed: client-escalation event, reputation hit, recovery tasks created

Phases give scenarios a narrative arc: a beginning, middle, and climax.

Phase mechanics:

interface ScenarioPhase {
  id: string;
  name: string;
  tickRange: [number, number];  // [start, end]; boundaries are flexible
  tickInterval?: number;        // override base tick speed
  unlocksEpics: string[];       // epic IDs that become available
  enabledEvents: string[];      // event IDs active during this phase
  difficultyMod: number;        // multiplier on event probability (1.0 = normal)
  transition: PhaseTransition;  // when does this phase end?
  narrative: string;            // displayed on phase start
  ambientMessages: string[];    // random chatter during this phase
}

type PhaseTransition =
  | { type: "tick"; tick: number }
  | { type: "completion"; condition: string }             // e.g., "3 epics done"
  | { type: "event"; eventId: string }
  | { type: "hybrid"; tick: number; condition: string };  // whichever comes first
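
A possible evaluation of the `PhaseTransition` union is sketched below. The `TransitionContext` shape and the `isConditionMet` callback are assumptions (the design does not say how condition strings such as "3 epics done" are parsed); the union is repeated so the sketch stands alone.

```typescript
type PhaseTransition =
  | { type: "tick"; tick: number }
  | { type: "completion"; condition: string }
  | { type: "event"; eventId: string }
  | { type: "hybrid"; tick: number; condition: string };

// Assumed runtime snapshot; none of these field names come from the design.
interface TransitionContext {
  currentTick: number;
  firedEvents: Set<string>;
  // Hypothetical evaluator for condition strings such as "3 epics done".
  isConditionMet: (condition: string) => boolean;
}

function shouldTransition(t: PhaseTransition, ctx: TransitionContext): boolean {
  switch (t.type) {
    case "tick":
      return ctx.currentTick >= t.tick;
    case "completion":
      return ctx.isConditionMet(t.condition);
    case "event":
      return ctx.firedEvents.has(t.eventId);
    case "hybrid":
      // Whichever comes first: the tick threshold or the completion condition.
      return ctx.currentTick >= t.tick || ctx.isConditionMet(t.condition);
  }
}

const midSprint: TransitionContext = {
  currentTick: 180,
  firedEvents: new Set<string>(),
  isConditionMet: (condition) => condition === "3 epics done",
};
console.log(shouldTransition({ type: "hybrid", tick: 200, condition: "3 epics done" }, midSprint)); // true
```

Using a discriminated union lets the compiler check that every transition type is handled when a new one is added.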

Decisions compound. Each run tells a different story.

                       ┌──── Accept opportunity ────┐
                       │     (strain resources)     │
Setup ──── Sprint ─────┤                            ├──── Resolution
                       │    (focus on core work)    │
                       └──── Decline opportunity ───┘

If accepted:
├── Success: bonus credits, reputation boost, unlocks "Partnership" epic
└── Failure: missed deadline, client anger, recovery phase inserted
If declined:
├── Core work ships on time: solid but unspectacular finish
└── Competitor takes the opportunity: "what if" narrative beat

Branch tracking: The engine maintains a storyState object, a key-value map that records which branches were taken. Events and phases can check story state to conditionally activate:

### Event: Competitor Wins Client
- **Condition:** storyState["opportunity-declined"] == true
- **Probability:** 0.4 (fires once)
- **Effect:** narrative only: "Meanwhile, [Competitor] landed the [Opportunity] deal."
- **Dashboard:** news ticker shows the missed opportunity
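
The story-state gate on such an event might be checked as in the sketch below. It assumes boolean flags only and a single required key per event; `ConditionalEvent`, `requires`, and `mayFire` are hypothetical names.

```typescript
type StoryState = Record<string, boolean>;

interface ConditionalEvent {
  id: string;
  requires?: string;   // hypothetical: storyState key that must be true
  probability: number; // chance of firing once the condition holds
  once: boolean;       // "fires once"
}

function mayFire(
  ev: ConditionalEvent,
  story: StoryState,
  alreadyFired: Set<string>,
  roll: number, // float in [0, 1) from the seeded PRNG
): boolean {
  if (ev.once && alreadyFired.has(ev.id)) return false;
  if (ev.requires !== undefined && story[ev.requires] !== true) return false;
  return roll < ev.probability;
}

const competitorWins: ConditionalEvent = {
  id: "competitor-wins-client",
  requires: "opportunity-declined",
  probability: 0.4,
  once: true,
};
const story: StoryState = { "opportunity-declined": true };
console.log(mayFire(competitorWins, story, new Set<string>(), 0.1)); // true
console.log(mayFire(competitorWins, {}, new Set<string>(), 0.1));    // false: branch not taken
```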

Real-time metrics visible on the dashboard, final score on scenario completion.

Per-tick metrics:

| Metric | Computation | Dashboard widget |
| --- | --- | --- |
| Active tasks | count(status ∈ {assigned, in_progress, review}) | Number badge |
| Throughput | completed tasks in last 20 ticks | Sparkline chart |
| Message rate | ACP messages in last 10 ticks | Pulse indicator |
| Block rate | blocked / total active | Color indicator (green → red) |
| Budget burn | credits spent / credits total | Progress bar |
| Agent utilization | busy agents / total agents | Percentage gauge |
| Escalation rate | escalations / total decisions in last 20 ticks | Warning indicator |

Final score card:

╔══════════════════════════════════════════════╗
║  SCENARIO COMPLETE: "AI Dev Agency Sprint"   ║
║                                              ║
║  Overall Grade: A (84/100)                   ║
║                                              ║
║  Velocity:   ████████░░  82                  ║
║  Quality:    █████████░  91                  ║
║  Efficiency: ███████░░░  73                  ║
║  Resilience: ████████░░  85                  ║
║  Morale:     ████████░░  78                  ║
║  Deadline:   █████████░  88                  ║
║                                              ║
║  Decisions: 1,847  |  Ticks: 412             ║
║  Agents: 24        |  Messages: 2,340        ║
║  Events survived: 14                         ║
║                                              ║
║  Story: You took on the extra client and     ║
║  barely made both deadlines. Sandy worked    ║
║  overtime for 40 ticks straight. Patrick     ║
║  surprisingly saved Sprint 2 by fixing the   ║
║  payment bug nobody else could figure out.   ║
║                                              ║
║  [Share on Twitter]  [Run Again]  [Try Hard] ║
╚══════════════════════════════════════════════╝

How do we guarantee 1000+ decisions from a scenario template?

Given:
  P = number of phases (typically 4)
  E = epics per phase (typically 3–5)
  T = tasks per epic (typically 3–5)
  S = subtasks per task (typically 2–4)
  D = decisions per subtask (typically 8–12)

Base decisions = P × E × T × S × D

Conservative: 4 × 3 × 3 × 2 × 8 = 576
Normal: 4 × 4 × 4 × 3 × 10 = 1,920
Rich: 4 × 5 × 5 × 4 × 12 = 4,800
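
The base formula is a plain product, which makes it easy to check a template before running it. A small sketch; `ScenarioShape` and `baseDecisions` are hypothetical names, and the three inputs are the ones used in the worked figures above.

```typescript
interface ScenarioShape {
  phases: number;              // P
  epicsPerPhase: number;       // E
  tasksPerEpic: number;        // T
  subtasksPerTask: number;     // S
  decisionsPerSubtask: number; // D
}

// Base decisions = P × E × T × S × D
function baseDecisions(s: ScenarioShape): number {
  return s.phases * s.epicsPerPhase * s.tasksPerEpic * s.subtasksPerTask * s.decisionsPerSubtask;
}

const conservative = baseDecisions({
  phases: 4, epicsPerPhase: 3, tasksPerEpic: 3, subtasksPerTask: 2, decisionsPerSubtask: 8,
});
const normal = baseDecisions({
  phases: 4, epicsPerPhase: 4, tasksPerEpic: 4, subtasksPerTask: 3, decisionsPerSubtask: 10,
});
const rich = baseDecisions({
  phases: 4, epicsPerPhase: 5, tasksPerEpic: 5, subtasksPerTask: 4, decisionsPerSubtask: 12,
});
console.log(conservative, normal, rich); // 576 1920 4800
```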

Additional decisions from organizational friction:

| Source | Decisions per occurrence | Occurrences per scenario | Total |
| --- | --- | --- | --- |
| Dependency blocks/unblocks | 3 (block + reassess + unblock) | 20–40 | 60–120 |
| Review rejections + rework | 5 (reject + feedback + rework + resubmit + re-review) | 15–30 | 75–150 |
| Resource contention | 4 (conflict + escalation + resolution + reassign) | 10–20 | 40–80 |
| Cross-dept triggers | 6 (trigger + create + assign + ack + notify + log) | 15–25 | 90–150 |
| Hiring/onboarding | 4 (decide + hire + assign mentor + first task) | 5–15 | 20–60 |
| Escalation chains | 6 (escalate + manager review + resolution × levels) | 10–20 | 60–120 |
| Friction total | | | 345–680 |

Each random event generates its own decision tree:

| Event type | Decisions generated | Frequency (normal) | Total |
| --- | --- | --- | --- |
| P0 bug | 20–30 (investigate + fix + test + deploy + postmortem) | 2–4 per scenario | 40–120 |
| Client escalation | 10–15 (meeting + reprioritize + communicate) | 3–5 per scenario | 30–75 |
| Team sick | 8–12 (reassign + redistribute + backfill) | 2–4 per scenario | 16–48 |
| Scope creep | 15–20 (new tasks + replan + negotiate) | 2–3 per scenario | 30–60 |
| Opportunity | 25–35 (evaluate + accept/decline + execute) | 1–2 per scenario | 25–70 |
| Event total | | | 141–373 |

Minimum scenario (easy, 15 min):
  Base:       576
  Friction:   345
  Events:     141
  ─────────────────
  Total:    1,062 ✓ (exceeds 1000)

Standard scenario (normal, 20 min):
  Base:     1,920
  Friction:   500
  Events:     250
  ─────────────────
  Total:    2,670

Rich scenario (hard, 30 min):
  Base:     4,800
  Friction:   680
  Events:     373
  ─────────────────
  Total:    5,853

The engine self-calibrates: If a scenario is running ahead of its decision target, it reduces event frequency. If it's running behind, it increases event frequency and adds more review friction. The target is a smooth, consistent pace, roughly 4–5 decisions per tick for the sample scenarios (for example, 1,800 target decisions over 400 ticks).
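
One way to picture the self-calibration is a multiplier applied to event probability, derived from how far the run is ahead of or behind a linear schedule. This is only a sketch: the linear schedule, the inverse-ratio rule, and the clamp to [0.5, 2] are all assumptions, and `calibrationMultiplier` is a hypothetical name.

```typescript
// Returns a multiplier for event probability: below 1 when the run is ahead
// of its decision target, above 1 when it is behind.
function calibrationMultiplier(
  decisionsSoFar: number,
  tick: number,
  totalTicks: number,
  targetDecisions: number,
): number {
  if (tick <= 0) return 1; // nothing to calibrate against yet
  // Assumed schedule: decisions accumulate linearly over the scenario.
  const expected = targetDecisions * (tick / totalTicks);
  const ratio = decisionsSoFar / expected; // > 1 means ahead of target
  // Assumed response: inverse of the ratio, clamped to avoid wild swings.
  return Math.min(2, Math.max(0.5, 1 / ratio));
}

console.log(calibrationMultiplier(900, 200, 400, 1800));  // 1 (on pace)
console.log(calibrationMultiplier(1800, 200, 400, 1800)); // 0.5 (far ahead, fewer events)
console.log(calibrationMultiplier(450, 200, 400, 1800));  // 2 (behind, more events)
```

Because the multiplier depends only on simulation state, it keeps runs replayable: the same seed still produces the same adjustments.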


The meta-demo. OpenClaw showing off what OpenClaw can do.

ORG.md: The existing BikiniBottom org (Mr. Krabs, Sandy, SpongeBob, etc.)

# AI Dev Agency Sprint
## Meta
- **Industry:** AI Dev Agency
- **Duration:** 20 minutes
- **Target decisions:** 1800
- **Difficulty:** normal
- **Description:** BikiniBottom AI takes on two client projects and an internal
platform upgrade simultaneously. Ship features, fight fires, bill hours.
## Phases
### Phase 1: Client Intake (ticks 1–40)
New quarter, new clients. Mr. Krabs smells money.
- **Unlocks epics:** Client Alpha Onboarding, Client Beta Onboarding, Internal: Platform Upgrade
- **Events enabled:** requirement-change
- **Tick interval:** 500ms
- **Transition:** both onboarding epics at 60%+
### Phase 2: Parallel Sprints (ticks 41–200)
Three workstreams, one engineering team. The fun begins.
- **Unlocks epics:** Alpha: Model Evaluation Pipeline, Beta: Prompt Engineering Suite,
Internal: CI/CD Overhaul, Marketing: Case Study
- **Events enabled:** all
- **Transition:** 4+ epics done
### Phase 3: Demo Day Prep (ticks 201–320)
Client Alpha wants a demo. Client Beta wants a different demo. Both next week.
- **Unlocks epics:** Alpha: Demo Environment, Beta: Demo Environment, Cross-Client: Shared Infra
- **Events enabled:** all
- **Difficulty modifier:** 1.5
- **Transition:** both demo epics complete OR tick 320
### Phase 4: Ship & Celebrate (ticks 321–400)
Deploy to production, send invoices, write postmortem.
- **Unlocks epics:** Deployment, Billing, Retrospective
- **Events enabled:** deploy-failure, production-bug, client-feedback
- **Transition:** scenario complete
## Epics
### Client Alpha Onboarding
- **Phase:** Client Intake
- **Domain:** operations, engineering
- **Priority:** high
#### Task Templates
1. **Scope Definition** [operations]
   - Review RFP → Draft SOW → Client review → Revisions → Sign-off
   - Review loop: 1–2 iterations
   - Cross-dept: on sign-off → unlock "Model Evaluation Pipeline"
2. **Environment Setup** [engineering]
   - Provision GPU cluster → Configure model registry → Set up eval harness → Test pipeline
   - Dependencies: Scope Definition (signed)
3. **Data Pipeline** [engineering]
   - Audit client data → Build ingestion pipeline → Validate transforms → Load test
   - Duration: 4–7 ticks per subtask
### Alpha: Model Evaluation Pipeline
- **Phase:** Parallel Sprints
- **Domain:** engineering
- **Priority:** critical
#### Task Templates
1. **Eval Framework** [backend]
   - Design eval metrics → Implement scoring → Build comparison UI → Backtest
   - Cross-dept: on completion → create "Write Eval Methodology" in marketing
2. **Model Integration** [backend]
   - Integrate OpenAI API → Integrate Anthropic API → Integrate local models → A/B test harness
   - Duration: 3–5 ticks per subtask
   - Dependencies: Eval Framework (50%+)
3. **Client Dashboard** [frontend]
   - Model comparison view → Cost tracking → Latency charts → Export reports
   - Dependencies: Eval Framework (complete), Model Integration (started)
   - Cross-dept: on completion → Security Review
4. **Prompt Optimization** [engineering]
   - Baseline prompts → Systematic variation → Eval runs (×10) → Report best performers
   - This task generates 10 sub-subtasks (one per eval run), each a mini-decision
   - Duration: 2–3 ticks per eval run
### Internal: Platform Upgrade
- **Phase:** Client Intake
- **Domain:** engineering, security
- **Priority:** normal (but competes for resources with client work)
#### Task Templates
1. **Dependency Audit** [security]
   - Scan packages → Flag CVEs → Prioritize fixes → Document exceptions
2. **Upgrade Core** [backend]
   - Upgrade runtime → Update dependencies → Migration scripts → Integration tests
3. **Performance Baseline** [backend]
   - Benchmark before → Optimize bottlenecks → Benchmark after → Report
## Events
### model-api-outage
- **Type:** interrupt
- **Probability:** 0.06 per tick
- **Cooldown:** 30 ticks
- **Effect:**
  - All model-related tasks blocked for 5–10 ticks
  - Sandy must build fallback to local models (creates 2 emergency subtasks)
  - Client Alpha notified (creates comms task)
- **Narrative:** "🔥 OpenAI API is down. Eval pipeline halted. Sandy is wiring up local fallbacks."
### billing-dispute
- **Type:** interrupt
- **Probability:** 0.03 per tick
- **Effect:**
- Squilliam creates "Billing Reconciliation" task (finance)
- Mr. Krabs personally involved (his tasks paused for 5 ticks)
- If unresolved in 15 ticks: client-escalation trigger
- **Narrative:** "Client Beta is disputing last month's GPU charges. Mr. Krabs is NOT happy."
### intern-breaks-prod
- **Type:** interrupt
- **Probability:** 0.04 per tick (only once per scenario)
- **Effect:**
- Plankton Jr. accidentally pushes to production
- Creates critical "Rollback & Fix" task
- Karen initiates security audit of deploy permissions
- Generates 8 decisions: rollback, investigate, fix, review, redeploy, postmortem, update-perms, document
- **Narrative:** "🦠 Plankton Jr. pushed to prod. Again. Karen is adding deploy gates."
## Resources
### Senior Engineering Time
- **Pool:** SpongeBob + Patrick = 2 concurrent critical tasks
- **Contention:** Client Alpha vs Client Beta vs Internal
- **Starvation:** if any critical path waits > 8 ticks
### QA Capacity
- **Pool:** Gary = 1 agent, 2 task slots
- **Bottleneck:** Gary is the only QA. Everything funnels through Gary. 🐌
### GPU Budget
- **Pool:** 500 compute credits
- **Burn:** eval runs cost 5 credits each, training costs 20
- **Depleted:** eval work pauses, must negotiate with Mr. Krabs for more budget
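
The event definitions in the scenario above pair a per-tick probability with an optional cooldown, and the engine's principle is "same seed, same run". The sketch below shows one way to roll such events deterministically. The `EventRule` shape and `simulateEvents` are hypothetical, and `mulberry32` is a small, widely used seeded PRNG chosen here for illustration; the design does not say which PRNG the engine uses.

```typescript
// mulberry32: a compact seeded PRNG returning floats in [0, 1).
// Assumption: the engine's real PRNG is not specified by the design.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical shape mirroring the Probability / Cooldown fields of an event.
interface EventRule {
  id: string;
  probabilityPerTick: number;
  cooldownTicks: number; // 0 when the scenario lists no cooldown
  oncePerScenario?: boolean;
}

const agencyEvents: EventRule[] = [
  { id: "model-api-outage", probabilityPerTick: 0.06, cooldownTicks: 30 },
  { id: "billing-dispute", probabilityPerTick: 0.03, cooldownTicks: 0 },
  { id: "intern-breaks-prod", probabilityPerTick: 0.04, cooldownTicks: 0, oncePerScenario: true },
];

interface Firing {
  tick: number;
  id: string;
}

function simulateEvents(seed: number, ticks: number, rules: EventRule[]): Firing[] {
  const rand = mulberry32(seed);
  const lastFired = new Map<string, number>();
  const fired: Firing[] = [];
  for (let tick = 1; tick <= ticks; tick++) {
    for (const rule of rules) {
      // Always draw, so one event's cooldown never shifts another event's rolls.
      const roll = rand();
      const last = lastFired.get(rule.id);
      if (last !== undefined) {
        if (rule.oncePerScenario) continue;
        if (tick - last <= rule.cooldownTicks) continue;
      }
      if (roll < rule.probabilityPerTick) {
        lastFired.set(rule.id, tick);
        fired.push({ tick, id: rule.id });
      }
    }
  }
  return fired;
}

console.log(simulateEvents(42, 400, agencyEvents));
```

Drawing a random number for every event on every tick, even during a cooldown, is a deliberate choice in this sketch: it keeps each event's random stream aligned across runs, so editing one event's cooldown does not reshuffle when the others fire.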

THE viral scenario. Two rival reef civilizations in an all-out underwater war for territorial dominance. This is BikiniBottom's brand moment.

Why this scenario goes viral:

  • The org chart IS the military command structure, and watching generals coordinate is inherently dramatic
  • Messages between scouts and commanders feel like intercepted military communications
  • The fog of war mechanic means decisions are made with incomplete information
  • Two full org charts competing against each other: double the visual activity
  • People will root for their reef. They'll tweet "CORAL REEF IS WINNING" with dashboard screenshots
  • The phases (reconnaissance → skirmish → battle → resolution) create a natural story arc with escalating tension

ORG.md: Coral Reef Alliance 🪸

# Coral Reef Alliance
## Identity
The Coral Reef Alliance: defenders of the Great Reef.
A militaristic organization fighting to protect their territory
from the Kelp Forest Dominion's expansion.
- **Industry:** Military
- **Stage:** Active conflict
- **Values:** Defend the reef, protect civilians, strategic superiority
## Culture
preset: military
- **Escalation:** immediate (lives are at stake)
- **Progress updates:** every tick (full situational awareness)
- **Ack required:** yes (no order goes unconfirmed)
## Structure
### Admiral Nautilus – Commander-in-Chief 🐚
Supreme military commander of the Coral Reef Alliance.
Receives intelligence, makes strategic decisions, allocates forces.
Old, wise, cautious. Prefers siege warfare over direct assault.
- **Avatar:** 🐚
- **Domain:** Command
- **Reports to:** The Reef Council (Human Principal)
### Intelligence Division
Eyes and ears of the Alliance. Scouts, spies, signal interceptors.
#### Commander Eel – Intelligence Lead 🐍
Runs the spy network. Processes raw intel into actionable briefings.
- **Avatar:** 🐍
- **Domain:** Intelligence
#### Scout Fish Alpha – Field Scout 🐟
Fast, expendable, observant. Maps enemy positions.
- **Avatar:** 🐟
- **Domain:** Reconnaissance
- **Count:** 3
#### Octopus Agent – Spy 🐙
Deep cover agent in enemy territory. High-value, high-risk.
- **Avatar:** 🐙
- **Domain:** Espionage
### Battle Division
The fighting force. Organized in strike groups.
#### General Mantis Shrimp – Battle Commander 🦐
Hits harder than anything in the ocean. Commands all combat operations.
Aggressive, decisive, impatient with cautious strategies.
- **Avatar:** 🦐
- **Domain:** Combat
#### Captain Barracuda – Strike Group Alpha Lead 🐡
Fast assault specialist. Commands the primary attack force.
- **Avatar:** 🐡
- **Domain:** Assault
#### Warrior Crab – Heavy Infantry 🦀
Armored frontline fighters. Slow but nearly indestructible.
- **Avatar:** 🦀
- **Domain:** Infantry
- **Count:** 4
#### Jellyfish Swarm – Area Denial 🪼
Deploys stinging formations to control chokepoints.
- **Avatar:** 🪼
- **Domain:** Area Control
- **Count:** 2
### Engineering Corps
Builders and defenders. Coral fortifications, traps, supply routes.
#### Chief Engineer Turtle – Engineering Lead 🐢
Slow and steady. Builds the reef's defenses. Every wall is a masterpiece.
- **Avatar:** 🐢
- **Domain:** Fortification
#### Coral Builder – Construction Worker 🪸
Grows and shapes coral into defensive walls, watchtowers, and bunkers.
- **Avatar:** 🪸
- **Domain:** Construction
- **Count:** 3
#### Trap Specialist Pufferfish – Combat Engineer 🐡
Designs and deploys underwater mines, net traps, and ink clouds.
- **Avatar:** 🐡
- **Domain:** Traps
### Supply Corps
Keeps the army fed, armed, and moving.
#### Quartermaster Whale – Supply Lead 🐋
Manages logistics. Moves massive quantities of kelp rations
and shell ammunition across the reef.
- **Avatar:** 🐋
- **Domain:** Logistics
#### Supply Runner – Transport 🐠
Fast swimmers carrying supplies to front lines.
- **Avatar:** 🐠
- **Domain:** Transport
- **Count:** 3
### Diplomatic Corps
War isn't just fought with claws. Alliances, treaties, intelligence sharing.
#### Ambassador Dolphin – Diplomatic Lead 🐬
Charming, intelligent, and politically savvy. Negotiates alliances
with neutral reefs. Manages propaganda and morale.
- **Avatar:** 🐬
- **Domain:** Diplomacy
#### Messenger Seahorse – Diplomatic Courier 🐴
Carries sealed messages between allied reefs. Small, fast, discreet.
- **Avatar:** 🐴
- **Domain:** Communications
- **Count:** 2
### Medical Corps
Keeps fighters in the fight. Triage, recovery, morale.
#### Dr. Anemone – Chief Medical Officer 🌺
Field hospital commander. Pragmatic healer. "I can't fix stupid,
but I can fix the damage stupid causes."
- **Avatar:** 🌺
- **Domain:** Medical
#### Medic Cleaner Fish – Field Medic 🐟
Front-line medical support. Quick treatment under fire.
- **Avatar:** 🐟
- **Domain:** Field Medicine
- **Count:** 2

SCENARIO.md:

# Ocean Reef War: The Battle for the Abyssal Trench
## Meta
- **Industry:** Ocean Reef War
- **Duration:** 30 minutes
- **Target decisions:** 2500
- **Difficulty:** hard
- **Seed:** random
- **Description:** The Kelp Forest Dominion is expanding toward the Great Reef.
The Coral Reef Alliance must defend their territory through intelligence,
fortification, combat, diplomacy, and supply chain management.
Two full organizations clash in a 5-phase war that escalates from
reconnaissance to full-scale battle.
- **Mode:** adversarial (two orgs, AI-vs-AI)
## World State
### Territory Map (6×6 grid)

       A    B    C    D    E    F
    1 [🪸][🪸][🪸][ ? ][ ? ][ ? ]
    2 [🪸][🪸][ ? ][ ? ][ ? ][ ? ]
    3 [🪸][ ? ][ ? ][ ? ][ ? ][🌿]
    4 [ ? ][ ? ][ ? ][ ? ][🌿][🌿]
    5 [ ? ][ ? ][ ? ][🌿][🌿][🌿]
    6 [ ? ][ ? ][🌿][🌿][🌿][🌿]

- 🪸 = Coral Reef Alliance territory (known)
- 🌿 = Kelp Forest Dominion territory (known)
- ? = Unexplored / Fog of War
- The Abyssal Trench runs diagonally C3→D4 (key strategic chokepoint)
### Resources
- **Kelp Rations:** 500 units (feeds army; -2/tick per active combat unit)
- **Shell Ammunition:** 300 units (-1/tick per combat unit in battle)
- **Coral Building Material:** 200 units (fortifications cost 10–30 each)
- **Intel Points:** 0 (gained by scouts, spent on strategic decisions)
- **Morale:** 80/100 (drops on losses, rises on victories and rations)
- **Alliance Points:** 0/100 (diplomatic progress toward neutral reef alliance)
## Phases
### Phase 1: Reconnaissance (ticks 1–60)
The fog of war is thick. Both sides send scouts to map enemy positions.
Intelligence flows in fragments. Every revealed tile changes the strategic picture.
- **Unlocks epics:** Scout Deployment, Early Fortification, Supply Chain Setup, Diplomatic Outreach
- **Events enabled:** scout-ambush, false-intel, neutral-reef-contact, resource-discovery
- **Tick interval:** 700ms
- **Ambient:** scouts reporting coordinates, engineers discussing where to build walls,
supply runners inventorying rations
- **Transition:** 60%+ of map revealed OR tick 60
- **Music mood:** tense, quiet, anticipatory
#### Fog of War Mechanic
Each scout mission reveals 1–2 tiles. Each revealed tile is one of:
- Empty water (safe to traverse)
- Enemy outpost (immediate escalation to intelligence)
- Resource cache (kelp field or shell deposit; creates "secure resource" task)
- Neutral reef (diplomatic opportunity; creates diplomacy task)
- Ambush! (scout captured or killed; intelligence loss)

Intel flows: Scout → Commander Eel (analysis, 2 ticks) → Admiral Nautilus (decision, 1 tick).
Raw intel is unreliable: 15% chance of false information. Commander Eel can cross-reference
multiple scout reports to verify (costs 3 ticks but eliminates false intel).
### Phase 2: Fortification & Positioning (ticks 61–140)
Both sides know the map. Now they're digging in and preparing.
Engineers build walls. Supply chains are established. Diplomatic missions intensify.
- **Unlocks epics:** Defensive Line Construction, Forward Base, Alliance Negotiations,
Supply Route Optimization, Spy Infiltration
- **Events enabled:** all reconnaissance events + sabotage, supply-raid, desertion,
diplomatic-incident, weather-current-shift
- **Tick interval:** 800ms
- **Transition:** either side initiates aggression OR tick 140
#### Fortification Tasks
Each fortification is a mini-project:
- Survey location (scout, 2 ticks)
- Design structure (engineer, 3 ticks)
- Gather materials (supply, 4 ticks, costs coral resources)
- Build fortification (3 builders, 6 ticks)
- Install traps (trap specialist, 3 ticks)
- Garrison troops (battle division, ongoing)
A defensive line of 3 fortifications = 90+ decisions just from building.
### Phase 3: First Blood – Skirmishes (ticks 141–220)
Contact. Small engagements at the borders. Probing attacks.
Each skirmish generates tactical decisions and cascading consequences.
- **Unlocks epics:** Border Skirmish Alpha, Border Skirmish Beta, Casualty Management,
Propaganda Campaign, Emergency Resupply
- **Events enabled:** all + ambush, flanking-maneuver, morale-break, heroic-stand,
enemy-surrender, war-crime-report
- **Difficulty modifier:** 1.3
- **Tick interval:** 900ms
- **Transition:** cumulative casualties > threshold OR major victory event
#### Skirmish Mechanics
Each skirmish is a mini-scenario within the scenario:
  1. Detection (scout reports enemy movement) [3 decisions]
  2. Intel Assessment (Commander Eel evaluates threat) [2 decisions]
  3. Admiral Decision: engage / defend / retreat [1 decision, branching]
  4. Force Allocation (General assigns units) [4 decisions]
  5. Supply Check (Quartermaster confirms ammo/rations) [2 decisions]
  6. Engagement (3–8 ticks of combat, decisions per tick) [15–40 decisions]
    • Each tick: advance/hold/retreat per unit
    • Flanking opportunities (spend reserves?)
    • Casualty reports → medical dispatch
    • Ammo depletion → resupply request
    • Morale checks (hold or break?)
  7. Aftermath [8 decisions]
    • Casualty triage (medics)
    • Territory assessment (gained/lost/held)
    • Intel from captured enemies
    • Report to Admiral
    • Propaganda (spin the story for morale)

Total per skirmish: 35–60 decisions × 4–6 skirmishes in Phase 3 = 140–360 decisions
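
The per-skirmish total can be reproduced by summing the bracketed decision counts of the seven steps. The sketch below uses hypothetical names (`DecisionRange`, `SkirmishStep`) and assumes 5 decisions per combat tick, a figure inferred from "15–40 decisions" over "3–8 ticks" rather than stated directly.

```typescript
// Sketch only: names are hypothetical.
interface DecisionRange {
  min: number;
  max: number;
}

interface SkirmishStep {
  label: string;
  decisions: DecisionRange;
}

const fixed = (n: number): DecisionRange => ({ min: n, max: n });

// Assumption: 5 decisions per combat tick, inferred from 15–40 decisions over 3–8 ticks.
const DECISIONS_PER_COMBAT_TICK = 5;
const combatTicks: DecisionRange = { min: 3, max: 8 };

const skirmishSteps: SkirmishStep[] = [
  { label: "Detection", decisions: fixed(3) },
  { label: "Intel Assessment", decisions: fixed(2) },
  { label: "Admiral Decision", decisions: fixed(1) },
  { label: "Force Allocation", decisions: fixed(4) },
  { label: "Supply Check", decisions: fixed(2) },
  {
    label: "Engagement",
    decisions: {
      min: combatTicks.min * DECISIONS_PER_COMBAT_TICK,
      max: combatTicks.max * DECISIONS_PER_COMBAT_TICK,
    },
  },
  { label: "Aftermath", decisions: fixed(8) },
];

function sumSteps(steps: SkirmishStep[]): DecisionRange {
  let min = 0;
  let max = 0;
  for (const step of steps) {
    min += step.decisions.min;
    max += step.decisions.max;
  }
  return { min, max };
}

function multiply(a: DecisionRange, b: DecisionRange): DecisionRange {
  return { min: a.min * b.min, max: a.max * b.max };
}

const perSkirmish = sumSteps(skirmishSteps);
const phase3Skirmishes = multiply(perSkirmish, { min: 4, max: 6 });

console.log(perSkirmish, phase3Skirmishes);
```

This yields 35–60 decisions per skirmish and 140–360 across the 4–6 skirmishes of Phase 3, matching the totals stated above.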

### Phase 4: The Battle of the Abyssal Trench (ticks 221–340)
Full-scale war. Both sides commit everything. The Abyssal Trench is the prize.
This is the visual climax: the dashboard should be on fire.
- **Unlocks epics:** Grand Assault Plan, Trench Defense, Naval Blockade,
Alliance Reinforcements (if diplomatic success), Last Resort Weapons,
Civilian Evacuation
- **Events enabled:** all + critical-supply-failure, betrayal, secret-weapon,
natural-disaster (whirlpool), heroic-sacrifice, turning-point
- **Difficulty modifier:** 2.0
- **Tick interval:** 1000ms (slower ticks, more happens per tick)
- **Transition:** one side controls the Trench OR tick 340
#### The Grand Battle
Multiple simultaneous engagements:
- **Main assault** on the Trench (20+ agents involved)
- **Flanking maneuver** through the Deep Caves (risky, high reward)
- **Naval blockade** cutting enemy supply lines
- **Spy operation** to sabotage enemy command
- **Diplomatic emergency** to convince a neutral reef to intervene
Each of these is a concurrent epic generating 50–100 decisions.
The dashboard shows ALL of them happening at once: org chart ablaze
with activity, messages flying in every direction, resources depleting,
morale fluctuating, territory map updating tile by tile.
### Phase 5: Resolution (ticks 341–400)
The battle is decided. Now comes the aftermath.
- **Unlocks epics:** Ceasefire Negotiation, Territory Settlement,
Casualty Accounting, War Memorial, Post-War Reconstruction
- **Events enabled:** peace-offer, rebellion, refugee-crisis, war-hero-ceremony
- **Tick interval:** 600ms (denouement is faster)
- **Transition:** scenario complete
## Epics
### Scout Deployment
- **Phase:** Reconnaissance
- **Domain:** Intelligence
- **Priority:** critical
#### Task Templates
1. **Deploy Scout Team Alpha** [reconnaissance]
   - Brief scouts → Deploy to sectors B3,C3,D3 → Await reports → Analyze
   - Each sector reveal = 1 subtask with fog-of-war outcome
   - Duration: 3–5 ticks per sector
2. **Deploy Scout Team Beta** [reconnaissance]
   - Brief scouts → Deploy to sectors D2,E2,E3 → Await reports → Analyze
   - Duration: 3–5 ticks per sector
3. **Deep Reconnaissance** [espionage]
   - Brief Octopus Agent → Infiltrate enemy territory → Map command structure → Extract
   - Duration: 8–12 ticks total (high risk)
   - Decision: if detected, fight-or-flee (weighted: 30% escape clean, 40% escape with intel lost,
     20% captured, 10% heroic intelligence coup)
   - Cross-dept: on success → unlock "Spy Infiltration" epic in Phase 2
### Defensive Line Construction
- **Phase:** Fortification & Positioning
- **Domain:** Fortification, Construction
- **Priority:** high
#### Task Templates
1. **Trench Outer Wall** [construction]
   - Survey C2 → Design coral barrier → Gather 30 coral → Build wall (6 ticks) → Install spike traps
   - Resource cost: 30 coral, 10 shell (for trap spikes)
   - Cross-dept: on completion → unlock garrison assignment (battle division)
2. **Watchtower at B3** [construction]
   - Survey → Design → Gather 15 coral → Build tower (4 ticks) → Post lookout
   - Provides: +1 scout range (adjacent tiles auto-revealed)
   - Resource cost: 15 coral
3. **Minefield at D4** [traps]
   - Design mine pattern → Craft 20 sea mines → Deploy pattern → Map safe paths for allies
   - Resource cost: 20 shell
   - Risk: 10% chance of premature detonation during deployment (creates casualty event)
### Grand Assault Plan
- **Phase:** Battle of the Abyssal Trench
- **Domain:** Combat, Command
- **Priority:** critical
#### Task Templates
1. **Strategic Planning** [command]
   - Admiral reviews intel → War council (all division leads) → Choose strategy → Approve plan
   - Decision: 3 strategy options (weighted by intel quality + resource state):
     a) **Frontal Assault** (70% win if resources > 60%, 30% otherwise; high casualties)
     b) **Pincer Movement** (60% win; requires successful flanking epic; moderate casualties)
     c) **Siege & Starve** (80% win but takes 40+ more ticks; low casualties; risks enemy breakout)
   - Branch: chosen strategy changes which sub-epics unlock
2. **Force Deployment** [combat]
   - Assign units to positions → Distribute ammo → Final supply check → Confirm readiness
   - Creates 1 subtask per combat unit (8–12 subtasks)
   - Duration: 1–2 ticks each
3. **Execute Assault** [combat]
   - The big one. 20–40 ticks of active battle.
   - Each tick: 2–4 decisions (unit movements, engagement calls, resupply requests)
   - Special moments (randomly triggered):
     - "Heroic Stand": one unit holds against overwhelming odds (+20 morale)
     - "Critical Failure": key fortification falls (-15 morale, creates emergency)
     - "Turning Point": enemy commander makes a mistake (exploit or not?)
     - "Betrayal": if alliance was tenuous, ally might switch sides (devastating)
## Events
### scout-ambush
- **Type:** interrupt
- **Probability:** 0.10 per tick (Recon phase), 0.05 (other phases)
- **Cooldown:** 15 ticks
- **Effect:**
- 1 scout captured or eliminated
- Intel for that sector lost or corrupted
- Commander Eel must decide: send rescue mission (risky, 3 tasks) or write off the scout
- If rescued: morale +5, scout provides bonus intel
- If lost: morale -5, that sector stays in fog
- **Narrative:** "Scout Fish Alpha-2 has gone silent in sector D3. Last transmission was garbled."
### supply-raid
- **Type:** interrupt
- **Probability:** 0.06 per tick (Phase 2+)
- **Cooldown:** 25 ticks
- **Effect:**
  - Enemy raids a supply route
  - Lose 20–50 rations OR 10–30 ammo (random)
  - Supply Runner may be captured
  - Quartermaster must reroute supplies (creates 3 tasks)
  - If this is the 3rd supply raid: triggers "supply-crisis" cascading event
- **Narrative:** "🚨 Supply convoy ambushed in sector C4! 40 kelp rations lost. Quartermaster Whale is rerouting."
### shifting-alliance
- **Type:** narrative
- **Probability:** 0.03 per tick (Phase 2+)
- **Cooldown:** 40 ticks
- **Condition:** Alliance Points > 30
- **Effect:**
- Neutral reef makes a demand: "Send 100 rations as tribute or we ally with the enemy"
- Admiral must decide: pay (lose rations), negotiate (Ambassador task, 50% success), or refuse
- Pay: Alliance Points +30
- Negotiate success: Alliance Points +20, tribute reduced to 50
- Negotiate fail: Alliance Points -10
- Refuse: Alliance Points -20, risk neutral reef joining enemy
- **Narrative:** "🐬 The Deep Reef Confederation demands tribute. Ambassador Dolphin is drafting a counter-offer."
### natural-disaster-whirlpool
- **Type:** disruption
- **Probability:** 0.02 per tick (Phase 3+)
- **Cooldown:** 100 ticks (once per scenario essentially)
- **Effect:**
- Whirlpool forms at random map tile
- Any units/fortifications in adjacent tiles: 30% damaged, 10% destroyed
- Both sides affected: temporary ceasefire (5 ticks)
- Creates emergency tasks: rescue trapped units, repair fortifications
- Territory may shift (tiles revert to unexplored)
- **Narrative:** "🌊 WHIRLPOOL at D3! Both sides scrambling. Ceasefire declared while the ocean rearranges itself."
- **Dashboard:** map tiles swirl, affected units flash, then new layout revealed
### heroic-sacrifice
- **Type:** narrative
- **Probability:** 0.0 (triggered only during Phase 4 battle when morale < 40)
- **Effect:**
- One warrior unit volunteers for a suicide mission
- If accepted: that unit is lost, but deals massive damage to enemy position
- Morale +25 (the sacrifice inspires the troops)
- Creates "Memorial" task in Resolution phase
- Unlocks special ending: "Victory through sacrifice"
- **Narrative:** "Warrior Crab-3 volunteers for the impossible mission. 'For the Reef.' 🦀💀"
- **Dashboard:** The sacrificing agent's node pulses gold before fading to grey. A star icon appears on the territory map where they fell.
### secret-weapon
- **Type:** opportunity
- **Probability:** 0.0 (triggered at tick 250 if Engineering Corps has completed 80%+ of construction tasks)
- **Effect:**
- Chief Engineer Turtle reveals a prototype: the "Sonic Coral Cannon"
- Building it: 3 tasks, 15 ticks, costs 50 coral + 30 shell
- If built: can be deployed once; clears an entire map tile of enemy forces
- Game-changing but expensive. Admiral must weigh resource cost vs. tactical advantage.
- **Narrative:** "🐢 Chief Engineer Turtle has been working on something in secret.
'It's not pretty,' he says, 'but it'll change the war.' The Sonic Coral Cannon prototype is ready for review."
## Resources
### Kelp Rations
- **Starting:** 500
- **Burn rate:** 2/tick per active combat unit, 0.5/tick per non-combat agent
- **Resupply:** Supply Corps can create "Kelp Farming" task (generates 50 rations, takes 10 ticks)
- **Depleted:** Morale drops 5/tick, combat effectiveness halved
- **Dashboard:** Green bar, turns yellow at 30%, red at 15%
### Shell Ammunition
- **Starting:** 300
- **Burn rate:** 1/tick per unit in active combat
- **Resupply:** Supply Corps "Shell Gathering" task (generates 30 ammo, 8 ticks)
- **Depleted:** Combat units can only defend (no attacks)
- **Dashboard:** Orange bar with shell icons
### Coral Building Material
- **Starting:** 200
- **Burn rate:** only consumed by construction tasks
- **Resupply:** Engineering Corps "Coral Cultivation" task (generates 25 coral, 12 ticks)
- **Depleted:** No new fortifications
- **Dashboard:** Pink bar with coral icons
### Morale
- **Starting:** 80/100
- **Modifiers:**
- Victory in skirmish: +10
- Loss in skirmish: -15
- Scout lost: -5
- Heroic moment: +10–25
- Rations depleted: -5/tick
- Alliance secured: +15
- Betrayal: -30
- **Below 30:** units may desert (random check each tick)
- **Below 15:** organizational collapse; scenario ends in defeat
- **Dashboard:** Animated morale meter with soldier icons. At high morale, soldiers cheer. At low morale, they look defeated.
## Scoring
### Dimensions
- **Territory Control:** % of map tiles held at resolution
- **Force Preservation:** % of starting forces still active
- **Resource Efficiency:** resources remaining / resources consumed ratio
- **Intel Accuracy:** correct intel / total intel received
- **Diplomatic Success:** alliance points achieved
- **Speed:** ticks to reach resolution (fewer = better)
- **Morale:** final morale score
### Special Achievements
- 🏆 **Flawless Victory:** Won with 0 units lost
- 🕵️ **Spymaster:** Every intel report was verified correct
- 🤝 **Diplomat:** Secured alliance without paying tribute
- ⚡ **Blitzkrieg:** Won in under 300 ticks
- 🐢 **Fortress:** Won without losing a single fortification
- 💀 **Pyrrhic Victory:** Won but with < 20% forces remaining
- 🌊 **Survived the Whirlpool:** Recovered from natural disaster with no losses
- 🦀 **Remember Warrior Crab-3:** Won after triggering heroic sacrifice
### Twitter Card
On completion, generate a shareable summary card:

🐠⚔️ OCEAN REEF WAR: BATTLE COMPLETE

🪸 Coral Reef Alliance: VICTORY

Territory: ████████░░ 82%
Forces:    ██████░░░░ 64%
Morale:    █████████░ 87%

Grade: A (86/100)
Achievements: 🕵️🤝

"The Sonic Coral Cannon fired once. That was enough."

Decisions: 2,847 | Agents: 42
#BikiniBottom #OceanReefWar

The second ORG (the enemy, Kelp Forest Dominion) is auto-generated by mirroring the Coral Reef org with different names, flavors, and slight tactical biases (more aggressive, fewer diplomats, more combat units). The engine runs both orgs simultaneously, with decisions from one affecting the other through the shared territory map.
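
The resource rules in the scenario above (starting stock, per-tick burn, dashboard thresholds) translate into simple arithmetic. The following is a minimal sketch for Kelp Rations only; `ResourcePool`, `BurnRates`, and the helper functions are hypothetical names, not the engine's actual types. The example call counts all eight Battle Division agents as active combat units and the other 21 Coral Reef agents as non-combat, which is an assumption about how the ORG.md roles would be classified.

```typescript
// Sketch only: type and function names are hypothetical.
interface ResourcePool {
  label: string;
  starting: number;
  current: number;
}

interface BurnRates {
  perCombatUnit: number; // per tick, per active combat unit
  perNonCombatAgent: number; // per tick, per other agent
}

// Kelp Rations: 2/tick per active combat unit, 0.5/tick per non-combat agent.
const kelpBurn: BurnRates = { perCombatUnit: 2, perNonCombatAgent: 0.5 };

function burnPerTick(rates: BurnRates, combatUnits: number, nonCombatAgents: number): number {
  return rates.perCombatUnit * combatUnits + rates.perNonCombatAgent * nonCombatAgents;
}

// Apply one tick of consumption; the pool never goes below zero.
function applyBurn(pool: ResourcePool, burn: number): ResourcePool {
  return { ...pool, current: Math.max(0, pool.current - burn) };
}

type BarColor = "green" | "yellow" | "red";

// Dashboard thresholds from the scenario: yellow at 30%, red at 15%.
// Integer comparison avoids floating-point edge cases at the thresholds.
function barColor(pool: ResourcePool): BarColor {
  if (pool.current * 100 <= pool.starting * 15) return "red";
  if (pool.current * 100 <= pool.starting * 30) return "yellow";
  return "green";
}

function ticksUntilDepleted(pool: ResourcePool, burn: number): number {
  return burn <= 0 ? Infinity : Math.ceil(pool.current / burn);
}

const kelp: ResourcePool = { label: "Kelp Rations", starting: 500, current: 500 };
const fullWarBurn = burnPerTick(kelpBurn, 8, 21);

console.log(fullWarBurn, ticksUntilDepleted(kelp, fullWarBurn), barColor(applyBurn(kelp, 360)));
```

Under those assumptions the burn is 26.5 rations per tick, so the starting 500 would run out in about 19 ticks without "Kelp Farming" resupply. That suggests the burn rates are meant to force frequent resupply decisions rather than to last a whole phase.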


Every case is a branching tree. Perfect for the dependency engine.

# Legal Tech Firm: Quarterly Docket
## Meta
- **Industry:** Legal Tech
- **Duration:** 25 minutes
- **Target decisions:** 2000
- **Difficulty:** normal
- **Description:** A 20-person legal tech firm managing 4 active cases
simultaneously. Discovery, filings, compliance reviews, client comms.
Every case branches based on rulings, evidence discovered, and
opposing counsel's moves.
## Phases
### Phase 1: Case Intake (ticks 1–50)
New cases arrive. Conflict checks, engagement letters, initial research.
- **Unlocks epics:** Case Alpha: Patent Infringement, Case Beta: Data Breach Class Action,
Case Gamma: Regulatory Compliance Audit, Case Delta: Contract Dispute
- **Events enabled:** conflict-of-interest, rush-filing, new-evidence
- **Transition:** all intake tasks complete
### Phase 2: Discovery & Research (ticks 51–180)
The deep work. Document review, depositions, expert analysis.
Discovery is where the decisions multiply: every document reviewed is a decision.
- **Unlocks epics:** Alpha Discovery, Beta Discovery, Gamma Compliance Matrix, Delta Mediation Prep
- **Events enabled:** all
- **Transition:** 60%+ discovery complete across all cases
### Phase 3: Filing & Motions (ticks 181–300)
Court deadlines. Motions to file. Opposing counsel's responses.
Every filing can be contested, amended, or rejected.
- **Unlocks epics:** Alpha Motion for Summary Judgment, Beta Class Certification,
Gamma Regulatory Submission, Delta Settlement Negotiation
- **Events enabled:** all + court-ruling, judge-order, opposing-motion
- **Difficulty modifier:** 1.5
- **Transition:** all cases resolved or at trial stage
### Phase 4: Resolution (ticks 301–400)
Cases settle, go to trial, or get dismissed.
- **Unlocks epics:** case-specific resolution epics based on branching
- **Transition:** scenario complete
## Epics
### Case Alpha: Patent Infringement – Discovery
- **Phase:** Discovery & Research
- **Domain:** litigation, research
- **Priority:** high
#### Task Templates
1. **Document Collection** [research]
   - Identify custodians → Issue hold notices → Collect documents → Process for review
   - Generates: 200+ document-review subtasks (batch of 10 per task)
   - Each batch: 3 ticks, decision: relevant / privileged / responsive / junk
   - Cross-dept: privileged docs → trigger privilege log task
2. **Prior Art Search** [research]
   - Define search terms → Patent database search → Academic search → Analyze results
   - Decision point: prior art found (40%) → changes case strategy
   - Cross-dept: if found → create "Amend Complaint" task in litigation
3. **Expert Witness Engagement** [operations]
   - Identify experts → Conflict check → Engagement letter → Initial briefing
   - Resource contention: only 2 expert budget slots for 4 cases
   - Duration: 5–8 ticks per subtask
4. **Deposition Preparation** [litigation]
   - Review witness list → Prepare questions → Mock deposition → Final prep
   - Dependencies: Document Collection (70%+)
   - Decision: opposing counsel moves to quash (20%) → creates motion task
## Events
### court-ruling
- **Type:** narrative + interrupt
- **Probability:** 0.04 per tick (Phase 3+)
- **Effect:**
- Judge rules on a pending motion
- Outcomes (weighted): granted (40%), granted in part (30%),
denied (20%), denied with sanctions (10%)
- Each outcome creates different follow-up tasks
- "Denied with sanctions": critical; creates emergency compliance tasks + billing writedown
- **Narrative:** "⚖️ Judge Morrison ruled on the motion to compel: GRANTED IN PART.
Production deadline moved up 2 weeks."
### new-evidence
- **Type:** expansion
- **Probability:** 0.05 per tick
- **Effect:**
- Surprise evidence surfaces (whistleblower, opposing production, public records)
- Creates 5โ€“10 new document review tasks
- May change case strategy: decision point for lead attorney
- 20% chance: evidence is so significant it triggers settlement discussion
- **Narrative:** "📁 New evidence just dropped in Case Beta. 4,000 pages of internal emails.
Compliance team is scrambling."
### billing-audit
- **Type:** disruption
- **Probability:** 0.03 per tick
- **Effect:**
- Client questions billing on a case
- Creates "Billing Reconciliation" task
- Finance lead must review and justify hours
- If hours excessive: client demands write-down (budget impact)
- **Narrative:** "Client for Case Gamma is questioning the $45,000 in research hours. Finance is reviewing."
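
Weighted outcomes such as the court-ruling split above (40% / 30% / 20% / 10%) can be resolved with a cumulative-weight pick. This is a minimal sketch: `WeightedOutcome` and `pickWeighted` are hypothetical names, and the random source is passed in as a function so a seeded PRNG can be supplied to keep replays deterministic.

```typescript
// Sketch only: names are hypothetical.
interface WeightedOutcome<T> {
  value: T;
  weight: number;
}

// `rand` must return a float in [0, 1); pass a seeded PRNG to keep replays deterministic.
function pickWeighted<T>(outcomes: WeightedOutcome<T>[], rand: () => number): T {
  if (outcomes.length === 0) {
    throw new Error("pickWeighted needs at least one outcome");
  }
  let total = 0;
  for (const outcome of outcomes) {
    total += outcome.weight;
  }
  // Walk the outcomes, subtracting weights until the threshold is used up.
  let threshold = rand() * total;
  for (const outcome of outcomes) {
    threshold -= outcome.weight;
    if (threshold < 0) return outcome.value;
  }
  // Guards against floating-point rounding at the very top of the range.
  return outcomes[outcomes.length - 1].value;
}

type Ruling = "granted" | "granted-in-part" | "denied" | "denied-with-sanctions";

const courtRuling: WeightedOutcome<Ruling>[] = [
  { value: "granted", weight: 40 },
  { value: "granted-in-part", weight: 30 },
  { value: "denied", weight: 20 },
  { value: "denied-with-sanctions", weight: 10 },
];

// Math.random is used here only for a quick demo; it is not replayable.
console.log(pickWeighted(courtRuling, Math.random));
```

The weights do not need to sum to 100, since the function normalizes by the total. The same helper would cover other weighted decisions in these scenarios, such as the fight-or-flee split in Deep Reconnaissance.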

Compliance creates review loops. KYC is a dependency nightmare.

# Fintech Startup: Series B Quarter
## Meta
- **Industry:** Fintech
- **Duration:** 20 minutes
- **Target decisions:** 1600
- **Description:** A fintech startup post-Series B. Launching new products while
navigating regulatory mazes, audit prep, and the ever-present fraud pipeline.
## Key Mechanics
- **KYC Pipeline:** Every new customer triggers: identity verify → document check →
  risk scoring → compliance review → approve/deny/escalate.
  At scale, 50+ KYC applications per scenario phase.
- **Regulatory Review Loops:** Any product change requires: legal review → compliance check →
  regulatory filing → approval (avg 2.3 iterations before approval)
- **Fraud Alerts:** ML model flags transactions. Each flag: investigate → classify →
  action (block/allow/escalate). ~30 alerts per phase.
- **Audit Trail:** Everything generates audit events. Auditor arrives in Phase 3
  and requests documentation for everything.
# Game Studio: Ship the RPG
## Meta
- **Industry:** Game Studio
- **Duration:** 25 minutes
- **Target decisions:** 1800
- **Description:** An indie studio shipping a mid-size RPG. Art pipeline,
sprint cycles, QA hell, and the dreaded launch day.
## Key Mechanics
- **Art Pipeline:** Concept โ†’ Model โ†’ Texture โ†’ Rig โ†’ Animate โ†’ Review.
Each asset is 6 subtasks ร— 50+ assets = 300+ art subtasks.
- **Sprint Cycles:** 2-week sprints within the scenario. Sprint planning,
daily standups (automated check-ins), sprint review, retro.
- **QA Gauntlet:** Every feature goes through: smoke test โ†’ regression โ†’
performance โ†’ compatibility โ†’ certification. Bugs found loop back to dev.
- **Platform Certification:** Console cert is a gatekeeper. Fail โ†’ 30+ ticks
of fix-and-resubmit.
# Open Source Project: v2.0 Release
## Meta
- **Industry:** Open Source
- **Duration:** 15 minutes
- **Target decisions:** 1200
- **Description:** A popular open source project preparing a major version release.
Community PRs, breaking changes, documentation, governance.
## Key Mechanics
- **PR Triage:** Community PRs arrive as events. Each: review โ†’ test โ†’
merge/reject/request-changes. 40+ PRs per scenario.
- **Breaking Change Process:** RFC โ†’ Discussion โ†’ Vote โ†’ Implementation โ†’ Migration guide.
Each breaking change is a 20-decision epic.
- **Community Management:** Angry issue authors, feature requests, security reports,
CoC violations. Each is an interrupt event.
- **Release Engineering:** Branch โ†’ Freeze โ†’ RC1 โ†’ Bug reports โ†’ RC2 โ†’ Final โ†’ Tag โ†’ Publish โ†’ Announce.

The Scenario Engine wraps the existing DeterministicSimulation class. No rewrites, only extensions.

// tools/sandbox/src/scenario-engine.ts
import { DeterministicSimulation } from "./deterministic.js";
import type { SandboxAgent, SandboxTask, SandboxEvent, ACPMessage } from "./types.js";

interface ScenarioDefinition {
  meta: ScenarioMeta;
  phases: ScenarioPhase[];
  epics: EpicTemplate[];
  events: EventTemplate[];
  resources: ResourcePool[];
  scoring: ScoringConfig;
}

class ScenarioEngine {
  private sim: DeterministicSimulation;
  private scenario: ScenarioDefinition;
  private currentPhase: number = 0;
  private activeEpics: Map<string, EpicInstance> = new Map();
  private dag: DependencyGraph = new DependencyGraph();
  private eventScheduler: EventScheduler;
  private resources: ResourceManager;
  private storyState: Map<string, any> = new Map();
  private prng: SeededRandom;
  private decisionCount: number = 0;

  constructor(sim: DeterministicSimulation, scenario: ScenarioDefinition) {
    this.sim = sim;
    this.scenario = scenario;
    this.prng = new SeededRandom(scenario.meta.seed);
    this.eventScheduler = new EventScheduler(scenario.events, this.prng);
    this.resources = new ResourceManager(scenario.resources);
  }

  /** Called BEFORE each sim tick: injects scenario-driven work */
  preTick(): void {
    // 1. Check phase transitions
    this.evaluatePhaseTransition();
    // 2. Expand any newly unlocked epics into tasks
    this.expandUnlockedEpics();
    // 3. Resolve DAG: unblock tasks whose dependencies are met
    this.dag.resolve(this.sim.tasks);
    // 4. Fire scheduled/random events
    const events = this.eventScheduler.tick(
      this.sim.tick,
      this.currentPhase,
      this.scenario.phases[this.currentPhase],
      this.storyState,
    );
    for (const event of events) {
      this.fireEvent(event);
    }
    // 5. Update resource pools
    this.resources.tick(this.sim.agents, this.sim.tasks);
    // 6. Feed work into simulation via processOrder or direct task injection
    this.injectPendingWork();
  }

  /** Called AFTER each sim tick: collects metrics, counts decisions */
  postTick(): void {
    this.decisionCount += this.countNewDecisions();
    this.calibratePacing();
  }

  /** Adjusts event frequency to hit target decision rate */
  private calibratePacing(): void {
    const targetRate = this.scenario.meta.targetDecisions / this.estimateTotalTicks();
    const actualRate = this.decisionCount / this.sim.tick;
    if (actualRate < targetRate * 0.8) {
      this.eventScheduler.increaseFrequency(1.2);
    } else if (actualRate > targetRate * 1.2) {
      this.eventScheduler.decreaseFrequency(0.8);
    }
  }
}
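
To make the pacing rule concrete, here is a minimal standalone sketch of the same ±20% band check as a pure function. `pacingAction` and `PacingAction` are hypothetical names introduced for illustration; the thresholds (0.8 and 1.2) come from `calibratePacing` above, and the guard against a zero tick count is an added assumption.

```typescript
type PacingAction = "increase" | "decrease" | "hold";

// Compares the observed decision rate with the target rate and reports
// which way event frequency should move.
function pacingAction(
  targetDecisions: number,
  estimatedTotalTicks: number,
  decisionCount: number,
  tick: number,
): PacingAction {
  // Assumption: with no elapsed ticks there is no rate to compare yet.
  if (tick <= 0 || estimatedTotalTicks <= 0) return "hold";
  const targetRate = targetDecisions / estimatedTotalTicks;
  const actualRate = decisionCount / tick;
  if (actualRate < targetRate * 0.8) return "increase";
  if (actualRate > targetRate * 1.2) return "decrease";
  return "hold";
}

// Example: 1600 target decisions over an assumed 1500 ticks (about 1.07 per tick).
console.log(pacingAction(1600, 1500, 200, 300)); // "increase" (0.67 per tick)
console.log(pacingAction(1600, 1500, 330, 300)); // "hold"     (1.10 per tick)
console.log(pacingAction(1600, 1500, 450, 300)); // "decrease" (1.50 per tick)
```

Because the check runs on every `postTick`, a multiplier applied each time may compound quickly, depending on how `EventScheduler` implements `increaseFrequency`. Clamping the cumulative multiplier is one option worth considering.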

New types that complement (not replace) existing ones:

// Additions to types.ts

// Epic: a large body of work that decomposes into tasks
interface SandboxEpic {
  id: string;
  title: string;
  phase: string;
  domain: string[];
  priority: SandboxTask["priority"];
  status: "locked" | "active" | "done";
  taskIds: string[];
  completionPct: number;
  unlockedAt?: number; // tick when epic became active
  completedAt?: number; // tick when epic finished
}

// Extended task with dependency info
interface SandboxTaskV2 extends SandboxTask {
  epicId?: string; // parent epic
  parentTaskId?: string; // parent task (for subtasks)
  dependsOn?: string[]; // task IDs that must complete first
  triggers?: TaskTrigger[]; // what happens when this task completes
  resourceCost?: Record<string, number>; // resources consumed
  reviewLoop?: {
    // review iteration tracking
    maxIterations: number;
    currentIteration: number;
    weights: number[]; // probability distribution for pass/revise/reject
  };
}

// New ACP message types
type ACPMessageTypeV2 =
  | ACPMessage["type"]
  | "intel_report" // reef war: scout reports
  | "resource_alert" // resource pool running low
  | "event_alert" // random event notification
  | "phase_change" // scenario phase transition
  | "decision_request" // requires manager decision
  | "battle_report"; // reef war: combat results

// Extended event with scenario metadata
interface SandboxEventV2 extends SandboxEvent {
  scenarioEvent?: string; // which scenario event template triggered this
  phaseId?: string; // which phase we're in
  epicId?: string; // which epic this relates to
  visualEffect?: string; // hint to dashboard for special rendering
}
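
The `reviewLoop.weights` field describes a probability distribution over review outcomes. A minimal sketch of sampling from it follows, assuming the weights are ordered `[pass, revise, reject]` as the comment suggests and that `roll` comes from the seeded PRNG. `ReviewOutcome` and `pickOutcome` are hypothetical names, not part of the existing types.

```typescript
type ReviewOutcome = "pass" | "revise" | "reject";

// Assumed ordering, matching the "pass/revise/reject" comment on `weights`.
const REVIEW_OUTCOMES: ReviewOutcome[] = ["pass", "revise", "reject"];

// Weighted pick: `roll` is a uniform number in [0, 1), e.g. from the seeded PRNG.
// Weights need not sum to 1; they are normalized by their total.
function pickOutcome(weights: number[], roll: number): ReviewOutcome {
  const total = weights.reduce((sum, w) => sum + w, 0);
  let remaining = roll * total;
  for (let i = 0; i < REVIEW_OUTCOMES.length; i++) {
    remaining -= weights[i] ?? 0;
    if (remaining < 0) return REVIEW_OUTCOMES[i];
  }
  // Fallback for rounding at the upper edge or an all-zero weight list.
  return REVIEW_OUTCOMES[REVIEW_OUTCOMES.length - 1];
}

const weights = [0.5, 0.3, 0.2];
console.log(pickOutcome(weights, 0.1)); // "pass"
console.log(pickOutcome(weights, 0.6)); // "revise"
console.log(pickOutcome(weights, 0.95)); // "reject"
```

Taking the roll as a parameter rather than calling a random source inside the function keeps the decision logic pure and easy to replay from a recorded seed.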

The key integration point is the existing tick loop, which the Scenario Engine hooks into:

// Modified DeterministicSimulation.runTick()
async runTick(): Promise<void> {
  this.tick++;

  // ═══ NEW: Scenario pre-tick ═══
  if (this.scenarioEngine) {
    this.scenarioEngine.preTick();
  }

  // ═══ EXISTING: Process agents by level (top-down) ═══
  const sortedAgents = [...this.agents].sort((a, b) => b.level - a.level);
  for (const agent of sortedAgents) {
    if (agent.role === 'coo' || agent.level >= 9) {
      this.tickCOO(agent);
      this.tickUnblock(agent);
    } else if (agent.role === 'lead') {
      this.tickLead(agent);
      this.tickUnblock(agent);
    } else {
      this.tickWorker(agent);
    }
  }

  // ═══ NEW: Scenario post-tick ═══
  if (this.scenarioEngine) {
    this.scenarioEngine.postTick();
  }

  // ═══ EXISTING: Metrics ═══
  this.metricsHistory.push({ ... });
}

Resolves task dependencies each tick:

class DependencyGraph {
  private edges: Map<string, string[]> = new Map(); // taskId → [dependsOn...]

  addDependency(taskId: string, dependsOn: string): void {
    const deps = this.edges.get(taskId) || [];
    deps.push(dependsOn);
    this.edges.set(taskId, deps);
  }

  /** Check all blocked tasks. Unblock any whose dependencies are met. */
  resolve(tasks: SandboxTask[]): string[] {
    const taskMap = new Map(tasks.map((t) => [t.id, t]));
    const unblocked: string[] = [];
    for (const [taskId, deps] of this.edges) {
      const task = taskMap.get(taskId);
      if (!task || task.status !== "blocked") continue;
      if (task.blockedReason !== "Dependency not ready") continue;
      const allMet = deps.every((depId) => {
        const dep = taskMap.get(depId);
        return dep?.status === "done";
      });
      if (allMet) {
        task.status = "assigned";
        task.blockedReason = undefined;
        unblocked.push(taskId);
      }
    }
    return unblocked;
  }

  /** Topological sort for visualization (shows critical path) */
  criticalPath(tasks: SandboxTask[]): string[] {
    // ... standard topological sort with longest-path calculation
  }
}
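
The elided `criticalPath` body could be approached along these lines. This is a hedged sketch, not the engine's implementation: it uses Kahn's algorithm for the topological order and tracks the longest chain as it goes. It assumes every task has unit cost (the design does not specify task durations) and takes the same `taskId → [dependsOn...]` edge map as input. `criticalPathSketch` is a hypothetical free function.

```typescript
// Input uses the same direction as DependencyGraph.edges: taskId -> ids it depends on.
function criticalPathSketch(edges: Map<string, string[]>): string[] {
  const nodes = new Set<string>();
  const dependents = new Map<string, string[]>(); // dependency -> tasks waiting on it
  const indegree = new Map<string, number>(); // unmet dependency count per task
  for (const [task, deps] of edges) {
    nodes.add(task);
    indegree.set(task, (indegree.get(task) ?? 0) + deps.length);
    for (const dep of deps) {
      nodes.add(dep);
      dependents.set(dep, [...(dependents.get(dep) ?? []), task]);
    }
  }

  // Kahn's algorithm: start from tasks with no dependencies.
  const queue = [...nodes].filter((n) => (indegree.get(n) ?? 0) === 0);
  const chainLength = new Map<string, number>(); // longest chain ending at each node
  const previous = new Map<string, string>(); // predecessor on that longest chain
  for (const n of queue) chainLength.set(n, 1);

  let processed = 0;
  let end: string | undefined;
  while (queue.length > 0) {
    const node = queue.shift()!;
    processed++;
    const length = chainLength.get(node)!;
    if (end === undefined || length > chainLength.get(end)!) end = node;
    for (const next of dependents.get(node) ?? []) {
      if (length + 1 > (chainLength.get(next) ?? 0)) {
        chainLength.set(next, length + 1);
        previous.set(next, node);
      }
      const remaining = indegree.get(next)! - 1;
      indegree.set(next, remaining);
      if (remaining === 0) queue.push(next);
    }
  }
  if (processed !== nodes.size) throw new Error("Dependency cycle detected");

  // Walk back from the end of the longest chain to its start.
  const path: string[] = [];
  let cursor: string | undefined = end;
  while (cursor !== undefined) {
    path.unshift(cursor);
    cursor = previous.get(cursor);
  }
  return path;
}

const sampleEdges = new Map<string, string[]>([
  ["b", ["a"]],
  ["c", ["b"]],
  ["d", ["a"]],
  ["e", ["c", "d"]],
]);
console.log(criticalPathSketch(sampleEdges)); // ["a", "b", "c", "e"]
```

The cycle check doubles as validation: a scenario whose dependencies form a loop would otherwise leave tasks blocked forever.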

For the Reef War scenario, two DeterministicSimulation instances run in parallel:

class AdversarialScenarioEngine {
  private allianceSim: DeterministicSimulation;
  private dominionSim: DeterministicSimulation;
  private allianceScenario: ScenarioEngine;
  private dominionScenario: ScenarioEngine;
  private sharedWorld: WorldState; // territory map, shared events

  // Note: this assumes neither sim has its own scenarioEngine attached.
  // Otherwise the hooked runTick() shown earlier would call
  // preTick()/postTick() a second time within the same tick.
  async runTick(): Promise<void> {
    // Both sides pre-tick (generate work based on shared world state)
    this.allianceScenario.preTick();
    this.dominionScenario.preTick();
    // Both sides process (agents work, make decisions)
    await this.allianceSim.runTick();
    await this.dominionSim.runTick();
    // Resolve conflicts (both sides claiming same territory, combat outcomes)
    this.resolveConflicts();
    // Update shared world state
    this.sharedWorld.update();
    // Both sides post-tick (react to new world state)
    this.allianceScenario.postTick();
    this.dominionScenario.postTick();
  }

  private resolveConflicts(): void {
    // Combat resolution: compare force strength, terrain, morale
    // Territory changes: update map based on combat outcomes
    // Resource effects: supply raids, blockades
    // Intel: what each side learns about the other
  }
}

Before a scenario starts, users see a selection screen:

┌──────────────────────────────────────────────────────────────────┐
│  🌊 BikiniBottom Scenarios                                       │
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐            │
│  │ 🤖 AI Dev    │  │ 🐠⚔️ Reef War │  │ ⚖️ Legal Tech │            │
│  │    Agency    │  │              │  │    Firm      │            │
│  │              │  │ MOST POPULAR │  │              │            │
│  │ 20 min       │  │ 30 min       │  │ 25 min       │            │
│  │ ~1800 dec    │  │ ~2500 dec    │  │ ~2000 dec    │            │
│  │ ★★★☆☆        │  │ ★★★★★        │  │ ★★★★☆        │            │
│  └──────────────┘  └──────────────┘  └──────────────┘            │
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐            │
│  │ 💳 Fintech   │  │ 🎮 Game      │  │ 🌐 Open      │            │
│  │    Startup   │  │    Studio    │  │    Source    │            │
│  │ 20 min       │  │ 25 min       │  │ 15 min       │            │
│  │ ~1600 dec    │  │ ~1800 dec    │  │ ~1200 dec    │            │
│  └──────────────┘  └──────────────┘  └──────────────┘            │
│                                                                  │
│  Difficulty: [Easy] [Normal] [■ Hard] [Chaos]   Seed: [auto]     │
│                                                                  │
│                        [▶ Start Scenario]                        │
└──────────────────────────────────────────────────────────────────┘

When a phase transitions, a cinematic banner sweeps across the dashboard:

โ•”โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•—
โ•‘ โ•‘
โ•‘ โš”๏ธ PHASE 3: FIRST BLOOD โ•‘
โ•‘ โ•‘
โ•‘ Scout reports confirmed: enemy positions at D4, E3 โ•‘
โ•‘ General Mantis Shrimp is mobilizing strike teams โ•‘
โ•‘ โ•‘
โ•‘ New objectives unlocked: โ•‘
โ•‘ โ€ข Border Skirmish Alpha โ•‘
โ•‘ โ€ข Emergency Resupply โ•‘
โ•‘ โ€ข Propaganda Campaign โ•‘
โ•‘ โ•‘
โ•šโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•

The org chart becomes the primary visual for scenarios:

Normal state: Agents are nodes, connections show reporting lines. Idle agents are dim, busy agents glow.

During active work:

  • Messages fly along connection lines as animated particles
  • Agents pulse when making decisions (brighter = more critical)
  • Blocked agents show a red border with a pulsing lock icon
  • Cross-department messages arc across the chart in distinct colors

During Reef War:

  • Split screen: two org charts side by side
  • Territory map in the center
  • Attack messages show as red arrows between the two orgs
  • Intel messages show as yellow dashed lines
  • When a unit is lost, its node fades and cracks
  • When a battle is won, the victor’s section glows gold

Ambient animations:

  • Scouts have a radar-sweep animation on their nodes
  • Supply runners have tiny package icons traveling along their connections
  • Engineers have a building animation (tiny coral growing)
  • Medics have a pulse/heartbeat animation
  • Diplomats have a handshake animation when negotiating

A real-time updating hex grid:

State: Phase 3 - Skirmish Active

🪸 🪸 🪸 ⚔️ 🌫️ 🌫️
🪸 🪸 ⚔️ 🔍 🌫️ 🌿
🪸 🏰 🌊 🌊 🌿 🌿
🌊 🌊 💥 🌊 🌿 🌿
🌫️ 🌊 🌿 🌿 🏰 🌿
🌫️ 🌫️ 🌿 🌿 🌿 🌿

Legend:
🪸 Coral territory    🌿 Kelp territory     🌫️ Fog of war
🏰 Fortification      ⚔️ Skirmish active    💥 Battle!
🔍 Scout present      🌊 Neutral water

Tiles animate on state change:

  • Fog clears with a dissolve effect when scouted
  • Battles show explosion particles
  • Fortifications build up brick-by-brick
  • Territory captures sweep the new color across the tile

A scrolling timeline at the bottom of the dashboard showing narrative events:

TICK 156  🐍 Commander Eel: "Intel confirmed: enemy forward base at D4."
TICK 158  🐚 Admiral Nautilus: "Prepare Strike Group Alpha. Defensive posture."
TICK 161  ⚔️ SKIRMISH at C4: 3 Warrior Crabs vs 2 Kelp Sentinels
TICK 163  🦐 General Mantis Shrimp: "Push forward! They're retreating!"
TICK 165  🚨 Supply raid! 40 kelp rations lost at B4.
TICK 167  🐋 Quartermaster Whale: "Rerouting supplies through A3. ETA 8 ticks."
TICK 170  ✅ SKIRMISH RESOLVED: Coral Reef VICTORY at C4 (+1 territory)
TICK 172  🐬 Ambassador Dolphin: "Deep Reef Confederation is open to talks."

┌──────────────────────────────────────┐
│ 🪸 Coral Reef Alliance Resources     │
│                                      │
│ Kelp Rations    ███████░░░  73%      │
│ Shell Ammo      █████░░░░░  48%      │
│ Coral Material  ███████░░░  68%      │
│ Morale          █████████░  87%      │
│ Intel Points    ███░░░░░░░  31       │
│ Alliance        ██████░░░░  56%      │
│                                      │
│ Burn Rate: -4.5 rations/tick         │
│ Resupply ETA: 6 ticks                │
└──────────────────────────────────────┘
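
The bars in the resource panel can be produced by a small helper like the one sketched below. `renderBar` and `ticksUntilEmpty` are hypothetical names; the 10-cell width matches the mock-up, and the absolute stock of 90 rations in the example is an assumed value, since the panel only shows percentages.

```typescript
// Renders a fixed-width text bar, rounding to the nearest whole cell.
function renderBar(percent: number, width = 10): string {
  const clamped = Math.min(100, Math.max(0, percent));
  const filled = Math.round((clamped / 100) * width);
  return "█".repeat(filled) + "░".repeat(width - filled);
}

// How many ticks a pool lasts at a constant burn rate (no resupply).
function ticksUntilEmpty(amount: number, burnPerTick: number): number {
  if (burnPerTick <= 0) return Infinity;
  return Math.ceil(amount / burnPerTick);
}

console.log(renderBar(73)); // ███████░░░
console.log(renderBar(48)); // █████░░░░░
console.log(ticksUntilEmpty(90, 4.5)); // 20 (assumed stock of 90 rations)
```

Comparing `ticksUntilEmpty` against the resupply ETA is one simple way to decide when a `resource_alert` message should fire.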

A high-density view of every decision as it happens:

#1847 🐡 Captain Barracuda → ATTACK sector D4 (priority override)
#1848 🦀 Warrior Crab-2 → ACKNOWLEDGE attack order
#1849 🐋 Quartermaster → APPROVE ammo requisition (30 shells)
#1850 🐙 Octopus Agent → REPORT enemy reserves low (confidence: 72%)
#1851 🐍 Commander Eel → VERIFY intel (cross-reference with Scout Alpha-1)
#1852 🐚 Admiral Nautilus → DECISION: proceed with flanking maneuver
#1853 🦐 General Mantis → DEPLOY reserve unit to C3
#1854 🌺 Dr. Anemone → TRIAGE: 2 wounded, 1 critical

Each decision is a row with: number, agent avatar, action verb, and details. Color-coded by type: green = progress, red = block/escalate, blue = command, yellow = intel.


Goal: Scenario Engine can load SCENARIO.md and feed work into the existing simulation.

Deliverables:

  • SCENARIO.md parser (markdown → ScenarioDefinition)
  • ScenarioEngine class with preTick/postTick hooks
  • Phase Manager (tick-based transitions only)
  • Task Generator (expand epic templates into tasks/subtasks)
  • Integration with DeterministicSimulation (hook into runTick)
  • One working scenario: “AI Dev Agency: Simple Sprint” (500+ decisions)

Key files:

  • tools/sandbox/src/scenario-engine.ts: core engine
  • tools/sandbox/src/scenario-parser.ts: SCENARIO.md parser
  • tools/sandbox/scenarios/ai-dev-agency-simple.md: first scenario

Goal: Dependencies, events, and resource contention working.

Deliverables:

  • DependencyGraph (DAG resolver with topological sort)
  • EventScheduler (seeded PRNG, cooldowns, chaining)
  • ResourceManager (pools, burn rates, alerts)
  • Decision Evaluator (weighted outcomes for reviews, approvals)
  • Review loops (reject → rework → resubmit cycle)
  • Cross-department triggers
  • Resource contention resolution
  • Upgraded scenario: “AI Dev Agency: Full Sprint” (1500+ decisions)

Key files:

  • tools/sandbox/src/dag.ts: dependency graph
  • tools/sandbox/src/events.ts: event scheduler
  • tools/sandbox/src/resources.ts: resource management
  • tools/sandbox/src/decisions.ts: weighted decision evaluation

Goal: Scenarios are visually compelling on the dashboard.

Deliverables:

  • Phase banner component
  • Enhanced org chart animations (message particles, glow states)
  • Event timeline component
  • Resource dashboard component
  • Scenario selector screen
  • Score card (end-of-scenario summary with shareable card)
  • Decision feed component

Goal: The Ocean Reef War scenario runs with two competing orgs.

Deliverables:

  • AdversarialScenarioEngine (dual simulation runner)
  • WorldState (shared territory map)
  • Combat resolution system
  • Fog of war mechanic
  • Territory map visualization
  • Split-screen org charts
  • Full Reef War scenario: “Battle for the Abyssal Trench” (2500+ decisions)
  • Kelp Forest Dominion ORG.md (auto-generated mirror)

Goal: All six scenarios playable. Polish and sharing.

Deliverables:

  • Legal Tech scenario
  • Fintech Startup scenario
  • Game Studio scenario
  • Open Source Project scenario
  • Difficulty system (easy/normal/hard/chaos)
  • Seed sharing (“try seed #42069”)
  • Twitter card generation
  • Achievement system
  • Scenario completion statistics (leaderboard?)

Goal: Users create and share their own scenarios.

Deliverables:

  • Scenario editor (web UI for building SCENARIO.md)
  • Scenario gallery (community-shared scenarios)
  • Scenario validation (lint + dry-run to catch errors)
  • Custom ORG.md + SCENARIO.md pairing
  • Scenario remix (fork + modify)

| Term | Definition |
| --- | --- |
| Epic | Large body of work containing multiple tasks. Tied to a phase. |
| Phase | Major stage of a scenario with its own pacing, events, and epics. |
| DAG | Directed Acyclic Graph: the dependency structure between tasks. |
| Decision | Any action taken by an agent: assign, ack, progress, review, complete, escalate, etc. |
| Tick | One simulation cycle. ~600–1000 ms of wall-clock time. |
| Seed | PRNG seed for reproducible runs. Same seed + same scenario = same outcome. |
| Friction | Organizational overhead: dependencies, reviews, contention, events. |
| Fog of War | Information asymmetry: agents make decisions with incomplete information. |
| Story State | Key-value map tracking narrative branch decisions for conditional content. |
| Adversarial Mode | Two simulations running against each other with a shared world state. |

scenario      := meta phases epics events? resources? scoring?
meta          := "## Meta" newline (meta-field newline)*
meta-field    := "- **" key ":** " value
phases        := "## Phases" newline (phase newline)*
phase         := "### " phase-name " (" tick-range ")" newline
                 prose newline
                 (phase-field newline)*
                 ("#### " sub-section newline prose newline)*
phase-field   := "- **" key ":** " value
epics         := "## Epics" newline (epic newline)*
epic          := "### " epic-name newline
                 (epic-field newline)*
                 ("#### Task Templates" newline (task-template newline)*)?
epic-field    := "- **" key ":** " value
task-template := number ". **" task-name "** [" domain "]" newline
                 (" - " subtask-or-field newline)*
events        := "## Events" newline (event newline)*
event         := "### " event-id newline
                 (event-field newline)*
event-field   := "- **" key ":** " value
resources     := "## Resources" newline (resource newline)*
resource      := "### " resource-name newline
                 (resource-field newline)*
scoring       := "## Scoring" newline prose
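
As an illustration of how the `meta` and `meta-field` rules might be parsed, here is a minimal sketch. It is not the planned `scenario-parser.ts`: `parseMeta` is a hypothetical function, it handles only the `## Meta` section, and it treats wrapped lines as continuations of the previous field, an assumption based on how descriptions are wrapped in the example scenarios.

```typescript
// Matches a meta-field line: "- **" key ":** " value
const META_FIELD = /^- \*\*(.+?):\*\*\s+(.*)$/;

function parseMeta(markdown: string): Record<string, string> {
  const meta: Record<string, string> = {};
  let inMeta = false;
  let lastKey: string | undefined;
  for (const rawLine of markdown.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("#")) {
      // Any heading ends the current section; only "## Meta" opens ours.
      inMeta = line === "## Meta";
      lastKey = undefined;
      continue;
    }
    if (!inMeta || line === "") continue;
    const match = META_FIELD.exec(line);
    if (match) {
      lastKey = match[1];
      meta[lastKey] = match[2];
    } else if (lastKey !== undefined) {
      // Assumed convention: an unmarked line continues the previous value.
      meta[lastKey] += " " + line;
    }
  }
  return meta;
}

const sample = [
  "# Fintech Startup: Series B Quarter",
  "## Meta",
  "- **Industry:** Fintech",
  "- **Duration:** 20 minutes",
  "- **Target decisions:** 1600",
  "- **Description:** A fintech startup post-Series B. Launching new products while",
  "  navigating regulatory mazes, audit prep, and the ever-present fraud pipeline.",
  "## Key Mechanics",
  "- **KYC Pipeline:** not part of meta",
].join("\n");

const parsed = parseMeta(sample);
console.log(parsed["Industry"], Number(parsed["Target decisions"])); // Fintech 1600
```

Keeping values as strings at this stage and converting them (for example `Target decisions` to a number) in a separate validation pass makes it easier to report friendly errors to non-programmers writing scenarios.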

Every decision in the simulation falls into one of these categories:

| Category | Examples | Avg per occurrence | Visual signal |
| --- | --- | --- | --- |
| Command | Delegate task, approve plan, allocate resources | 1 | Blue pulse |
| Execution | Start work, make progress, submit deliverable | 1–3 | Green pulse |
| Review | Approve, reject, request changes | 1 | Yellow pulse |
| Communication | Ack, progress report, escalation | 1 | Message particle |
| Hiring | Spawn agent, assign mentor, first task | 3–4 | New node animation |
| Contention | Resource conflict, priority override, deadlock | 2–4 | Red/orange pulse |
| Event Response | Triage interrupt, reassign, emergency task | 5–15 | Alert animation |
| Strategic | Phase transition, scope cut, strategy choice | 1–3 | Phase banner |
| Diplomatic | Negotiate, offer tribute, form alliance | 3–6 | Handshake animation |
| Combat | Engage, retreat, flank, resupply (Reef War only) | 2–4 | Sword animation |

The Scenario Engine transforms BikiniBottom from a toy demo into something people actually want to watch. It’s the difference between a sandbox with three blocks and SimCity. Build the engine, and the scenarios write themselves.