Controller & agents

The coordination protocol

Controller and agents speak loadr.coordination.v1 — a single bidirectional gRPC stream per agent (see ADR-003):

agent ──▶ Register{agent_id, name, protocol_version, cores, labels}
      ◀── Registered{controller_id}
      ◀── Assignment{run_id, plan_yaml, partition i/n, data files}
      ◀── Start{run_id, start_unix_ms}          # synchronized barrier
      ──▶ MetricsBatch{run_id, delta}           # every second
      ──▶ Heartbeat{active_vus, run_state}      # every 2 seconds
      ◀── Control{stop|kill|pause|resume|scale}
      ──▶ RunEvent{started|finished|failed, summary}

The protocol is versioned; an agent with an incompatible protocol_version is rejected at registration.

TLS / mTLS

loadr controller --bind 0.0.0.0:7625 \
  --tls-cert server.pem --tls-key server-key.pem \
  --tls-client-ca clients-ca.pem          # require client certs (mTLS)

loadr agent --join ctrl:7625 \
  --tls-ca ca.pem \
  --tls-cert agent.pem --tls-key agent-key.pem

Without flags the channel is plaintext — fine on a private network, not on the internet.

Failure handling

  • Heartbeats every 2 s; an agent silent past the liveness window (default 6 s) is marked unhealthy.
  • Reconnection: agents reconnect with jittered exponential backoff and re-register, resuming their identity.
  • Agent loss during a run is policy-driven per submission:
    • continue (default) — remaining agents keep their share; the lost agent's portion of the load simply stops (the summary notes the reduced fleet).
    • abort — the controller stops the run everywhere.

Data files

CSV files, JS modules, proto files and body files referenced by the test are shipped inside the assignment and materialized in the agent's working directory. Paths are sanitized — anything containing .. or absolute paths is rejected.

Operating notes

  • Agents are stateless; scale them with your orchestrator (kubectl scale deploy/loadr-agent --replicas=20).
  • One controller handles many sequential/concurrent runs; each run records its agent set at submission time.
  • The web UI on the controller shows the fleet (health, VUs, labels, last heartbeat) and every run's live metrics.