VOL · 03

KAOS — the Kubernetes Agentic Operator Substrate

What goes into agentic cluster management — and why you don't need KAOS to start.

Chapters 5

Stub. The chapters that exist are stubs too.

KAOS is the Kubernetes Agentic Operator Substrate — the layer agentic operators live in and act through, and the operator that ships with the Agentic pricing tier.

The booklet is structured as a progression from “you don’t need KAOS” to “you actually want KAOS, here’s why”, because that’s the order in which the value becomes apparent. KAOS is not a starting point; it’s a destination you reach after you’ve earned operational experience with the cluster yourself.

Reading order:

  1. You don’t need KAOS! Run kubectl yourself; run an agentic CLI against the cluster yourself. That’s how experience is earned. KAOS is what you reach for after you’ve done that and want to codify and share what you learned.
  2. How far one prompt takes you. A worked example: a single, generic prompt run against an already-authenticated kubectl produces a competent first-line triage report. The bar a frontier model clears before any skill or playbook is written is already very high, and that’s the bar everything else has to improve on.
  3. How an agentic skill is formed. The progression of questions an operator (human or model) walks through during incident handling — from “are there any incidents?” to “have I seen this before, and how was it fixed?” — and where each step gets harder.
  4. Guardrails and incident maturity. How KAOS is promoted from advisory to autonomous, one incident class at a time, under per-tenant policy.
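
As a rough illustration of item 4, the sketch below models promotion one incident class at a time under a per-tenant policy. Everything in it is an assumption made for illustration: the names Maturity, TenantPolicy, promote and may_act, the tenant name, and the incident class strings are hypothetical and are not KAOS's actual API or policy format. The booklet names only the two endpoints, advisory and autonomous.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Maturity(IntEnum):
    """Hypothetical maturity levels; only the two endpoints are modelled."""
    ADVISORY = 0    # the agent reports and suggests, a human acts
    AUTONOMOUS = 1  # the agent may act on its own


@dataclass
class TenantPolicy:
    """Hypothetical per-tenant policy: one maturity level per incident class."""
    tenant: str
    levels: dict[str, Maturity] = field(default_factory=dict)

    def level_for(self, incident_class: str) -> Maturity:
        # Incident classes that were never promoted stay advisory.
        return self.levels.get(incident_class, Maturity.ADVISORY)

    def promote(self, incident_class: str) -> None:
        # Promotion touches exactly one incident class; all others are unchanged.
        self.levels[incident_class] = Maturity.AUTONOMOUS

    def may_act(self, incident_class: str) -> bool:
        return self.level_for(incident_class) == Maturity.AUTONOMOUS


# Illustrative usage with made-up tenant and incident class names.
policy = TenantPolicy(tenant="example-tenant")
policy.promote("pod-crashloop")
crashloop_allowed = policy.may_act("pod-crashloop")
node_allowed = policy.may_act("node-not-ready")
```

The design choice worth noting is the default: an incident class that has never been explicitly promoted falls back to advisory, so autonomy is always opted into per class and per tenant rather than inherited.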

A persistent theme through the booklet is the parallel between automating cluster operations and choosing strong GitOps primitives underneath them. The Kubenix booklet covers that side of the story.

There’s also an appendix-style chapter, “What KAOS sees, and what it doesn’t”, covering the training loop and the privacy boundary. It’s currently a brainstorm parked in the booklet; most of it belongs in the licence terms when those exist.

  1. 01
    You don't need KAOS!
    Why running the cluster yourself first is the right way to start.
    2 min
  2. 02
    How far one prompt takes you
    What a frontier model with kubectl access already does, before you write a single line of skill or playbook.
    3 min
  3. 03
    How an agentic skill is formed
    From 'are there any incidents?' to 'should this go through GitOps?': the progression of operating experience.
    2 min
  4. 04
    Guardrails and incident maturity
    How KAOS is promoted from advisory to autonomous, one incident class at a time.
    1 min
  5. 05
    What KAOS sees, and what it doesn't
    Training on incidents, the privacy boundary, and the RBAC line we draw across both.
    2 min