CHAPTER 01 · 2 MIN READ

You don't need KAOS!

Why running the cluster yourself first is the right way to start.

Author: Jōkamachi Systems

Stub.

The motivation behind KAOS is to derive and automate collective operating experience. But “collective experience” is built one incident at a time, and every individual contribution starts the same way: someone — a human, or a robot — runs kubectl against the cluster, or receives an alert notification, notices something is off, and fixes it.

Real experience is earned through trial. You can run those kubectl commands yourself, or you can hand them to an agentic CLI with access to the cluster; either path produces experience. Each path leaves a different gap, though. Humans build up organisational knowledge, but take it with them when they leave the organisation. Standalone agentic CLIs don’t retain knowledge across sessions without extensive prompting. And human operators of agentic CLIs tend to overlook the solution entirely, because the agent did the fixing and people only learn through pain.

That gap — between “I fixed it” and “we now know how to fix it” — is what KAOS is for.

Becoming a Kubernetes expert is time-consuming, and not every team has the runway for it. Skipping some understanding (taking on what we’d call knowledge debt) is sometimes the right choice, and KAOS is one of the options in that case. Where there’s a proven path through an incident class, letting an agent walk it is safe; where there isn’t, the decision to hand the situation to a human operator comes down to gut and metrics, not heroism. The booklet walks the long way around for the team that wants it; going back to learn the hard way is always available later.