
AI Agents That Improve Their Own Instructions: The Karpathy Loop for Skill Files

The Karpathy autoresearch loop is now being applied to agent skill files, letting AI agents modify their own instructions, test the result, and keep only what performs better.

April 18, 2026 · 4 min read · By Andres · Updated April 18, 2026

Andrej Karpathy ran an AI coding agent for two days straight and it executed 700 experiments without stopping. What gets less attention is that the same loop is now being pointed at something much closer to home: the instruction files your AI agent actually runs on.

TL;DR: The Karpathy autoresearch loop (an AI agent runs experiments, scores them, and keeps only what works) has been adapted to rewrite agent instruction files instead of training models. A published framework lets an agent modify its own skill file, test the result, and keep or revert the change automatically. You do not write better prompts. The agent tests its way to them.

What's the Karpathy Loop?

Quick version: Andrej Karpathy, co-founder of OpenAI and former Tesla AI lead, published a pattern where an AI coding agent edits a single file, runs an experiment, measures the result against a clear metric, keeps the change if it scored better, reverts if it did not, and repeats. Hundreds of times. The agent does not need to understand why something works. It just needs a scoreboard and permission to keep trying.

The original use case was improving model training code. The agent ran 700 experiments over two days and produced measurably better training runs than the starting point. No human in the loop for the iterations themselves, just the setup and the scoring criteria.
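The ratchet itself is mechanically simple. Here is a minimal sketch in Python; the `mutate` and `evaluate` functions are hypothetical stand-ins for the agent's edit step and the experiment's scoreboard, not anything from Karpathy's actual setup:

```python
import random

def mutate(text: str) -> str:
    # Stand-in for the agent proposing an edit to the file.
    return text + f"\n# tweak {random.randint(0, 999)}"

def evaluate(text: str) -> float:
    # Stand-in for running the experiment and reading the metric.
    return random.random()

def ratchet(text: str, iterations: int) -> str:
    best_score = evaluate(text)
    for _ in range(iterations):
        candidate = mutate(text)
        score = evaluate(candidate)
        if score > best_score:
            # Keep the change: it scored better.
            text, best_score = candidate, score
        # Otherwise the candidate is simply discarded: an implicit revert.
    return text
```

The agent never has to explain a change. The only thing that survives the loop is whatever the scoreboard rewarded.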

Someone Just Pointed It at Agent Instructions

Here is where it gets interesting. Kirill Krainov, a Berlin-based software engineer, published a full framework design on March 25, 2026 that applies the same ratchet to agent skill files, the instruction documents that tell an AI agent how to behave, what tools to use, and how to approach tasks.

The loop works like this: the agent modifies its own instruction file, runs a defined set of test cases against the new version, and scores the result across three dimensions: did it get the right answer, how fast did it get there, and how many tokens did it burn. If the new version scores better, it stays. If it does not, the agent reverts to the previous version and tries something else.

So instead of you rewriting your agent's instructions and hoping they work better, the agent rewrites them itself and proves whether they do.
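A scorer for those three dimensions might look like the following sketch. The weights and the shape of a test result are my assumptions for illustration, not details from the Krainov framework:

```python
from dataclasses import dataclass

@dataclass
class RunResult:
    correct: bool   # did it get the right answer
    seconds: float  # how fast it got there
    tokens: int     # how many tokens it burned

def score(results: list[RunResult],
          w_acc: float = 1.0, w_time: float = 0.2, w_tok: float = 0.2) -> float:
    accuracy = sum(r.correct for r in results) / len(results)
    avg_time = sum(r.seconds for r in results) / len(results)
    avg_tokens = sum(r.tokens for r in results) / len(results)
    # Higher is better: reward accuracy, penalize latency and token spend.
    return w_acc * accuracy - w_time * avg_time - w_tok * (avg_tokens / 1000)
```

A new version of the skill file stays only if its score across the full test set beats the incumbent's.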

Why This Matters If You Run AI Agents

Think of it like a self-tuning engine. You set the performance targets (accuracy, speed, and cost) and the engine adjusts its own timing until it hits them. You do not need to know how engines work. You just need to know what better looks like.

For anyone running AI agents through platforms like OpenClaw, this pattern maps directly. OpenClaw agents already use SKILL.md files as their core instruction set. The autoresearch loop could theoretically improve those instructions through structured test cycles without you touching a single line.

Now for the honest part: this is early. We are talking about a few weeks of community development, one published framework, and independent confirmation that the pattern is spreading. It is not a product you can install today. But the framework is public and reproducible. Anyone with a coding agent and a scoring rubric can run it.

What To Watch For

The gap right now is between "this works in a controlled experiment" and "this runs reliably on your actual agent setup." Two things close that gap:

  1. Standardized scoring. The loop is only as good as its metrics. If your test cases do not capture what better actually means for your use case, the agent optimizes for the wrong thing. Defining good test cases is still a human job.
  2. Guardrails on self-modification. An agent rewriting its own instructions needs boundaries. Without constraints on what it can change, you get an agent that scores well on tests but behaves unpredictably in production. The Krainov framework includes revert-on-failure, but production safety will need more than that.

This is the kind of development that starts quietly and then shows up everywhere. The pattern is mechanical, reproducible, and does not require a new model or a vendor release. It just requires someone to connect a loop that already exists to a file that already exists.

And that has already happened.
