Understanding Context Rot in Claude Code

Explore the phenomenon of context rot in Claude Code, its implications for performance, and strategies to manage session context effectively.

Claude Code can sometimes seem to lose its sharpness during conversations, resembling a colleague who has had a few too many drinks—apologizing, making mistakes, and repeating the cycle. You might wonder if it’s having a bad day or if you haven’t communicated clearly enough.

However, Thariq Shihipar from Anthropic recently addressed this in a blog post, explaining that it’s not a bug but rather a phenomenon called Context Rot. This means that as a conversation progresses, the context within a session deteriorates, affecting performance.

What is Context?

Context refers to the total amount of information that Claude can “remember” during a session, including your inputs, its responses, the documents it has read, and the results from tools it has used. Imagine it as a table in Claude’s mind, which currently has a capacity of 1 million tokens—equivalent to about 600,000 to 700,000 Chinese characters, or roughly three long novels.
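As a back-of-the-envelope illustration, you can estimate how much of the window a body of text consumes from its character count. The 4-characters-per-token ratio for English and the 1-million-token window below are rough assumptions for the sketch, not exact figures:

```python
# Rough context-budget estimate. The chars-per-token ratio and window
# size are illustrative assumptions, not exact values.
CONTEXT_WINDOW_TOKENS = 1_000_000
CHARS_PER_TOKEN_EN = 4  # very rough average for English text

def estimate_tokens(text: str) -> int:
    """Crude token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN_EN

def window_fraction_used(texts: list[str]) -> float:
    """Fraction of the context window a set of texts would occupy."""
    used = sum(estimate_tokens(t) for t in texts)
    return used / CONTEXT_WINDOW_TOKENS

docs = ["x" * 400_000, "y" * 1_200_000]  # ~100k + ~300k tokens
print(f"{window_fraction_used(docs):.0%}")  # → 40%
```

Even a couple of large files can quietly eat a large share of the budget, which is why the degradation described next creeps up unnoticed.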

While this sounds impressive, the reality is that as context length increases, model performance degrades. Attention gets divided among more tokens, and older, irrelevant content begins to distract from the current task. This gradual decline in performance is crucial to understand; it’s not an immediate crash but a slow slide that can go unnoticed.

Why Does Context Rot Happen?

Anthropic likens this to human memory limitations. Just like humans have a finite working memory, large language models (LLMs) have an “attention budget” that they draw upon when processing large volumes of context. A larger table doesn’t necessarily mean you can keep track of more items; in fact, it can complicate things further.

The architecture of Transformers, the backbone of these models, requires each token to relate to every other token in the window, resulting in a relationship count that is the square of the number of tokens. For example:

  • 10 tokens → 100 relationships
  • 1,000 tokens → 1 million relationships
  • 1 million tokens → 1 trillion relationships
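The quadratic scaling above can be sketched in a few lines; the relationship count is simply the square of the token count:

```python
def attention_pairs(n_tokens: int) -> int:
    # In a Transformer, every token attends to every token in the
    # window, so the relationship count grows as n^2.
    return n_tokens ** 2

for n in (10, 1_000, 1_000_000):
    # 10 → 100, 1,000 → 1,000,000, 1,000,000 → 1,000,000,000,000
    print(f"{n:>9,} tokens -> {attention_pairs(n):,} relationships")
```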

Additionally, models were trained mostly on shorter dialogues and code snippets, which are far more common in training data than long ones. Thus, once context exceeds a certain length, performance drops sharply. This is not a bug but an inherent architectural limitation.

Insights from Research

ChromaDB released a technical report in July 2025, which Anthropic referenced as the only external study in their context engineering article. They tested 18 state-of-the-art models, including Claude, GPT, Gemini, and Qwen, revealing several counterintuitive findings:

  1. Good scores on needle-in-a-haystack (NIAH) benchmarks do not equate to effective long-context usage. NIAH has been cited as evidence that long-context problems are solved, but ChromaDB showed this to be a misconception.
  2. The position effect is real: Information at the beginning is more accurate, with the middle being the least reliable.
  3. Older models can outperform newer ones at short context lengths; in some tests, Sonnet 3.5 beat newer Claude models.

ChromaDB concluded that the relevance of information in a model’s context is less important than how that information is presented.

Coding, Writing, and Context Rot

Writing code is one of the scenarios most affected by context rot. Writers are not exempt, though: they face similar problems, just to a lesser degree. When Claude reads a lot of material or revises a long document, it likewise becomes less effective.

Why Coding Is Hit Hardest

  1. Tool outputs are bulky. Each tool call can add thousands of tokens to the context, much of which is never needed again.
  2. Programming tasks require multi-step reasoning. Accuracy decays with each additional logical hop, and errors compound.
  3. Old attempts pollute new judgments. If failed debugging attempts stay in the context, the model may reuse the same flawed approach.
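As a sketch of the first point, one mitigation is to keep only the head and tail of a long tool result rather than the full dump. The `trim_tool_output` helper below is hypothetical, not part of Claude Code, which handles truncation internally:

```python
def trim_tool_output(output: str, max_chars: int = 2_000) -> str:
    """Keep only the head and tail of a long tool result.

    Hypothetical helper for illustration: the idea is to stop full
    tool dumps from piling up in the context window.
    """
    if len(output) <= max_chars:
        return output
    half = max_chars // 2
    omitted = len(output) - max_chars
    return output[:half] + f"\n... [{omitted} chars omitted] ...\n" + output[-half:]
```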

Strategies to Manage Context

Thariq outlines five strategies to manage context effectively:

| Option | Command | Purpose |
| --- | --- | --- |
| Continue | Send a new message | Stay in the same session and continue the conversation |
| Rewind | `/rewind` or ESC ESC | Go back to a previous message and discard later ones |
| Clear | `/clear` | Start a new session and bring in necessary information |
| Compact | `/compact` | Compress the current session and continue based on the summary |
| Subagent | Explicit request | Deploy an agent with a fresh context to handle tasks and return only conclusions |

These options represent your judgments about the current context:

  • Choosing Continue means you believe the context is clean.
  • Choosing Rewind indicates recent rounds have polluted it.
  • Choosing Clear suggests it’s time for a fresh start.
  • Choosing Compact shows trust in Claude to summarize effectively.
  • Choosing Subagent means you want to keep the main session clear of clutter.
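These judgments can be caricatured as a toy decision rule. The thresholds in `choose_strategy` below are invented for illustration; in practice the call is yours, not something Claude Code automates:

```python
def choose_strategy(tokens_used: int, window: int = 1_000_000,
                    recent_failures: int = 0, topic_changed: bool = False) -> str:
    """Toy rule mapping session state to one of the five options.

    All thresholds are made-up assumptions for the sketch.
    """
    fill = tokens_used / window
    if topic_changed:
        return "clear"       # fresh task, fresh session
    if recent_failures >= 2:
        return "rewind"      # failed attempts are polluting the context
    if fill > 0.8:
        return "compact"     # near the limit: summarize and continue
    if fill > 0.5:
        return "subagent"    # offload big subtasks, keep the main session lean
    return "continue"        # context still clean, keep going
```

Even as a caricature, it captures the point: each option encodes a different belief about how polluted the current context is.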

Conclusion

Understanding context rot is crucial for optimizing interactions with Claude Code. By applying these strategies, you can enhance performance and maintain clarity in your sessions.
