Getting AI to Work in Complex Codebases
September 30, 2025

It seems pretty well-accepted that AI coding tools struggle with real production codebases. The Stanford study on AI’s impact on developer productivity found:
- A lot of the "extra code" shipped by AI tools ends up just reworking the slop that was shipped last week.
- Coding agents are great for new projects or small changes, but in large, established codebases they can often make developers less productive.

The common response is somewhere between the pessimist “this will never work” and the more measured “maybe someday when there are smarter models.”
After several months of tinkering, I’ve found that you can get really far with today’s models if you embrace core context engineering principles.
This isn’t another “10x your productivity” pitch; I tend to be pretty measured when it comes to the AI hype machine. But we’ve stumbled into workflows that leave me with considerable optimism about what’s possible. We’ve gotten Claude Code to handle a 300k-LOC Rust codebase, ship a week’s worth of work in a day, and maintain code quality that passes expert review. The key is a family of techniques I call “frequent intentional compaction” – deliberately structuring how you feed context to the AI throughout the development process.
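To make the idea concrete, here is a minimal sketch of what “frequent intentional compaction” can look like in practice. This is purely illustrative, not our actual tooling: `run_agent` is a hypothetical stand-in for a call to a coding agent, and the phase names are assumptions. The point is structural – each phase starts with a fresh context window and receives only a compact artifact distilled from the previous phase, rather than the full accumulated transcript.

```python
def run_agent(prompt: str) -> str:
    """Hypothetical stand-in for invoking a coding agent with a prompt.

    A real implementation would shell out to an agent (e.g. a CLI tool)
    and return the artifact it produced.
    """
    return f"artifact derived from {len(prompt)} chars of context"


def phased_workflow(task: str) -> dict[str, str]:
    """Run a task in phases, compacting context at each boundary."""
    artifacts: dict[str, str] = {}
    # Phase 1: research. The output is a compact written summary of the
    # relevant parts of the codebase, not the raw exploration transcript.
    artifacts["research"] = run_agent(f"Research the codebase for: {task}")
    # Phase 2: plan. Fresh context: only the task plus the research artifact.
    artifacts["plan"] = run_agent(f"Plan: {task}\n\n{artifacts['research']}")
    # Phase 3: implement. Fresh context: only the plan.
    artifacts["impl"] = run_agent(f"Implement this plan:\n{artifacts['plan']}")
    return artifacts


artifacts = phased_workflow("add retry logic to the HTTP client")
print(sorted(artifacts))
```

Each hand-off forces a deliberate decision about what the next phase actually needs to know, which is the “intentional” part of the compaction.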
I am now fully convinced that AI for coding is not just for toys and prototypes; done well, it is a deeply technical engineering craft.
One of the themes that emerged at our engineering AI conference recently was how important context is when working with large language models as a software engineer. We’ve moved far past thinking that prompts are all you need.
But there are still many who think that while large language models may one day be useful for software engineers, today they are useful at most for vibe-coding prototypes or toy applications.
Meanwhile, some software developers are exploring how to make the best use of these technologies today and, importantly, sharing what they have learned.
