Using AI to help build effective mental models of codebases
Some practical insights on combining AI with evidence-based techniques for building better mental models of codebases.

At first, a new codebase often seems like a maze.
As software engineers, most of us will often be dropped into a new codebase, or asked to contribute to a feature built by others. You might have documentation, knowledge transfer sessions, or teammates who know the system well, but a large part of the understanding still has to happen in your own head. No matter how good the support is, you still end up spending days trying to build a mental model of how things fit together. At the same time, the pace of development is increasing due to AI-assisted coding. Teams are shipping features faster than before, which means expectations around getting up to speed are changing as well.
If you are in a project or setup where the use of AI tools is allowed, and you are not using them during onboarding, you’re probably making things harder than they need to be.
In this post, I’ll share the general approach I use to combine AI with some evidence-based techniques when I’m learning a fresh codebase or a new area of an existing one.
Start from the domain, not the code
When I start exploring a system, I usually begin from the domain side and only then move into the code. Trying to read source files first and map them back to business behaviour rarely works well at the beginning.
This is a place where large language models are genuinely useful. You can give them project documentation, specs, or even ask them to explore a specific part of the codebase and generate summaries, diagrams, flow charts, or mind maps (Mermaid works well for this).
Personally, I find graphical representations much easier to digest than pages of documentation or hundreds of lines of unfamiliar code.
Decompose and blackbox early
Another principle that shows up everywhere in computing is decomposition: break the problem down into bite-sized pieces, pick a module or feature, and focus on that first.
At the same time, you don’t always need to understand everything in depth from day one. If you try to dig into every edge case and abstraction immediately, it’s easy to get stuck.
A useful technique here is deliberate blackboxing.
When you’re starting out, it’s fine to treat some modules, services, or complex methods as black boxes. You only need to know what goes in, what comes out, and why they exist. Well-designed codebases already encourage this through abstractions. Use these existing abstractions to help navigate the codebase. Once you’re comfortable with the bigger picture, you can come back and look inside those components later.
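To make blackboxing concrete, here is a minimal sketch. All names are hypothetical: the idea is that when you encounter a complex pricing module, you record only its contract (what goes in, what comes out, why it exists) and defer reading the internals.

```python
from typing import Optional

# Deliberate blackboxing: we haven't read the real pricing internals yet.
# We only capture the contract we observed at the call sites, plus a
# stand-in implementation so the interface can be exercised in isolation.
def apply_discount(subtotal: float, coupon_code: Optional[str]) -> float:
    """Black box: given a cart subtotal and an optional coupon code,
    return the discounted total. Internals deferred until the bigger
    picture is clear."""
    # Placeholder behaviour matching what the call sites suggest.
    if coupon_code == "WELCOME10":
        return round(subtotal * 0.9, 2)
    return subtotal
```

The point is not the stand-in logic; it is that a one-line signature plus a docstring is often all the mental model you need for a component on day one.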
Focus on the 20% that matters
In large repositories, only a small portion of the codebase is usually relevant to what you’re doing at any given time. This is essentially the 80–20 rule applied to software systems.
One place where AI helps a lot here is in figuring out what actually matters early on. I’ll often ask something like:
“If I’m working on X, which parts of this repo should I care about first?”
The answer might not cover everything you need, but it’s usually a good starting point. Instead of reading everything, you get a short list of files, modules, or services that are worth your attention right now. Over time, this adds up. You spend more time learning the parts of the system that affect your work, and less time getting lost in code you won’t need.
Ask better questions
Your teammates are usually the best source of truth when you’re learning a system.
One way AI helps here is by improving how you ask questions. Before sending a message, you can use it to check whether your question has enough context, whether it’s clear, and whether you’re accidentally missing something obvious. This often leads to faster, more useful answers and better feedback loops.
Use history for context
Another underrated source of context is version history.
When I don’t understand why a piece of code exists, I’ll often use GitLens to look at the commits or pull requests that introduced it. The discussion or work items around a change usually explain the original problem better than the final implementation alone does.
AI makes this much easier now. You can give a set of related commits, diffs, or PR descriptions and ask for a summary of what changed and why.
Most modern AI-enabled IDEs and CLI tools have some support for this already (via tools, MCPs, etc.), and it’s worth using. Instead of reading dozens of commit messages manually, you get a rough narrative of how a feature evolved. It’s not always accurate, but it’s a very fast way to build a baseline of the historical context.
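If your tooling doesn’t do this for you, it’s easy to script. Here is a minimal sketch: one helper shells out to a standard `git log` invocation for a file, another wraps the raw log into a prompt you could paste into any assistant. The file path and commit format are illustrative choices, not a prescribed workflow.

```python
import subprocess

def file_history(path: str, n: int = 10) -> str:
    """Return the last n one-line commits that touched `path`,
    following the file across renames."""
    cmd = ["git", "log", f"-{n}", "--follow",
           "--format=%h %ad %s", "--date=short", "--", path]
    return subprocess.run(cmd, capture_output=True,
                          text=True, check=True).stdout

def history_prompt(log_text: str, path: str) -> str:
    """Wrap raw git log output into a summarization prompt."""
    return (f"Here are recent commits touching {path}:\n{log_text}\n"
            "Summarize how this file evolved and why, in a few sentences.")
```

Run inside a repo, `history_prompt(file_history("src/pricing.py"), "src/pricing.py")` gives you a ready-made prompt; the summary you get back is a starting narrative, not ground truth.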
Keep your mental model fresh
As you start contributing, it’s normal to lose familiarity with parts of the system you’re not touching every day. If you’ve been working in one area for months, other parts will fade. That’s expected.
What helps is revisiting those areas occasionally, even in a lightweight way. You don’t need a deep dive every time. Skimming code, rereading diagrams, or re-tracing a request flow is often enough.
This is essentially spaced repetition combined with active recall, applied to software systems. Over time, it keeps your mental model from drifting too far out of date.
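The scheduling idea behind spaced repetition is simple enough to sketch in a few lines. This is an illustration, not a tool I’m prescribing: each successful recall of an area doubles the gap before you revisit it, starting from an arbitrary base interval.

```python
from datetime import date, timedelta

def next_review(last_review: date, streak: int, base_days: int = 3) -> date:
    """Spaced-repetition scheduling: each consecutive successful
    recall doubles the interval before the next revisit."""
    return last_review + timedelta(days=base_days * (2 ** streak))
```

Whether you track this in a script, a note, or just a calendar reminder matters less than the principle: areas you recall easily get revisited less often, areas you struggle with come back sooner.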
In the end, learning a codebase isn’t about memorizing files or understanding every abstraction immediately. It’s about building a working mental model, something good enough to guide your decisions and evolve over time. AI doesn’t remove the need for that mental model. If anything, it makes it more important.
A new codebase will probably always feel like a maze at first. That part doesn’t change. What changes is how quickly, and how confidently, you learn to navigate it.