GitNexus: how code knowledge graphs give AI agents architectural awareness
An AI agent that doesn't understand your code architecture will break things. GitNexus solves this by turning repositories into navigable graphs with impact analysis and hybrid search.
Last week, I asked an AI agent to rename a function in a project with 200 files. It renamed it. And broke 14 imports across 9 different files. The agent had no way of knowing that function was used in so many places. It simply couldn’t see the project as a whole.
This is the core problem of using AI to write code in real projects: the agent has no architectural awareness. It sees the file you opened, maybe the neighbors, but doesn’t understand how the pieces connect.
The gap between “writing code” and “understanding code”
LLMs are excellent at generating code function by function. But a software project is not a collection of isolated functions. It’s a network of dependencies, calls, imports and side effects.
When a senior developer modifies a function, they mentally trace: “who calls this?”, “what tests cover this?”, “if I change the signature, what breaks?”. This reasoning depends on a mental model of the entire project.
AI agents don’t have that model. They compensate with text searches (grep, ripgrep), which find strings but don’t understand semantic relationships. Finding every occurrence of validateToken is not the same as understanding that validateToken is called by the authentication middleware, which is used in 47 API routes.
GitNexus: the entire project as a graph
GitNexus attacks this problem head-on. It’s a code intelligence engine that indexes entire repositories into a navigable knowledge graph, where each node is a symbol (function, class, module) and each edge is a relationship (imports, calls, extends).
The project is open-source, has over 7,500 stars on GitHub, and works both via CLI and in the browser (zero installation).
How indexing works
The indexing pipeline has 6 stages:
- Structure mapping - Traverses the file system and maps the directory/file hierarchy
- AST parsing via Tree-sitter - Extracts all symbols (functions, classes, variables, types) using native syntactic parsing. Supports 12 languages: TypeScript, JavaScript, Python, Java, Kotlin, C, C++, C#, Go, Rust, PHP and Swift
- Import and call resolution - Connects symbols: who imports whom, who calls whom
- Functional clustering - Groups related symbols into functional clusters (e.g., “authentication module”, “persistence layer”)
- Execution tracing - Maps complete execution flows across the codebase
- Hybrid search index - Builds BM25 (text search) and semantic indices, combined via Reciprocal Rank Fusion
The result is stored in KuzuDB, an embeddable graph database that runs both natively (CLI) and via WASM (browser). Each repository generates a portable index in the .gitnexus/ folder.
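To make the model concrete, here is a minimal sketch of the node/edge structure described above. The names and types are illustrative, not GitNexus's actual KuzuDB schema:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Symbol:
    qualified_name: str   # e.g. "src/auth.ts::validateToken"
    kind: str             # "function" | "class" | "module"

@dataclass
class CodeGraph:
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # (source, relation, target)

    def add_edge(self, src: Symbol, relation: str, dst: Symbol):
        # Edges are typed: "IMPORTS", "CALLS", "EXTENDS", ...
        self.nodes.update({src, dst})
        self.edges.append((src, relation, dst))

g = CodeGraph()
mw = Symbol("src/middleware.ts::authMiddleware", "function")
vt = Symbol("src/auth.ts::validateToken", "function")
g.add_edge(mw, "CALLS", vt)
```

A real graph database adds indexing and a query language on top, but the data model is exactly this: symbols as nodes, typed relationships as edges.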
7 tools via MCP
What makes GitNexus particularly useful for AI agents is exposing the graph via MCP (Model Context Protocol). There are 7 tools that any compatible agent can use:
| Tool | What it does |
|---|---|
| list_repos | Discovers indexed repositories |
| query | Process-grouped hybrid search |
| context | Symbol relationships and participations |
| impact | Blast radius analysis (upstream + downstream) |
| detect_changes | Impact analysis based on git diff |
| rename | Coordinated rename across multiple files |
| cypher | Direct Cypher queries on the graph |
The impact tool is the most powerful. You ask “what’s the impact of changing validateToken?” and get: all functions that call it (upstream), all functions it calls (downstream), and the blast radius — how many files and processes are affected. This is exactly the information the agent needed before that rename that broke 14 files.
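At its core, this kind of impact analysis is a graph traversal in two directions. A minimal sketch over a plain adjacency list (the call graph here is invented for illustration; the real tool traverses the KuzuDB graph):

```python
from collections import defaultdict, deque

def blast_radius(calls, target):
    """Return (upstream, downstream) sets for `target`.

    upstream   = everything that transitively calls `target`
    downstream = everything `target` transitively calls
    """
    # Invert the caller -> callee map to walk upstream
    callers = defaultdict(list)
    for src, dsts in calls.items():
        for dst in dsts:
            callers[dst].append(src)

    def reachable(adj, start):
        seen, queue = set(), deque([start])
        while queue:
            for nxt in adj.get(queue.popleft(), []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    return reachable(callers, target), reachable(calls, target)

# Toy call graph (invented names)
calls = {
    "authMiddleware": ["validateToken"],
    "apiRoute": ["authMiddleware"],
    "validateToken": ["decodeJwt"],
}
up, down = blast_radius(calls, "validateToken")
# up contains the transitive callers, down the transitive callees
```

The blast radius is then just the size of `up ∪ down`, optionally grouped by file or process.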
Hybrid search: why grep isn’t enough
Most code tools for AI rely on pure text search. GitNexus combines two retrieval methods and a fusion step:
BM25 (lexical search): finds exact term matches. Good for function names, variables, specific strings.
Semantic search: finds conceptually similar code even with different names. If you search “validate credentials”, it finds checkUserAuth even though the words don’t match.
Reciprocal Rank Fusion (RRF): combines BM25 and semantic rankings without normalizing scores. Each source votes on results and RRF generates a weighted final ranking. It’s the same algorithm we use in MCP Context Hub to combine vector and text search — it works well precisely because it doesn’t require different sources to use the same scale.
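RRF is short enough to show in full. Each ranked list contributes 1/(k + rank) per result, so only positions matter, never raw scores; the constant k=60 comes from the original RRF paper. The query results below are invented:

```python
def rrf(rankings, k=60):
    """Fuse several ranked lists (best first) via Reciprocal Rank Fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            # Rank-based contribution: no score normalization needed
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["validateToken", "parseHeader", "checkUserAuth"]
semantic = ["checkUserAuth", "validateToken", "loginHandler"]
fused = rrf([bm25, semantic])
# validateToken wins: first in one list, second in the other
```

Because each source only votes through ranks, a BM25 score of 14.2 and a cosine similarity of 0.83 fuse cleanly without any calibration.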
Results come grouped by process. Instead of a flat list of functions, you get: “in the authentication process, these functions are relevant; in the logging process, these others”. This gives the agent a functional view, not just structural.
Zero-server: everything runs on the client
An architectural decision worth highlighting: GitNexus has no server. All processing happens locally — in the terminal via CLI or in the browser via WebAssembly.
This solves two practical problems:
- Privacy: your code never leaves your machine. In corporate projects, this is a requirement, not a differentiator.
- Latency: without server round-trips, graph queries return in milliseconds. An impact query on a 2,000-file repository returns in under 100ms.
The trade-off is that initial indexing takes longer (depends on project size and local hardware), but it’s done once and updated incrementally.
Web UI: interactive visualization
Beyond CLI and MCP, GitNexus offers a web interface where you can see your project’s graph. Each node is a symbol, each edge is a relationship, and you can:
- Navigate through functional clusters
- Click a node to see its code and relationships
- Filter by type (function, class, module)
- Search symbols by name or concept
This is useful not just for AI, but for developer onboarding. Instead of reading documentation (which is almost never up-to-date), a new team member can open the graph and understand the architecture in minutes.
The Web UI works 100% in the browser — you can drag and drop a folder or ZIP of your project and it indexes locally via WASM. No installation needed.
How to integrate with your agent
Claude Code
```shell
npx gitnexus analyze
npx gitnexus setup
```
The setup automatically detects the editor/agent and configures the MCP server. For Claude Code, it registers the server and installs skills that teach the agent to use the tools intelligently (e.g., always run impact before a refactor).
Cursor and Windsurf
The same setup command also configures these editors. Integration is via MCP — the same 7 tools become available in the agent’s context.
Multi-repo
If you work with microservices or monorepos, GitNexus maintains a global registry at ~/.gitnexus/registry.json. You can index multiple repositories and the agent can search relationships across them.
Where this fits in the ecosystem
GitNexus solves the problem of understanding code. It doesn’t generate code, doesn’t compress context, doesn’t cache. It maps relationships.
This makes it complementary to tools like MCP Context Hub (which optimizes and persists context) and Context7 (which fetches library documentation). In practice, an intelligence stack for AI agents might have:
- GitNexus: understands the structure and relationships of your code
- Context Hub: persists decisions, compresses context, does semantic caching
- Context7: brings up-to-date documentation for external dependencies
Each solves a different dimension of the context problem. Together, they transform an agent that “writes code” into one that “understands the project”.
Technical details: how the graph is built
Understanding the graph construction process reveals why GitNexus produces reliable results. The core of the pipeline relies on Tree-sitter, a parser generator that produces concrete syntax trees. Unlike regex-based approaches, Tree-sitter understands the actual grammar of each language, which means it correctly handles edge cases like nested functions, decorators, generics, and complex import patterns.
For each file, Tree-sitter produces an AST (Abstract Syntax Tree). GitNexus walks this tree and extracts three categories of information:
Declarations: every function, class, method, type alias, interface, enum, and constant. Each gets a unique identifier based on file path and symbol name, along with metadata like line numbers, visibility (public/private), and documentation comments.
References: every usage of a declared symbol. Function calls, type annotations, variable references, import statements. Each reference creates a directed edge in the graph from the referencing symbol to the declared symbol.
Structural relationships: class inheritance, interface implementation, module re-exports, decorator applications. These create typed edges that carry semantic meaning beyond simple “uses” relationships.
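GitNexus does this with Tree-sitter across 12 languages. To show the declaration/reference extraction idea in a self-contained way, here is a sketch using Python's stdlib `ast` module on Python source only (a deliberate simplification, not how GitNexus itself parses):

```python
import ast

source = """
def decode_jwt(token): ...

def validate_token(token):
    return decode_jwt(token)
"""

tree = ast.parse(source)
declarations, references = [], []
for node in ast.walk(tree):
    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
        # Declaration: symbol name plus location metadata
        declarations.append((node.name, node.lineno))
    elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        # Reference: a directed CALLS edge from the enclosing symbol
        references.append(node.func.id)
```

A production indexer additionally resolves each reference to the declaration it targets (handling imports, shadowing, and re-exports), which is where most of the difficulty lives.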
The resulting graph for a typical 500-file TypeScript project contains around 15,000-25,000 nodes and 40,000-80,000 edges. Despite this density, KuzuDB handles traversal queries in single-digit milliseconds because graph databases are optimized for exactly this kind of relationship traversal.
The clustering algorithm
The functional clustering step deserves special attention because it transforms a flat symbol graph into a meaningful architectural map. GitNexus uses a community detection algorithm (similar to Louvain) that identifies groups of symbols with high internal connectivity and low external connectivity.
In practice, this naturally discovers architectural boundaries. Symbols that form the “authentication module” cluster together because they reference each other frequently but have fewer connections to, say, the “payment processing” cluster. The algorithm doesn’t know anything about your architecture. It discovers it from the code relationships.
Each cluster gets a label generated from the symbol names and file paths within it. The labels aren’t always perfect, but they’re surprisingly accurate for well-structured codebases. For a messy codebase with high coupling, the clusters reveal exactly where the architectural boundaries are blurred.
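To illustrate the principle, here is label propagation, a community detection method even simpler than Louvain: every node repeatedly adopts the most common label among its neighbors, so densely connected symbols converge on a shared label. The graph and node names are invented; this is not GitNexus's actual algorithm:

```python
def label_propagation(edges, rounds=10):
    """Detect communities by synchronous label propagation with
    deterministic tie-breaking (keep the current label when tied)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    labels = {n: n for n in neighbors}  # start with unique labels
    for _ in range(rounds):
        new = {}
        for n, nbrs in neighbors.items():
            counts = {}
            for m in nbrs:
                counts[labels[m]] = counts.get(labels[m], 0) + 1
            best = max(counts.values())
            top = sorted(l for l, c in counts.items() if c == best)
            new[n] = labels[n] if labels[n] in top else top[0]
        if new == labels:  # converged
            break
        labels = new
    return labels

# Two dense triangles ("auth" and "payments") joined by a single edge
edges = [("login", "validate"), ("validate", "jwt"), ("login", "jwt"),
         ("charge", "invoice"), ("invoice", "refund"), ("charge", "refund"),
         ("jwt", "charge")]
labels = label_propagation(edges)
# The two triangles end up with two distinct labels
```

The single cross-edge is not enough to merge the clusters: each node has more internal than external neighbors, which is exactly the "high internal, low external connectivity" criterion described above.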
Performance: what to expect
I’ve indexed projects of various sizes to give realistic benchmarks:
| Project size | Files | Initial indexing | Incremental update | Graph query (impact) |
|---|---|---|---|---|
| Small (50 files) | 50 | 3 seconds | < 1 second | < 5ms |
| Medium (500 files) | 500 | 25 seconds | 2-3 seconds | 10-30ms |
| Large (2,000 files) | 2,000 | 2 minutes | 5-10 seconds | 50-100ms |
| Very large (10,000 files) | 10,000 | 8-12 minutes | 20-30 seconds | 100-300ms |
These numbers are from a MacBook Pro M3 with 16GB RAM. The important thing is that incremental updates are fast because GitNexus only re-indexes files that changed since the last indexing. In a typical development workflow where you modify 5-10 files between indexing runs, the update takes a few seconds regardless of project size.
For comparison, a full-text search (grep/ripgrep) across a 2,000-file project takes about 200-500ms. GitNexus graph queries are faster for relationship-based questions (“what calls this function?”) and slower for pure text matching (“find all TODO comments”). The right tool depends on the question.
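Incremental updates of this kind are typically implemented by fingerprinting file contents and re-parsing only what changed. A sketch of that mechanism (not GitNexus's actual code; the `*.py` filter is illustrative):

```python
import hashlib
from pathlib import Path

def changed_files(root, previous_hashes):
    """Return (changed paths, fresh hash map). Only the changed paths
    need re-parsing; graph data for untouched files is reused."""
    current = {}
    for path in Path(root).rglob("*.py"):
        current[str(path)] = hashlib.sha256(path.read_bytes()).hexdigest()
    changed = {p for p, h in current.items() if previous_hashes.get(p) != h}
    return changed, current
```

Persisting the hash map between runs is what makes the update cost proportional to the diff, not the repository size.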
Future possibilities
The knowledge graph approach opens doors that simple code search cannot. Here are directions I think are particularly promising:
Cross-repository dependency analysis
Most real projects depend on internal libraries, shared packages, or microservices. If GitNexus can index multiple repositories and connect them through their API boundaries, you get a cross-service dependency graph. Imagine asking “if I change the response format of the user-service /profile endpoint, what breaks?” and getting a complete impact analysis across all consuming services. Some of this already works with the multi-repo registry, but the cross-repo relationship detection is still basic.
Historical graph analysis
Git history contains a wealth of architectural information. Which files change together frequently? (They’re probably tightly coupled.) Which functions get modified most often? (They’re likely bug-prone or under-specified.) By combining the code graph with git history, GitNexus could identify architectural health metrics: coupling trends, change hotspots, and architectural erosion over time.
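The co-change signal is cheap to compute once commits are parsed (e.g. from `git log --name-only`). A sketch with an invented history, just to show the counting:

```python
from collections import Counter
from itertools import combinations

def co_change_coupling(commits):
    """Count how often pairs of files appear in the same commit.
    `commits` is a list of file lists, one per commit. Frequently
    co-changing pairs are candidates for hidden coupling."""
    pairs = Counter()
    for files in commits:
        for pair in combinations(sorted(set(files)), 2):
            pairs[pair] += 1
    return pairs

# Invented history: auth.ts and session.ts almost always change together
commits = [
    ["auth.ts", "session.ts"],
    ["auth.ts", "session.ts", "routes.ts"],
    ["routes.ts"],
    ["auth.ts", "session.ts"],
]
coupling = co_change_coupling(commits)
# ("auth.ts", "session.ts") co-changed 3 times
```

Overlaying these counts on the code graph would surface pairs that change together often yet have no structural edge between them — a classic smell of implicit coupling.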
AI-assisted architecture review
With a complete code graph, an AI agent could perform architectural reviews that go beyond code style. “This module has 47 incoming dependencies, making it a high-risk change target. Consider splitting it.” Or “These two clusters have 23 cross-connections but no shared interface, suggesting a missing abstraction.” These are insights that senior architects provide during design reviews, but they require exactly the kind of structural awareness that a knowledge graph provides.
Test coverage mapping
By combining the code graph with test coverage data, GitNexus could answer questions like “what’s the test coverage of the blast radius of this change?” Not just “is this function tested?” but “are all the functions that depend on this function also tested?” This would transform impact analysis from a structural question into a risk assessment question.
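The combination itself is simple set arithmetic once both inputs exist. A sketch, assuming the blast radius and the covered-symbol set are already available (symbol names invented):

```python
def blast_radius_coverage(radius, covered):
    """Return (coverage ratio, untested symbols) for a change's blast
    radius. `radius` and `covered` are sets of symbol names, e.g. from
    an impact query and a coverage report respectively."""
    untested = radius - covered
    ratio = (1 - len(untested) / len(radius)) if radius else 1.0
    return ratio, untested

radius = {"authMiddleware", "apiRoute", "sessionStore"}
covered = {"authMiddleware", "apiRoute", "loginHandler"}
ratio, untested = blast_radius_coverage(radius, covered)
# 2 of the 3 affected symbols are covered; sessionStore is the risk
```

The hard part in practice is name alignment: coverage reports speak in file/line terms, the graph in symbol terms, so a mapping layer sits between them.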
Honest limitations
No tool is perfect, and it’s important to be transparent:
- Dynamic languages: call resolution in Python and JavaScript relies on static parsing, which doesn’t capture dynamic calls (e.g., getattr, eval). Coverage is good, but not 100%.
- Non-commercial license: GitNexus uses the PolyForm Noncommercial license. You can use it freely in personal and open-source projects, but commercial use requires a separate license.
- Initial indexing: on very large projects (100k+ files), the first indexing can take a long time. After that, updates are incremental.
Why this matters now
We’re at an inflection point. AI agents are evolving from “autocomplete assistants” to “autonomous developers” that navigate, understand and modify entire codebases.
But autonomy without architectural awareness is dangerous. An agent that modifies code without understanding the blast radius will create more bugs than it solves. GitNexus is part of the infrastructure that makes autonomous agents reliable, not just fast.
The future of AI-powered development isn’t about bigger models. It’s about giving existing models the right context. And for code, the right context is a graph.