The graph that doesn't think

A few weeks ago, in an AI engineering group, someone shared a screenshot of their Obsidian vault with the caption “my context for agents.” It was all there: the colorful graph, the interconnected notes, the beautiful map of thoughts. Three different people messaged me privately asking how to respond without embarrassing the poster.

There is no polite answer. There is only reality: that screenshot told me exactly where that person stands in their understanding of how context works in LLMs. It’s a tell. Like in poker.

This isn’t about Obsidian being good or bad. Obsidian is an honest tool and does what it sets out to do quite well. The problem is different: when you point to an Obsidian vault as your “AI context layer,” you’re revealing that the words “context,” “graph,” and “knowledge” still carry for you the meaning they have in productivity marketing, not the technical meaning that matters when you’re building systems with LLMs.

This post is for anyone who wants to cross that line.

The signal you don’t realize you’re sending

PKM (Personal Knowledge Management) vocabulary became popular at the same time as AI vocabulary, and the words overlapped in ways that create serious confusion.

“Knowledge graph.” In Obsidian, that’s a visualization of your wikilinks. In AI, a knowledge graph is a structure with typed entities, typed relations, logical inference, and queries in SPARQL or Cypher. These are two completely different objects sharing the same name.
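To make the difference concrete, here is a minimal sketch of what “typed relations plus inference” means, written in plain Python rather than SPARQL or Cypher. The entities, the predicates `is_a` and `subclass_of`, and the helper functions are all invented for illustration; a real knowledge graph would live in a triple store or graph database.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical toy facts, made up for this example.
TRIPLES = [
    Triple("Qdrant", "is_a", "VectorDatabase"),
    Triple("VectorDatabase", "subclass_of", "Database"),
    Triple("Database", "subclass_of", "Software"),
]


def superclasses(cls: str, triples: list[Triple]) -> set[str]:
    """Follow subclass_of edges transitively from `cls`."""
    found: set[str] = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for t in triples:
            if (
                t.subject == current
                and t.predicate == "subclass_of"
                and t.obj not in found
            ):
                found.add(t.obj)
                frontier.append(t.obj)
    return found


def types_of(entity: str, triples: list[Triple]) -> set[str]:
    """Return the entity's declared types plus every inferred supertype."""
    direct = {
        t.obj for t in triples if t.subject == entity and t.predicate == "is_a"
    }
    inferred: set[str] = set()
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return direct | inferred


print(types_of("Qdrant", TRIPLES))
```

Nobody wrote down that Qdrant is software. The answer falls out of the typed edges. A wikilink has no type, so there is nothing to infer from.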

“Context.” In Obsidian, context is the feeling that your notes are connected. In the world of LLMs, context is the content that enters the model’s token window. What isn’t in the window doesn’t exist for the model. What’s in the window but is irrelevant degrades the quality of the response.

“Second brain.” In PKM, it’s a motivational metaphor. In AI agents, if you want something that functions as long-term memory for an agent, you need semantic retrieval, vector persistence, and some strategy for controlled forgetting.

When someone uses these words as if they were equivalent, it becomes visible that they’re navigating by tool marketing, not by the engineering underneath. That’s the tell.

The graph is decorative

To understand why, it’s worth separating the types of graphs that exist in the information field. The visual similarity between them is deceptive.

Hyperlink graph (Obsidian, Roam, Logseq): each connection is a manual decision. You type [[another note]] and the line appears. Syntax, without semantics. The system doesn’t know that “productivity” and “focus” are related unless you create that link with your own hands.

Concept map: also manual, but with named predicates. “X causes Y”, “X is a type of Y.” There’s explicit semantics, though static and with coverage limited to what you formalized.

Knowledge graph (Wikidata, DBpedia, Neo4j with schema): typed entities, typed relations, structured queries. Inference possible within the defined schema. This is what engineers actually call a knowledge graph.

Embedding space (RAG, semantic search): no explicit links. Relations emerge from the vector proximity between representations of meaning. The system discovers that “productivity” and “focus” are close without you having created that link anywhere.

Comparing the actual capabilities of each architecture:

Architecture	Automatic connections	Semantic meaning	Inference
Obsidian (hyperlinks)	0/10	0/10	0/10
Concept map	0/10	6/10	1/10
Knowledge graph	4/10	8/10	7/10
Embeddings / RAG	9/10	9/10	5/10

Plain Obsidian scores near zero on almost everything associated with system “intelligence.” This isn’t a critique of the tool. It’s a property of the architecture. Plain Markdown with wikilinks has no way to perform inference. The pretty graph is a side effect of counting links you manually created, not an emergent cognitive capability.
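A rough sketch of what a hyperlink graph amounts to, assuming the common `[[target]]`, `[[target|alias]]`, and `[[target#heading]]` wikilink forms. The regex, the `link_graph` helper, and the sample notes are mine, not Obsidian’s implementation.

```python
import re
from collections import defaultdict

# Captures the link target; an optional #heading and |alias are skipped.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def link_graph(notes: dict[str, str]) -> dict[str, set[str]]:
    """Map each note title to the set of titles it links to by hand."""
    graph: defaultdict[str, set[str]] = defaultdict(set)
    for title, body in notes.items():
        for match in WIKILINK.finditer(body):
            graph[title].add(match.group(1).strip())
    return dict(graph)


# Made-up vault contents for illustration.
NOTES = {
    "productivity": "Deep work matters. See [[deep work|this note]].",
    "focus": "Attention is a limited resource.",
    "deep work": "Related: [[focus]] and [[productivity#habits]].",
}

for source, targets in link_graph(NOTES).items():
    print(source, "->", sorted(targets))
```

The “productivity” note never reaches “focus” directly, because nobody typed that link. The whole graph is a regex and a dictionary.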

Principle. A hyperlink graph answers the question “what did you connect?” An embedding system answers the question “what is semantically close?” These are fundamentally different questions.
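The second question, “what is semantically close?”, comes down to comparing vectors. A minimal sketch, with made-up three-dimensional vectors standing in for what a real embedding model would produce (real embeddings have hundreds or thousands of dimensions, and the numbers below are invented):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: near 1.0 for the same direction, 0.0 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# ASSUMPTION: hand-written toy vectors, not output from any real model.
VECTORS = {
    "productivity": [0.9, 0.8, 0.1],
    "focus": [0.8, 0.9, 0.2],
    "sourdough": [0.1, 0.0, 0.9],
}


def nearest(term: str, vectors: dict[str, list[float]]) -> str:
    """Return the other term whose vector is closest to `term`."""
    others = (name for name in vectors if name != term)
    return max(others, key=lambda name: cosine(vectors[term], vectors[name]))


print(nearest("productivity", VECTORS))
```

No link between “productivity” and “focus” exists anywhere in this code. The proximity is a property of the vectors.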

The collector’s paradox

In 2014, Christian Tietze described what he called the “collector’s fallacy” in an influential essay on Zettelkasten. The idea is precise: capturing information gives an immediate feeling of intellectual progress that gets confused with actually learning it, processing it, or integrating it with what you already know.

Saving an article is easy. Reading it, integrating it, refuting it, citing it in your own text is hard. The system rewards the first behavior and rarely the second.

The result is predictable: notes created grow in an almost linear fashion over time. Notes you actually return to flatten within months and never recover. What a user has after two years isn’t an extended brain. It’s an organized cemetery of half-read articles.

This pattern exists in any note-taking tool, but it intensifies in Obsidian because the visual interface rewards accumulation. That colorful graph getting denser looks like progress. It isn’t. It’s accumulation. These are different things.

When someone says “I have a thousand notes in Obsidian about machine learning,” the relevant question isn’t “how many notes do you have?” It’s: “in how many of them can you perform automatic semantic retrieval from a natural language query?” The answer, in plain Obsidian, is zero.

What “AI context” actually means

When an engineer talks about “context” in the technical sense, they’re talking about something specific: the content that enters the model’s context window. That window has a limit in tokens. What’s in it, the model sees. What isn’t, doesn’t exist for it.
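A minimal sketch of that constraint, assuming chunks arrive already ranked by relevance. The word count is a crude stand-in for a real tokenizer, and `estimate_tokens` and `pack_context` are hypothetical helper names.

```python
def estimate_tokens(text: str) -> int:
    # ASSUMPTION: one token per whitespace-separated word.
    # A real system would use the model's own tokenizer.
    return len(text.split())


def pack_context(ranked_chunks: list[str], budget: int) -> list[str]:
    """Keep chunks in rank order until the next one would exceed the budget."""
    selected: list[str] = []
    used = 0
    for chunk in ranked_chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break  # everything after this point never reaches the model
        selected.append(chunk)
        used += cost
    return selected


chunks = ["most relevant chunk here", "second best chunk", "a long tail of filler text"]
print(pack_context(chunks, budget=7))
```

Whatever falls outside the returned list is, from the model’s point of view, nonexistent.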

The art of working with LLMs is, to a large extent, the art of deciding what to put in the window. The dimensions that matter:

1. Semantic retrieval. Finding relevant content even when the exact words don’t match. If a user asks about “latency in distributed systems” and you have a note about “response time in microservices,” a semantic retrieval system finds that note. A substring search in Obsidian doesn’t.

2. Retrieval precision. Throwing a thousand irrelevant tokens into an LLM’s context doesn’t help. It hurts. The model has more noise to filter. Research on “lost in the middle” (Liu et al., 2023) shows that LLMs degrade when context is long and relevant content is buried in the middle. Precise retrieval is better than generous retrieval.

3. Continuous update. A useful memory system for agents needs to incorporate new information without manual reorganization. Embeddings in a vector store allow this. An Obsidian vault doesn’t.

4. Synthesis. The ideal system doesn’t just find relevant fragments. It combines them to produce something that wasn’t explicitly written anywhere.
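Point 3, continuous update, is the easiest of these to picture in code. The class below is a hypothetical in-memory stand-in for a vector database, not the API of pgvector, Qdrant, Weaviate, or Chroma, and the two-dimensional vectors are invented. It only shows the shape of the operation: new or edited content is upserted by id and is searchable immediately, with no reorganization.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Toy store: maps a document id to its (vector, text) pair."""

    def __init__(self) -> None:
        self._rows: dict[str, tuple[list[float], str]] = {}

    def upsert(self, doc_id: str, vector: list[float], text: str) -> None:
        # An existing id is overwritten; a new id adds a row.
        self._rows[doc_id] = (vector, text)

    def search(self, query_vector: list[float], k: int = 3) -> list[str]:
        """Return the ids of the k rows closest to the query vector."""
        ranked = sorted(
            self._rows.items(),
            key=lambda item: cosine(query_vector, item[1][0]),
            reverse=True,
        )
        return [doc_id for doc_id, _ in ranked[:k]]

    def __len__(self) -> int:
        return len(self._rows)


store = InMemoryVectorStore()
store.upsert("latency", [0.9, 0.1], "response time in microservices")
store.upsert("bread", [0.1, 0.9], "sourdough hydration")
print(store.search([1.0, 0.0], k=1))
```

In a real pipeline the vector passed to `upsert` would come from the embedding model each time a note is created or edited.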

Comparing cognitive capabilities:

Capability	Obsidian (plain)	RAG System	Human brain
Fast capture	9/10	7/10	8/10
Semantic retrieval	1/10	9/10	10/10
Inference	0/10	4/10	10/10
Synthesis	0/10	6/10	10/10
Continuous update	2/10	7/10	10/10

Obsidian excels at capture. It fails at everything an AI context system needs to do.

Tool is not context

The confusion has an epistemic root. The method (CODE, PARA, Zettelkasten) and the tool (Obsidian) were fused together in the public imagination as if they were the same thing. They aren’t. The method is cognitively real and tested. The tool is just one of many possible supports for the method.

Luhmann produced roughly 90,000 paper cards and one of the most sophisticated works of social theory of the 20th century. No graph, no backlinks, no plugins. The method produced the result. The support was just paper and boxes.

Now, when the method migrates to the AI context, the tool matters differently. Not because Obsidian is inadequate as a writing tool, but because the operations an AI agent needs to perform on a knowledge base (semantic retrieval, reranking by relevance, intelligent chunking, vector persistence) are not available in plain Markdown files with wikilinks.

Dumping your entire vault into an LLM prompt isn’t using your vault as context. It’s using your vault as noise.

What separates amateurs from practitioners in practice

Let me be direct about what I see in the projects I work on.

The amateur has a collection of notes in Obsidian and, on deciding to “use AI,” starts thinking about how to connect that vault to Claude or GPT. Sometimes they export everything as text and paste it into the prompt. Sometimes they find a plugin that promises to do this. The result is slow, expensive in tokens, and imprecise.

The practitioner builds a pipeline. The basic steps:

Ingestion: each document is broken into chunks of controlled size (typically 512 to 1024 tokens, with overlap to avoid losing context between chunks).
Embedding: each chunk is transformed into a vector by an embedding model (OpenAI’s text-embedding-3-small, or open-source models like nomic-embed-text).
Storage: vectors live in a vector database. pgvector in Postgres, Qdrant, Weaviate, or Chroma. The choice depends on scale and operational requirements.
Retrieval: at response time, the user’s query also becomes a vector. The N semantically closest chunks are retrieved (cosine similarity). Typically 5 to 20 chunks.
Reranking: optionally, a reranking model reorders the chunks by relevance before sending them to the LLM.
Generation: only then do the selected chunks enter the LLM’s context alongside the query.
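The steps above can be sketched end to end in a few dozen lines. This is a toy under loud assumptions: chunks are measured in words rather than tokens and are far smaller than the 512 to 1024 range, `toy_embed` is a word-count stand-in that captures only lexical overlap (a real embedding model is what makes retrieval semantic), the reranking step is skipped, and the “generation” step stops at building the prompt. All function names are hypothetical.

```python
import math
from collections import Counter


def chunk_words(text: str, size: int = 10, overlap: int = 2) -> list[str]:
    """Split text into windows of `size` words sharing `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text: str) -> Counter:
    # ASSUMPTION: lowercase word counts instead of a real embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    query_vec = toy_embed(query)
    ranked = sorted(
        chunks, key=lambda c: cosine(query_vec, toy_embed(c)), reverse=True
    )
    return ranked[:k]


def build_prompt(query: str, context_chunks: list[str]) -> str:
    """Assemble what actually enters the model's context window."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


document = (
    "pgvector adds a vector column type to Postgres so embeddings sit next "
    "to relational data sourdough starter needs regular feeding with flour "
    "and water to stay active and rise well"
)
question = "vector column in Postgres"
print(build_prompt(question, retrieve(question, chunk_words(document), k=1)))
```

Swapping `toy_embed` for a call to a real embedding model and the list of chunks for a vector database gives the same shape at production scale.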

The result: the model receives only what is relevant. The context window is used efficiently. The quality of the response is superior because the signal is clean.

This isn’t complicated to implement. LangChain and LlamaIndex have abstractions for the whole pipeline, and the Anthropic SDK covers the generation step. A working pipeline takes less than a day of work for someone with experience. What separates those who do it from those who don’t isn’t technical ability. It’s knowing the problem exists.

Where Obsidian doesn’t hurt

To be fair: Obsidian is excellent at what it actually does.

Local Markdown files mean you aren’t locked into any company. Ten years from now your files will still open in any text editor. The extensibility via plugins is robust and the community is active. The writing interface is quiet, distraction-free, and respects the flow of someone writing.

For capture, drafting, and personal organization of ideas, Obsidian is a solid choice. I use it to draft posts, annotate references, keep experiment logs. It works well for that.

The problem isn’t using Obsidian. It’s confusing Obsidian with a context layer for AI systems.

In the architecture of a well-built agent, Obsidian can be the capture front-end: where you write notes that are then ingested by the embeddings pipeline. Obsidian notes feed the vector store. The vector store feeds the agent. Obsidian doesn’t do retrieval. It does capture. These are different roles in the same chain.
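A rough sketch of that hand-off, assuming the vault is just a folder of `.md` files. `load_vault` and `plain_text` are hypothetical helpers, and the regex handles only the `[[target]]` and `[[target|alias]]` forms. The output is a list of records ready to be chunked and embedded downstream.

```python
import re
from pathlib import Path

# [[target]] or [[target|alias]]; the alias is what the reader sees.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def plain_text(markdown: str) -> str:
    """Replace each wikilink with its visible text before embedding."""
    return WIKILINK.sub(lambda m: m.group(2) or m.group(1), markdown)


def load_vault(vault_dir: str) -> list[dict[str, str]]:
    """Read every Markdown file under vault_dir into an ingestion record."""
    records = []
    for path in sorted(Path(vault_dir).rglob("*.md")):
        records.append(
            {
                # Path relative to the vault works as a stable document id.
                "id": path.relative_to(vault_dir).as_posix(),
                "text": plain_text(path.read_text(encoding="utf-8")),
            }
        )
    return records
```

Calling `load_vault` on the vault folder and passing each record’s text to the chunking step is all the coupling the two systems need. Obsidian never has to know the pipeline exists.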

The plugins change the game, but the credit goes elsewhere

Plugins like Smart Connections and Obsidian Copilot add semantic search to the vault. When these plugins work well, the experience changes: you ask in natural language and find relevant notes you wouldn’t have found by substring.

This is genuinely useful. But it’s important to understand where the usefulness comes from.

Smart Connections uses an embedding model to vectorize your notes: depending on how it is configured, a local model or a hosted one such as OpenAI’s. The semantic search it offers is that embedding model running on your files, exposed in an interface inside Obsidian. Obsidian is the container. The semantic work is done by the embedding model.

When you understand this, the hierarchy becomes clear. Obsidian is the storage and interface layer. The embedding and language models are the cognitive layer. Mixing the two is what creates the illusion that the graph thinks.

The graph doesn’t think. The model thinks. The graph is the visualization of what you typed.

The next step

If you’re building with AI and still use Obsidian as your context layer, this isn’t a judgment. It’s a diagnosis. And the diagnosis has a short treatment.

Embeddings. Chunking. Retrieval. Reranking. These four concepts, well understood, separate the experience of “throw text in the prompt and hope” from “the agent finds exactly what it needs and responds with precision.”

The Obsidian vault can keep existing. But it stops being the final destination of your knowledge and becomes the entry point of a pipeline that actually retrieves, semantically and with relevance, what you need at the moment you need it.

Then the graph stops being decoration. And starts doing work.

References

Forte, T. (2022). Building a Second Brain. Atria Books.

Tietze, C. (2014). “The Collector’s Fallacy”. zettelkasten.de. zettelkasten.de/posts/collectors-fallacy

Ahrens, S. (2017). How to Take Smart Notes. CreateSpace.

Luhmann, N. (1992). “Kommunikation mit Zettelkästen”. In Universität als Milieu. Haux.

Liu, N. F. et al. (2023). “Lost in the Middle: How Language Models Use Long Contexts”. arXiv:2307.03172.

Tables are qualitative illustrations of the arguments, not empirical data. Scores reflect architectural capabilities documented in the literature on PKM and information retrieval systems.