Teach Your Zettelkasten Notes to Talk Back

Article summary

The Basic Idea
How It Works
The Interface
What It's Like in Practice
What Came Out of Building This
Try It

Years ago, I started a Zettelkasten-style collection of notes using Emacs’ org-roam package. A Zettelkasten system typically comprises hundreds of linked notes covering every topic of interest to its maintainer (if you’d like to learn more about such systems, I recommend the book How To Take Smart Notes by Sönke Ahrens). It can be a great system… until you need to actually find something.

The problem isn’t search. Org-roam has search. The problem is that search requires you to remember how you phrased something. You know you wrote about “the Result objects pattern” in TypeScript three months ago, but did you file it under “coding patterns” or “TypeScript”? Was the note titled something completely different because you were thinking about it in the context of a specific project? You end up grep-ing through your own brain trying to reconstruct your past self’s vocabulary. At some point, the notes become an archive you trust but can’t use.

The Basic Idea

Retrieval-Augmented Generation (RAG) is a pattern that’s gotten a lot of attention in the context of knowledge bases and support bots. The idea is simple: instead of asking an LLM to answer from its training data, you first search your own documents for the passages relevant to the question, hand those to the LLM, and ask it to answer using only those. The LLM handles understanding and synthesis, and the search handles grounding.

It occurred to me that this was exactly what I needed for my notes to better surface my own thinking. The output should read like a summary of things I already know, but couldn’t immediately put my hands on.

So I built a little weekend project called Brainpicker.

How It Works

The pipeline has a few stages, and most of the interesting problems are in the data preparation, not the AI.

Parsing. Org files are structured plain text, but “structured” does a lot of work there: headings, property drawers, tags, backlinks, todo states, and so on. A parser needs to handle all of it gracefully and pull out the metadata that will be useful later (note titles, tags, org-roam IDs).
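To give a feel for the parsing step, here is a minimal sketch that pulls the org-roam ID, title, and file tags out of a note. It is not Brainpicker’s actual parser: the NoteMeta shape and the parseOrgNote name are made up for illustration, and it assumes the common org-roam layout of a top-of-file property drawer followed by #+title and colon-delimited #+filetags lines.

```typescript
// Hypothetical metadata shape; the field names are assumptions for this sketch.
interface NoteMeta {
  id: string | null;
  title: string | null;
  tags: string[];
  body: string;
}

function parseOrgNote(source: string): NoteMeta {
  let id: string | null = null;
  let title: string | null = null;
  let tags: string[] = [];
  const body: string[] = [];
  let inDrawer = false;

  for (const line of source.split(/\r?\n/)) {
    const trimmed = line.trim();

    // Property drawers run from :PROPERTIES: to :END:.
    if (/^:PROPERTIES:$/i.test(trimmed)) {
      inDrawer = true;
      continue;
    }
    if (inDrawer) {
      if (/^:END:$/i.test(trimmed)) {
        inDrawer = false;
        continue;
      }
      // Keep only the first :ID: property (the file-level one).
      const prop = /^:([^:\s]+):\s*(.*)$/.exec(trimmed);
      if (prop && id === null && (prop[1] ?? "").toUpperCase() === "ID") {
        id = (prop[2] ?? "").trim();
      }
      continue;
    }

    // File-level keywords such as "#+title:" and "#+filetags:".
    const keyword = /^#\+(\w+):\s*(.*)$/.exec(trimmed);
    if (keyword) {
      const key = (keyword[1] ?? "").toLowerCase();
      const value = (keyword[2] ?? "").trim();
      if (key === "title") title = value;
      if (key === "filetags") {
        // Assumes the ":tag1:tag2:" form.
        tags = value.split(":").map((t) => t.trim()).filter((t) => t.length > 0);
      }
      continue;
    }

    body.push(line);
  }

  return { id, title, tags, body: body.join("\n").trim() };
}
```

A real parser also has to deal with heading-level drawers, todo keywords, and links, which is where most of the edge cases live.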

Chunking. You can’t embed an entire note as a single unit. Long notes lose specificity; the embedding ends up representing some average of all the ideas in the note rather than any particular one. Brainpicker splits notes into overlapping chunks of about 800 characters, with a 100-character overlap between adjacent chunks. The overlap exists because relevant context often sits right at a boundary, and you don’t want to slice a key sentence in half.
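The overlapping split can be sketched in a few lines. This is an illustration of the idea rather than Brainpicker’s implementation: chunkText is a hypothetical name, and the defaults simply mirror the 800 and 100 character figures above.

```typescript
// Split text into fixed-size windows that share `overlap` characters
// with the previous window.
function chunkText(text: string, size = 800, overlap = 100): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new RangeError("expected size > 0 and 0 <= overlap < size");
  }
  const chunks: string[] = [];
  const step = size - overlap; // how far each window advances
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    // Stop once a window has reached the end of the text, so the tail
    // is not emitted a second time as a tiny extra chunk.
    if (start + size >= text.length) break;
  }
  return chunks;
}
```

With these defaults, a 1,500-character note yields two chunks, and the last 100 characters of the first are the first 100 characters of the second. Splitting on raw character offsets can still cut a word or sentence in two; the overlap only makes it likely that the whole sentence appears in at least one chunk.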

Embeddings. Each chunk gets converted into a vector using OpenAI’s text-embedding-3-small model. Vectors are stored in Qdrant, a purpose-built vector database, which can find the chunks most similar to a query vector with low latency even across tens of thousands of chunks.
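“Most similar” here means closest under a distance metric over the vectors. The sketch below uses cosine similarity, a common choice for text embeddings; which metric Brainpicker’s Qdrant collection actually uses is an assumption on my part. Qdrant computes this inside an approximate nearest-neighbor index rather than by brute force, so the function is only meant to show what the score measures.

```typescript
// Cosine similarity: 1 for vectors pointing the same way, 0 for
// orthogonal vectors, -1 for opposite directions.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Two chunks that phrase the same idea differently should end up with vectors pointing in roughly the same direction, which is what makes the score useful when the vocabulary does not match.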

To keep things practical, the indexer tracks file modification times. Re-indexing only processes files that have changed since the last run, which makes it fast enough to run routinely.
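The change detection amounts to comparing two snapshots of modification times. A minimal sketch, assuming the indexer keeps a map from file path to the mtime (in milliseconds) it saw on the previous run; the filesToReindex name and the map-based storage are hypothetical.

```typescript
// Return the paths that are new, or whose mtime is later than the one
// recorded during the previous indexing run.
function filesToReindex(
  previous: Map<string, number>,
  current: Map<string, number>,
): string[] {
  const changed: string[] = [];
  current.forEach((mtime, path) => {
    const lastSeen = previous.get(path);
    if (lastSeen === undefined || mtime > lastSeen) {
      changed.push(path);
    }
  });
  return changed;
}
```

The current map would be filled by reading each file’s modification time from disk at the start of a run. Deleted notes are not covered by this sketch; removing their chunks from the index would need a separate pass.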

Querying. When you ask a question, the same embedding model converts your question into a vector. Qdrant returns the five most similar chunks above a minimum similarity threshold. Those chunks, along with your original question, go to Claude, which synthesizes an answer and cites which notes it drew from.
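The last two steps, picking the context and handing it to the model, might look roughly like this. Everything here is illustrative: the ScoredChunk shape, the 0.3 threshold, and the prompt wording are placeholders, not what Brainpicker ships. In practice the limit and score threshold would likely be passed to Qdrant’s search call so the filtering happens server-side; doing it in plain code just makes the logic visible.

```typescript
// Hypothetical shape of a search hit: the chunk text, the note it came
// from, and the similarity score returned by the vector search.
interface ScoredChunk {
  noteTitle: string;
  text: string;
  score: number;
}

// Keep the best `limit` hits whose score clears `minScore`.
// filter() returns a new array, so the caller's array is not reordered.
function selectContext(hits: ScoredChunk[], limit = 5, minScore = 0.3): ScoredChunk[] {
  return hits
    .filter((hit) => hit.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Assemble a grounded prompt: numbered sources first, question last.
function buildPrompt(question: string, context: ScoredChunk[]): string {
  const sources = context
    .map((chunk, i) => `[${i + 1}] ${chunk.noteTitle}\n${chunk.text}`)
    .join("\n\n");
  return [
    "Answer the question using only the notes below.",
    "Cite the notes you used by their bracketed number.",
    "",
    sources,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Numbering the sources gives the model something concrete to cite, and the UI can map those numbers back to note titles when it lists which notes were used.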

The whole round trip is fast, usually a couple of seconds.

The Interface

I built a small React UI that looks like a chat window. You ask a question, you get an answer, and below the answer you can see which notes were used. There’s also a CLI for those who prefer to stay in the terminal.

The server side is a Hono HTTP server running on Bun. It’s a lightweight stack that turned out to be a good fit: Bun’s startup time is fast enough that the CLI feels responsive, and Hono’s API surface is small enough that the server code stays easy to read.

What It’s Like in Practice

The experience I keep coming back to is asking about decisions I made months ago and having the system surface the reasoning I wrote down at the time, in my own words. There’s something useful about that. Not just finding the conclusion, but the thinking that led to it.

It also surfaces connections I’d forgotten I made. Ask about one topic and you’ll sometimes get a chunk from a note about something adjacent, because the semantic similarity is there even if the vocabulary is different. That’s the part that’s hard to get from keyword search.

The limitations are real though. The quality of the output is a direct function of the quality of the notes. Sparse notes or notes that lean heavily on shorthand produce sparse, unhelpful answers. And chunking is a blunt instrument; very long notes with multiple distinct topics sometimes get retrieved for the wrong one.

What Came Out of Building This

The AI integration was the easy part. The Anthropic and OpenAI SDKs are well-documented, and the RAG pattern itself is well-understood at this point. The time went into parsing and chunking: getting the org file parser to handle edge cases correctly, tuning chunk sizes, deciding what metadata was worth storing.

The other thing worth noting is that Qdrant handles a lot of complexity that would otherwise be painful to build: approximate nearest-neighbor search, filtering by metadata, persistence. Running it locally in Docker kept the whole system self-contained.

The pattern (embed a local corpus, search it semantically, generate answers grounded in it) is general enough that it’s easy to see where else it applies. Team wikis, codebases, customer conversation history. Anywhere you have a body of text that accumulates faster than people can search it.

Try It

Brainpicker is open source at github.com/apisandipas/brainpicker. To run it, you’ll need org-roam notes, Docker for Qdrant, and API keys for OpenAI and Anthropic. The README has setup instructions.

A few things on the roadmap: automatic re-indexing triggered by file saves, chunking that’s aware of org heading structure rather than treating everything as a flat character stream, and support for querying across multiple separate collections.

If you’re sitting on years of notes and finding them harder to use as they grow, it’s worth trying.
