voracle — Designing and Implementing a Semantic Search MCP/CLI Tool for Obsidian Vaults
Design and implementation of voracle, a semantic search MCP/CLI tool for Obsidian vaults. Covers ONNX embedding + ColBERT reranking, multiple vault root support, differential caching, and LLM conversation ingestion (distil).
Conclusion
I designed and implemented voracle, a semantic search MCP/CLI tool for Obsidian vaults. The final tool has the following structure:
- MCP server: stdio JSON-RPC 2.0, exposing search and read tools
- CLI commands: -ingest (index building), -list / -rg (semantic search), -distil (external LLM conversation ingestion)
- Multiple vault roots: specified via --root flags or the VORACLE_PATH environment variable (colon-separated)
- Embedding cache: differential ingest via .embeddings.jsonl. Unchanged notes are restored from cache
- Model placement: ONNX models stored in ~/.local/share/voracle/models/
# Build index (first run embeds all chunks, subsequent runs only diff)
voracle -ingest
# Semantic search
voracle -list "kubernetes networking"
# Start MCP server
voracle --server
# Ingest LLM conversation (from clipboard)
voracle -distil c
Background
voracle, originally named valut-oracle, is a Rust binary that provides semantic search and reranking for Obsidian vaults. It runs ONNX Runtime embedding (pplx-embed-v1-0.6b) and ColBERT reranking (mxbai-edge-colbert) locally and exposes them over MCP (stdio JSON-RPC 2.0).
It sits alongside other Rust MCP tools like pathfinder and shelpa (now filesystem), serving as the dedicated vault search tool in the lineup.
The motivation came from articles, design notes, and domain files piling up in Obsidian beyond what I could keep in my head. I wanted something that could find the right chunks from vague “that thing, you know” queries, and since I was already using a distil command to extract LLM conversations, it made sense to consolidate everything into one tool.
Multiple vault root support
The vault location changed from a single vault_path: PathBuf to roots: Vec<PathBuf>. Multiple vault roots can be specified with repeated --root flags, and VaultResolver recursively scans each root to consolidate notes.
voracle --root /path/to/vault1 --root /path/to/vault2 --server
Roots can also be set via the VORACLE_PATH environment variable, which is split on colons. Resolution priority: --root flags > VORACLE_PATH > error.
export VORACLE_PATH=/Volumes/VALUT/obsidian/docs:/Volumes/VALUT/obsidian/articles
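The resolution priority above can be sketched as one small function. This is a minimal sketch, not voracle's actual code: resolve_roots and its parameters are hypothetical names, and skipping empty segments in the colon-separated value is an assumption.

```rust
use std::path::PathBuf;

/// Resolve vault roots: --root flags win, then VORACLE_PATH, otherwise an error.
/// Both inputs are passed in explicitly (hypothetical signature) so the
/// priority logic can be exercised without touching the real environment.
fn resolve_roots(cli_roots: &[PathBuf], env_value: Option<&str>) -> Result<Vec<PathBuf>, String> {
    // 1. Any --root flag takes priority over the environment variable.
    if !cli_roots.is_empty() {
        return Ok(cli_roots.to_vec());
    }
    // 2. Otherwise expand VORACLE_PATH as colon-separated paths.
    let from_env: Vec<PathBuf> = env_value
        .unwrap_or("")
        .split(':')
        .filter(|part| !part.is_empty()) // assumption: ignore empty segments like "a::b"
        .map(PathBuf::from)
        .collect();
    // 3. Nothing configured at all is an error.
    if from_env.is_empty() {
        Err("no vault root: pass --root or set VORACLE_PATH".to_string())
    } else {
        Ok(from_env)
    }
}
```

In a real binary the second argument would likely come from std::env::var("VORACLE_PATH").ok().as_deref().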
Embedding cache
The first ingest requires embedding all chunks across all notes via the ONNX model, which is heavy for vaults with 500+ notes. To speed up subsequent runs, a differential cache using .embeddings.jsonl was implemented.
The cache is in JSONL format, placed per vault root, with one line per note. Each line contains note_id, content_hash (DefaultHasher), mtime_secs, and the embedding vectors for each chunk. The filename uses a dot prefix to remain invisible in the Obsidian UI.
Lookup is two-stage. If mtime matches, it’s a cache hit. If mtime differs, the file content hash is compared, and only if the hash also differs is re-embedding triggered. This covers cases where only the timestamp changed (e.g., touch or copy) while allowing most cases to be resolved quickly from metadata alone.
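The two-stage lookup can be sketched as follows. The struct, enum, and function names are hypothetical (only the field names content_hash and mtime_secs and the use of DefaultHasher come from the description above), and hashing the content as a string is an assumption.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Cached metadata for one note (hypothetical struct; field names follow the text).
struct CacheEntry {
    content_hash: u64,
    mtime_secs: u64,
}

#[derive(Debug, PartialEq)]
enum Lookup {
    HitByMtime, // metadata alone was enough
    HitByHash,  // timestamp changed, content did not
    Miss,       // content changed: re-embed
}

/// Hash note content with the standard library's DefaultHasher.
fn content_hash(content: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

/// `read_content` is only invoked when the mtime differs, so the common
/// unchanged case never reads the file.
fn lookup(entry: &CacheEntry, mtime_secs: u64, read_content: impl FnOnce() -> String) -> Lookup {
    if entry.mtime_secs == mtime_secs {
        return Lookup::HitByMtime;
    }
    if entry.content_hash == content_hash(&read_content()) {
        Lookup::HitByHash
    } else {
        Lookup::Miss
    }
}
```

One design consideration: the standard library does not guarantee that DefaultHasher produces the same values across Rust releases, so a cache written by one build may miss after a toolchain upgrade. For a cache like this, that only costs a re-embed.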

The -f flag forces a full re-embed, ignoring the cache. A cached_hits field was added to IngestResult to report how many notes were restored from cache.
Distil command integration
The distil shell script functionality from agent-gateway was integrated into voracle. The command ingests external LLM conversation logs (Claude CLI, Codex CLI, Gemini CLI) into the vault. The original script did this by calling agent-gateway’s POST /v1/obsidian/ingest endpoint, but the gateway-side handler had already been removed.
Three input sources are supported: inline as the second argument, piped from stdin, or from the clipboard via pbpaste if neither is provided. Model aliases map c/claude → claude-opus-4-6, x/codex → codex-5.3, g/gemini → gemini-3.1-pro.
voracle -distil c # from clipboard
voracle -distil claude "text" # inline
cat file.md | voracle -distil x # stdin
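The alias mapping and input selection can be sketched like this. The function and enum names are hypothetical, and the precedence of an inline argument over piped stdin is an assumption; the text only says the clipboard is used when neither is provided.

```rust
/// Map a short or long alias to the model name listed above.
fn resolve_model(alias: &str) -> Option<&'static str> {
    match alias {
        "c" | "claude" => Some("claude-opus-4-6"),
        "x" | "codex" => Some("codex-5.3"),
        "g" | "gemini" => Some("gemini-3.1-pro"),
        _ => None,
    }
}

#[derive(Debug, PartialEq)]
enum Source {
    Inline(String),
    Stdin,
    Clipboard, // the text says this is read via pbpaste
}

/// Pick the input source. Assumed order: inline argument, then piped stdin,
/// then clipboard. `stdin_is_piped` is passed in so the function stays pure;
/// a real binary could derive it from std::io::IsTerminal on stdin.
fn pick_source(inline: Option<&str>, stdin_is_piped: bool) -> Source {
    match inline {
        Some(text) => Source::Inline(text.to_string()),
        None if stdin_is_piped => Source::Stdin,
        None => Source::Clipboard,
    }
}
```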
Output goes to _notes/llm-completions/{model_dir}/{correlation_id}.md in the first vault root. Frontmatter includes date, status: raw, correlation_id, and model. If the index is already built, a semantic search on the first 100 words is used to populate related note IDs in the related field.
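A rough sketch of the output step, under stated assumptions: the function names are hypothetical, the YAML layout of the frontmatter is a guess at a conventional Obsidian format, and "first 100 words" is taken to mean whitespace-separated words.

```rust
use std::path::{Path, PathBuf};

/// Build _notes/llm-completions/{model_dir}/{correlation_id}.md under the first root.
fn output_path(first_root: &Path, model_dir: &str, correlation_id: &str) -> PathBuf {
    first_root
        .join("_notes")
        .join("llm-completions")
        .join(model_dir)
        .join(format!("{correlation_id}.md"))
}

/// Take the first `limit` whitespace-separated words to use as the search query.
fn leading_words(content: &str, limit: usize) -> String {
    content.split_whitespace().take(limit).collect::<Vec<_>>().join(" ")
}

/// Render the frontmatter fields named in the text (layout is an assumption).
fn frontmatter(date: &str, correlation_id: &str, model: &str, related: &[String]) -> String {
    let mut out = String::from("---\n");
    out.push_str(&format!("date: {date}\n"));
    out.push_str("status: raw\n");
    out.push_str(&format!("correlation_id: {correlation_id}\n"));
    out.push_str(&format!("model: {model}\n"));
    if related.is_empty() {
        // No index yet, or no matches: emit an empty list.
        out.push_str("related: []\n");
    } else {
        out.push_str("related:\n");
        for id in related {
            out.push_str(&format!("  - {id}\n"));
        }
    }
    out.push_str("---\n");
    out
}
```

The related IDs would come from running leading_words(content, 100) through the existing search path when an index is available.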
MCP tool cleanup
Removing MCP ingest
The MCP server initially exposed three tools: search, read, and ingest. However, ingest was no longer needed over MCP once index building became a CLI-only prerequisite. In fact, the handle_tool_call routing already lacked a dispatch for ingest: only the tool definition remained, which was inconsistent.
The ingest function body was removed from tools.rs and its entry deleted from tool_definitions() in server.rs. The MCP server now exposes only search and read.
Each search result includes a read_tool_call that an LLM can invoke directly. Pages are split at roughly 1000 characters, keeping context consumption under control.
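Paging at a fixed character budget can be sketched as below. This is only an illustration: paginate is a hypothetical name, and the real tool may split at line or chunk boundaries, which would explain the "roughly" in the description.

```rust
/// Split text into pages of at most `page_chars` characters.
/// Counting chars rather than bytes means multi-byte UTF-8 text is never
/// cut in the middle of a character.
fn paginate(text: &str, page_chars: usize) -> Vec<String> {
    assert!(page_chars > 0, "page size must be positive");
    let chars: Vec<char> = text.chars().collect();
    chars
        .chunks(page_chars)
        .map(|chunk| chunk.iter().collect::<String>())
        .collect()
}
```

With a page size of 1000, a 2500-character note yields three pages, so an LLM can request only the page a search result points at instead of the whole note.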
The final CLI interface looks like this:
usage: voracle [--root <path>]... <command> [args]
options:
  --root <path>              vault root directory (repeatable)
                             falls back to VORACLE_PATH (colon-separated)
index:
  -ingest [-f]               build index (required before search)
search (loads .index.bin):
  -list <query> [-k N]       semantic file list, default top 3
  -rg <query> [-k N]         grep-like semantic search with context
  -read <note_id>            print note content to stdout
tools:
  -embed <text>              print embedding vector (TSV)
  -rerank <query> <file>...  rerank files by ColBERT MaxSim score
  -status                    show index status
  -distil <model> [content]  ingest LLM conversation into vault
                             model: c(laude), x(codex), g(emini)
                             reads from: arg, stdin, or clipboard
server:
  --server                   stdio MCP server (JSON-RPC 2.0)
-list and -rg both load .index.bin for semantic search, but -list outputs file paths and scores while -rg displays matched chunks with surrounding context. -read prints full note content to stdout as a CLI-only command, separate from the MCP read tool. -embed and -rerank are low-level inference tools for pipeline debugging and integration with other tools.
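For reference, the MaxSim score that -rerank sorts by is the standard ColBERT late-interaction score: each query token is matched to its most similar document token, and those maxima are summed. A minimal sketch, assuming token embeddings are already computed and L2-normalized (so the dot product equals cosine similarity); the function names are hypothetical and this is not voracle's actual code.

```rust
/// Dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// ColBERT-style MaxSim: for each query token, take the similarity of its
/// best-matching document token, then sum those maxima over the query.
fn maxsim(query: &[Vec<f32>], doc: &[Vec<f32>]) -> f32 {
    if doc.is_empty() {
        return 0.0; // nothing to match against
    }
    query
        .iter()
        .map(|q| {
            doc.iter()
                .map(|d| dot(q, d))
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .sum()
}
```

Because the score is a sum over query tokens, it grows with query length; it is meaningful for ranking documents against the same query, not for comparing scores across different queries.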
Renaming to voracle
The command name valut-oracle was too long and had a hyphen in the middle, which made it cumbersome for daily use. Everything was shortened to voracle: package name, binary name, and environment variables.
- Cargo.toml: package name changed to voracle
- src/cli/mod.rs: environment variable changed from VALUT_ORACLE_PATH to VORACLE_PATH, all usage messages updated
- src/config.rs: model directory changed to ~/.local/share/voracle/models/
- src/main.rs: tracing filter changed to voracle=info
- src/mcp/server.rs: ServerInfo.name changed to voracle
- All documentation (README.md, 5 files in docs/) unified to the new name
The model directory was standardized to ~/.local/share/voracle/models/, respecting XDG_DATA_HOME if set. pathfinder can embed its ~32 MB models in the binary via include_bytes!, but at 700 MB that is impractical for voracle, so external placement was the only option. The model files were also removed from the repository’s entire history via git filter-repo.
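The XDG_DATA_HOME fallback can be sketched as follows. The function name and signature are hypothetical; treating an empty XDG_DATA_HOME the same as an unset one follows the XDG Base Directory convention and is an assumption about voracle's behavior.

```rust
use std::path::PathBuf;

/// Resolve the model directory: $XDG_DATA_HOME/voracle/models if the variable
/// is set and non-empty, otherwise $HOME/.local/share/voracle/models.
/// Values are passed in rather than read from the environment for testability.
fn model_dir(xdg_data_home: Option<&str>, home: &str) -> PathBuf {
    let base = match xdg_data_home {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(home).join(".local").join("share"),
    };
    base.join("voracle").join("models")
}
```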
