voracle — Designing and Implementing a Semantic Search MCP/CLI Tool for Obsidian Vaults

Design and implementation of voracle, a semantic search MCP/CLI tool for Obsidian vaults. Covers ONNX embedding + ColBERT reranking, multiple vault root support, differential caching, and LLM conversation ingestion (distil).

These articles use AI-generated summaries of Obsidian notes originally kept as technical memos.

English translations are produced with AI assistance.

Conclusion

I designed and implemented voracle, a semantic search MCP/CLI tool for Obsidian vaults. The final tool has the following structure:

MCP server: stdio JSON-RPC 2.0, exposing search and read tools
CLI commands: -ingest (index building), -list / -rg (semantic search), -distil (external LLM conversation ingestion)
Multiple vault roots: specified via --root flags or VORACLE_PATH environment variable (colon-separated)
Embedding cache: differential ingest via .embeddings.jsonl. Unchanged notes are restored from cache
Model placement: ONNX models stored in ~/.local/share/voracle/models/

# Build index (first run embeds all chunks, subsequent runs only diff)
voracle -ingest

# Semantic search
voracle -list "kubernetes networking"

# Start MCP server
voracle --server

# Ingest LLM conversation (from clipboard)
voracle -distil c

Background

valut-oracle (later renamed voracle) is a Rust binary that provides semantic search and reranking for Obsidian vaults. It runs ONNX Runtime embedding (pplx-embed-v1-0.6b) and ColBERT reranking (mxbai-edge-colbert) locally and exposes them over MCP (stdio JSON-RPC 2.0).

It sits alongside other Rust MCP tools like pathfinder and shelpa (now filesystem), serving as the dedicated vault search tool in the lineup.

The motivation came from articles, design notes, and domain files piling up in Obsidian beyond what I could keep in my head. I wanted something that could find the right chunks from vague “that thing, you know” queries, and since I was already using a distil command to extract LLM conversations, it made sense to consolidate everything into one tool.

Multiple vault root support

The vault location changed from a single vault_path: PathBuf to roots: Vec<PathBuf>. Multiple vault roots can be specified with repeated --root flags, and VaultResolver recursively scans each root to consolidate the notes into one set.

  voracle --root /path/to/vault1 --root /path/to/vault2 --server

Roots are also configurable via the VORACLE_PATH environment variable, which is expanded as colon-separated paths. Resolution priority: --root flags > VORACLE_PATH > error.

  export VORACLE_PATH=/Volumes/VALUT/obsidian/docs:/Volumes/VALUT/obsidian/articles
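The resolution order and the recursive scan described above can be sketched roughly as follows. This is a minimal illustration and not voracle's actual code: resolve_roots, collect_notes, and scan_roots are hypothetical names, and the choices to index only .md files and to skip dot-prefixed entries are assumptions.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: --root flags win, then a VORACLE_PATH-style
/// colon-separated value, otherwise an error.
fn resolve_roots(flag_roots: &[PathBuf], env_value: Option<&str>) -> Result<Vec<PathBuf>, String> {
    if !flag_roots.is_empty() {
        return Ok(flag_roots.to_vec());
    }
    let roots: Vec<PathBuf> = env_value
        .unwrap_or("")
        .split(':')
        .filter(|s| !s.is_empty()) // ignore empty segments such as "a::b"
        .map(PathBuf::from)
        .collect();
    if roots.is_empty() {
        Err("no vault root: pass --root or set VORACLE_PATH".to_string())
    } else {
        Ok(roots)
    }
}

/// Recursively collect Markdown files under `dir`.
/// Assumption: dot-prefixed files and directories (e.g. .obsidian) are skipped.
fn collect_notes(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let hidden = path
            .file_name()
            .and_then(|n| n.to_str())
            .map_or(false, |n| n.starts_with('.'));
        if hidden {
            continue;
        }
        if path.is_dir() {
            collect_notes(&path, out)?;
        } else if path.extension().and_then(|e| e.to_str()) == Some("md") {
            out.push(path);
        }
    }
    Ok(())
}

/// Scan every root and return one sorted list of note paths.
fn scan_roots(roots: &[PathBuf]) -> io::Result<Vec<PathBuf>> {
    let mut notes = Vec::new();
    for root in roots {
        collect_notes(root, &mut notes)?;
    }
    notes.sort();
    Ok(notes)
}
```

A caller would typically pass std::env::var("VORACLE_PATH").ok().as_deref() as env_value, so that flags given on the command line always override the environment.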

Embedding cache

The first ingest requires embedding all chunks across all notes via the ONNX model, which is heavy for vaults with 500+ notes. To speed up subsequent runs, a differential cache using .embeddings.jsonl was implemented.

The cache is in JSONL format, placed per vault root, with one line per note. Each line contains note_id, content_hash (DefaultHasher), mtime_secs, and the embedding vectors for each chunk. The filename uses a dot prefix to remain invisible in the Obsidian UI.

Lookup is two-stage. If mtime matches, it’s a cache hit. If mtime differs, the file content hash is compared, and only if the hash also differs is re-embedding triggered. This covers cases where only the timestamp changed (e.g., touch or copy) while allowing most cases to be resolved quickly from metadata alone.
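The two-stage lookup can be sketched as below. The field names follow the cache description above, but CacheEntry, Lookup, and lookup are hypothetical names, and passing the file read as a closure is an assumption made here so that the file is only read when mtime differs.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical in-memory view of one cache line.
struct CacheEntry {
    mtime_secs: u64,
    content_hash: u64,
    embeddings: Vec<Vec<f32>>, // one vector per chunk
}

enum Lookup<'a> {
    /// Cached chunk embeddings can be reused.
    Hit(&'a [Vec<f32>]),
    /// The note is new or changed and must be re-embedded.
    Miss,
}

/// Hash file content with the standard library's DefaultHasher.
fn content_hash(content: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

/// Stage 1 compares mtime only. Stage 2 reads and hashes the file,
/// and runs only when the mtime differs.
fn lookup<'a>(
    entry: Option<&'a CacheEntry>,
    mtime_secs: u64,
    read_content: impl FnOnce() -> String,
) -> Lookup<'a> {
    let Some(entry) = entry else {
        return Lookup::Miss; // note not in cache
    };
    if entry.mtime_secs == mtime_secs {
        return Lookup::Hit(&entry.embeddings);
    }
    if entry.content_hash == content_hash(&read_content()) {
        // Only the timestamp changed (touch, copy): still a hit.
        return Lookup::Hit(&entry.embeddings);
    }
    Lookup::Miss
}
```

One design point worth noting: the standard library does not guarantee that DefaultHasher's algorithm stays the same across Rust releases, so a cache keyed this way may be invalidated by a toolchain upgrade. The cost is one full re-embed, not wrong results.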

[Screenshot: voracle -ingest terminal output, embedding notes one by one]

The -f flag forces a full re-embed, ignoring the cache. A cached_hits field was added to IngestResult to report how many notes were restored from cache.

Distil command integration

The functionality of the distil shell script from agent-gateway was integrated into voracle. The command ingests external LLM conversation logs (Claude CLI, Codex CLI, Gemini CLI) into the vault. The original script did this by hitting agent-gateway’s POST /v1/obsidian/ingest endpoint, but the gateway-side handler had already been removed.

Three input sources are supported: inline as the second argument, piped from stdin, or from the clipboard via pbpaste if neither is provided. Model aliases map c/claude → claude-opus-4-6, x/codex → codex-5.3, g/gemini → gemini-3.1-pro.

voracle -distil c              # from clipboard
voracle -distil claude "text"  # inline
cat file.md | voracle -distil x  # stdin
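A rough sketch of the alias mapping and the input-source fallback follows. The alias table is taken from the text above. resolve_model and read_content are hypothetical names, and detecting piped stdin with IsTerminal is an assumption about how the check could be done, not a statement about voracle's code.

```rust
use std::io::{self, IsTerminal, Read};
use std::process::Command;

/// Map a short or long alias to the model name; None for unknown aliases.
fn resolve_model(alias: &str) -> Option<&'static str> {
    match alias {
        "c" | "claude" => Some("claude-opus-4-6"),
        "x" | "codex" => Some("codex-5.3"),
        "g" | "gemini" => Some("gemini-3.1-pro"),
        _ => None,
    }
}

/// Pick the conversation text: inline argument first, then piped stdin,
/// then the clipboard.
fn read_content(inline: Option<String>) -> io::Result<String> {
    if let Some(text) = inline {
        return Ok(text);
    }
    let stdin = io::stdin();
    if !stdin.is_terminal() {
        // stdin is a pipe or a file, so read it to the end
        let mut buf = String::new();
        stdin.lock().read_to_string(&mut buf)?;
        return Ok(buf);
    }
    // pbpaste exists on macOS only; other platforms need a different command
    let out = Command::new("pbpaste").output()?;
    Ok(String::from_utf8_lossy(&out.stdout).into_owned())
}
```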

Output goes to _notes/llm-completions/{model_dir}/{correlation_id}.md in the first vault root. Frontmatter includes date, status: raw, correlation_id, and model. If the index is already built, a semantic search on the first 100 words is used to populate related note IDs in the related field.
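The output location and frontmatter could be assembled along these lines. This is a sketch under assumptions: the function names are hypothetical, the exact YAML layout (in particular how related is serialized) is a guess, and how model_dir is derived from the model is not stated in the text, so it is passed in as-is.

```rust
use std::path::{Path, PathBuf};

/// Build _notes/llm-completions/{model_dir}/{correlation_id}.md under the first root.
fn distil_path(first_root: &Path, model_dir: &str, correlation_id: &str) -> PathBuf {
    first_root
        .join("_notes")
        .join("llm-completions")
        .join(model_dir)
        .join(format!("{correlation_id}.md"))
}

/// Render frontmatter with the fields named in the text.
/// Assumption: `related` is written as a YAML list and omitted when empty.
fn frontmatter(date: &str, correlation_id: &str, model: &str, related: &[String]) -> String {
    let mut out = String::from("---\n");
    out.push_str(&format!("date: {date}\n"));
    out.push_str("status: raw\n");
    out.push_str(&format!("correlation_id: {correlation_id}\n"));
    out.push_str(&format!("model: {model}\n"));
    if !related.is_empty() {
        out.push_str("related:\n");
        for id in related {
            out.push_str(&format!("  - {id}\n"));
        }
    }
    out.push_str("---\n");
    out
}

/// First `n` whitespace-separated words, used as the semantic search query.
fn first_words(text: &str, n: usize) -> String {
    text.split_whitespace().take(n).collect::<Vec<_>>().join(" ")
}
```

With n set to 100, first_words yields the query that is run against the existing index to fill the related field.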

MCP tool cleanup

Removing MCP ingest

The MCP server initially exposed three tools: search, read, and ingest. However, ingest was no longer needed via MCP since it became a CLI-only prerequisite. In fact, the handle_tool_call routing already lacked a dispatch for ingest—only the tool definition remained, creating an inconsistency.

The ingest function body was removed from tools.rs and its entry deleted from tool_definitions() in server.rs. The MCP server now exposes only search and read.

Each search result includes a read_tool_call that an LLM can invoke directly. Pages are split at roughly 1000 characters, keeping context consumption under control.
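Splitting at "roughly" 1000 characters can be done in several ways. The sketch below assumes pages break on line boundaries so that no line is cut in half. paginate is a hypothetical name, and the real splitting rule in voracle may differ.

```rust
/// Split `text` into pages of roughly `target_chars` characters.
/// A page is closed before the line that would push it over the target,
/// so a single very long line becomes its own oversized page.
fn paginate(text: &str, target_chars: usize) -> Vec<String> {
    let mut pages = Vec::new();
    let mut current = String::new();
    let mut count = 0usize; // characters (not bytes) in the current page

    for line in text.split_inclusive('\n') {
        let n = line.chars().count();
        if count > 0 && count + n > target_chars {
            pages.push(std::mem::take(&mut current));
            count = 0;
        }
        current.push_str(line);
        count += n;
    }
    if !current.is_empty() {
        pages.push(current);
    }
    pages
}
```

Concatenating the pages reproduces the original text, so a client that reads page after page sees the whole note without overlap.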

The final CLI interface looks like this:

usage: voracle [--root <path>]... <command> [args]

options:
  --root <path>                   vault root directory (repeatable)
                                  falls back to VORACLE_PATH (colon-separated)

index:
  -ingest [-f]                    build index (required before search)

search (loads .index.bin):
  -list  <query> [-k N]           semantic file list, default top 3
  -rg    <query> [-k N]           grep-like semantic search with context
  -read  <note_id>                print note content to stdout

tools:
  -embed <text>                   print embedding vector (TSV)
  -rerank <query> <file>...       rerank files by ColBERT MaxSim score
  -status                         show index status
  -distil <model> [content]       ingest LLM conversation into vault
                                  model: c(laude), x(codex), g(emini)
                                  reads from: arg, stdin, or clipboard

server:
  --server                        stdio MCP server (JSON-RPC 2.0)

-list and -rg both load .index.bin for semantic search, but -list outputs file paths and scores while -rg displays matched chunks with surrounding context. -read prints full note content to stdout as a CLI-only command, separate from the MCP read tool. -embed and -rerank are low-level inference tools for pipeline debugging and integration with other tools.
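For reference, the ColBERT MaxSim score that -rerank reports is a late-interaction score: for each query token embedding, take the maximum similarity against all document token embeddings, then sum those maxima. The sketch below assumes token embeddings are already L2-normalized, so the dot product acts as cosine similarity. The function names are illustrative only.

```rust
/// Dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// MaxSim: sum over query tokens of the best-matching document token score.
/// `query` and `doc` hold one embedding per token.
fn maxsim(query: &[Vec<f32>], doc: &[Vec<f32>]) -> f32 {
    query
        .iter()
        .map(|q| {
            doc.iter()
                .map(|d| dot(q, d))
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .filter(|s| s.is_finite()) // an empty document contributes 0
        .sum()
}
```

Because every query token is matched independently, a document scores well when each part of the query finds a close token somewhere in it, which is what makes this useful as a second-stage reranker on top of single-vector embedding search.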

Renaming to voracle

The command name valut-oracle was too long with a hyphen in the middle, making it cumbersome for daily use. Everything was shortened to voracle—package name, binary name, and environment variables.

Cargo.toml: package name changed to voracle
src/cli/mod.rs: environment variable changed from VALUT_ORACLE_PATH to VORACLE_PATH, all usage messages updated
src/config.rs: model directory changed to ~/.local/share/voracle/models/
src/main.rs: tracing filter changed to voracle=info
src/mcp/server.rs: ServerInfo.name changed to voracle
All documentation (README.md, 5 files in docs/) unified to the new name

The model directory was standardized to ~/.local/share/voracle/models/, respecting XDG_DATA_HOME if set. In pathfinder, models of about 32 MB can be embedded in the binary via include_bytes!, but voracle’s models total about 700 MB, which makes embedding impractical, so external placement was the only option. The model files were also removed from the repository’s entire history via git filter-repo.
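The directory resolution can be sketched as follows, following the XDG Base Directory convention that an unset or empty XDG_DATA_HOME falls back to $HOME/.local/share. model_dir is a hypothetical name, and the environment values are passed in as parameters here to keep the function easy to test.

```rust
use std::path::PathBuf;

/// Resolve the model directory: $XDG_DATA_HOME/voracle/models if the variable
/// is set and non-empty, otherwise $HOME/.local/share/voracle/models.
fn model_dir(xdg_data_home: Option<&str>, home: &str) -> PathBuf {
    let base = match xdg_data_home {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(home).join(".local").join("share"),
    };
    base.join("voracle").join("models")
}
```

In a real binary the two arguments would come from std::env::var("XDG_DATA_HOME") and std::env::var("HOME").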
