Problem Statement

When using LLMs in local development environments for tool invocations (file read, write, list, search), path-related errors occur frequently—particularly:

  • LLM typos: internal/config/ mistyped as internal/config.go → ENOENT
  • Ambiguous filenames: 8 client.go files exist; LLM cannot determine which one is intended
  • Path format mismatches: mixing relative and absolute paths, failing to meet tool expectations

These errors interrupt development workflow, requiring engineers to manually correct and resubmit commands. With local LLMs in particular, repeated tool retry failures cause wasteful context consumption — error messages and retry loops accumulate, eating into the limited context window. This is not a question of model quality, nor is it project-specific. If the tool layer can absorb path resolution as a common concern, it directly contributes to context compression. pathfinder was built on this premise.

Objective

Build a system that automatically resolves paths and retries on error, without relying on LLM path generation. This achieves:

  • Reduced manual intervention frequency
  • Higher tool call success rates
  • Stable operation independent of LLM “accuracy”

Approach

1. Limit LLM Role to Selection

Instead of asking the LLM to generate paths, restrict it to making final decisions. Process flow:

  Failed Tool Call → Path Resolve Tool → Ranked Candidates (Top-K) 
                                            ↓
                                  (LLM Selection OR Auto Retry)
  

LLM provides only intent_text—a brief explanation of intent (e.g., “modifying reranker client settings”). This is sufficient; path generation is handled deterministically.

2. Candidate Generation and Re-ranking

  • Candidate generation (fast): Given failed path + intent_text, extract filesystem candidates via fuzzy matching (~200 candidates in <50ms)
  • Re-ranking (precise): Order candidates using ColBERT maxsim, capturing semantic alignment between path and intent rather than lexical similarity alone

Example: intent_text “reranker config modification” paired with path internal/infra/*/client.go ranks internal/infra/reranker/client.go higher than internal/infra/vllm/client.go through semantic understanding.

3. Automatic Retry with Verified Existence

Every returned candidate is verified to exist via stat() before inclusion. No hallucinated paths. The MCP tool automatically iterates through Top-K candidates, eliminating manual trial-and-error.
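As a sketch (not the actual pathfinder code), the existence guarantee can be as simple as filtering every candidate list through a stat()-backed check before it is returned; `verify_existing` is a hypothetical helper name:

```rust
use std::path::Path;

/// Hypothetical helper: drop any candidate that does not exist on disk.
/// `Path::exists` performs a stat()-style metadata lookup, so no path
/// that is absent from the filesystem can leak into the Top-K list.
fn verify_existing(candidates: Vec<String>) -> Vec<String> {
    candidates
        .into_iter()
        .filter(|p| Path::new(p).exists())
        .collect()
}
```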

Implementation Architecture

Rust + ONNX for High-Speed Resolution

  • Language: Rust (low latency, memory safety)
  • Re-ranking Model: ColBERT (answerai-colbert-small, INT8 quantized)
  • Inference Engine: ONNX Runtime (CPU-only, 9ms average with INT8)
  • Scoring: 12 lexical features + 2 history-correlation features + 1 neural re-ranking score

Implemented as an MCP (Model Context Protocol) server, deployed in Claude Code / Zed environments.

Key Components

A. Filesystem Index

At startup, index all files/directories under root:

  • Absolute path, relative path, basename, extension
  • Parent-directory tokens (“internal/config” → [“internal”, “config”])
  • Optional: metadata from package.json, Cargo.toml

This index enables rapid candidate generation.
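A minimal sketch of one index entry (field names and the `index_entry` constructor are illustrative; the real `IndexEntry` layout is not shown in this document):

```rust
use std::path::Path;

/// Illustrative index entry: one record per file or directory under root.
#[derive(Debug, Clone)]
struct IndexEntry {
    abs_path: String,
    rel_path: String,
    basename: String,
    extension: String,
    dir_tokens: Vec<String>, // "internal/config" -> ["internal", "config"]
}

/// Build an entry from a root directory and a root-relative path.
fn index_entry(root: &str, rel: &str) -> IndexEntry {
    let p = Path::new(rel);
    let basename = p
        .file_name()
        .map(|s| s.to_string_lossy().into_owned())
        .unwrap_or_default();
    let extension = p
        .extension()
        .map(|s| s.to_string_lossy().into_owned())
        .unwrap_or_default();
    let dir_tokens: Vec<String> = p
        .parent()
        .map(|d| {
            d.components()
                .map(|c| c.as_os_str().to_string_lossy().into_owned())
                .collect()
        })
        .unwrap_or_default();
    IndexEntry {
        abs_path: format!("{}/{}", root.trim_end_matches('/'), rel),
        rel_path: rel.to_string(),
        basename,
        extension,
        dir_tokens,
    }
}
```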

B. Candidate Generation Phase

Given failed path + intent_text, generate candidates via:

  • Basename matching: “config.go” finds all config.go files
  • Fuzzy matching: typo tolerance (“clinet.go” → “client.go”)
  • Token-based filtering: intent_text keywords (“reranker”) narrow parent directories

Output: ~200 candidates extracted in <50ms.
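The fuzzy step can be sketched with plain Levenshtein distance over basenames; the actual matcher inside pathfinder is not specified here and may score differently:

```rust
/// Levenshtein edit distance, used as a stand-in for pathfinder's fuzzy matcher.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Return indexed paths whose basename is within `max_dist` edits of the query,
/// so "clinet.go" still finds every "client.go" in the tree.
fn fuzzy_candidates<'a>(
    query: &str,
    index: &[(&str, &'a str)], // (basename, full path)
    max_dist: usize,
) -> Vec<&'a str> {
    index
        .iter()
        .filter(|(base, _)| edit_distance(query, *base) <= max_dist)
        .map(|(_, path)| *path)
        .collect()
}
```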

C. Re-ranking with ColBERT maxsim

Re-rank 200 candidates by semantic relevance:

  intent_text = "reranker client settings modification"
  candidates = [
      "internal/infra/reranker/client.go",     ← correct (high score)
      "internal/infra/openaihttp/client.go",   ← related but not reranker
      "internal/infra/vllm/client.go",         ← similar, lower score
      ...
  ]

ColBERT’s token-level max-similarity, backed by embeddings learned from large-scale pretraining, captures semantic path-intent alignment beyond string distance, typically placing the correct path at top-1 to top-3.

D. Retry Logic

MCP tool_retry_with_resolve endpoint:

  • Accepts original tool call (read, list, stat) + failure details
  • Retrieves Top-N candidates via path_resolve
  • Iterates candidates automatically (strategy: best_first or score_desc)
  • Returns original tool result on first success
  • Returns status=“all_failed” if exhausted
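The iteration above can be sketched as a loop; the real endpoint also carries the operation type and failure details, while this shows only the best_first traversal and the all_failed terminal state:

```rust
/// Sketch of the retry loop: try ranked candidates in order and return the
/// first successful result. `op` stands in for re-executing the original
/// tool call (read, list, stat) against one candidate path.
fn retry_with_candidates<F>(candidates: &[&str], mut op: F) -> Result<String, String>
where
    F: FnMut(&str) -> Result<String, String>,
{
    for &path in candidates {
        if let Ok(out) = op(path) {
            return Ok(out); // first success wins; no manual trial-and-error
        }
    }
    Err("all_failed".to_string()) // status="all_failed" once Top-K is exhausted
}
```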

Verification and Refinement

Benchmark Scenarios

Test suite covering:

  1. Single filename ambiguity (8 client.go files exist)
  2. Directory affinity (infer correct file from recent editing context)
  3. Typo tolerance (“clinet.go” → “client.go”)
  4. Path prefix omission (“infra/reranker/client.go” → “internal/infra/reranker/client.go”)

Example:

  Query: "client.go"
  Initial: 8 candidates all score 120.0 (alphabetically ordered → "llamacpp/client.go" selected)
  With history: "recent edits: reranker" context → "reranker/client.go" ranks first

Accuracy Improvements

The core challenge was selecting the correct one from 8 identical-score candidates.

Improvements:

  • History correlation: inject recently-edited directory into context
  • Exact path matching: benchmark changed from endswith() to full path equality
  • Query augmentation: intent_text explicitly includes directory hints

Result: Micro-scoring from history correlation breaks ties among identical-score candidates, correctly identifying intended path.
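The tie-break can be sketched as a micro-bonus applied before sorting; the 0.5 bonus and the substring match are illustrative, not the production values:

```rust
/// Illustrative history-correlation tie-break: candidates whose path
/// mentions a recently-edited directory get a small bonus, enough to
/// separate otherwise identical lexical scores without reordering
/// clear winners.
fn rank_with_history(scored: &mut Vec<(String, f32)>, recent_dirs: &[&str]) {
    for (path, score) in scored.iter_mut() {
        if recent_dirs.iter().any(|d| path.contains(*d)) {
            *score += 0.5; // micro-score: only matters when base scores tie
        }
    }
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
}
```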

Technical Implementation Details

Rust Project Structure

  pathfinder/
  ├── src/
  │   ├── bin/resolve_inference.rs     # MCP server core
  │   ├── inference.rs                 # ONNX inference (ColBERT)
  │   └── main.rs
  ├── Cargo.toml                       # ort, tokenizers, axum deps
  └── sampling/                        # test Go/Rust codebases

resolve_inference.rs Core Functions

  1. Service struct: manages file index, embedding cache, Embedder
  2. Index generation: filesystem scan at startup, creates IndexEntry for each file
  3. Candidate generation: fuzzy match + token-based filtering
  4. Re-ranking: ColBERT maxsim scoring
  5. Existence verification: stat() before returning paths

inference.rs

  pub async fn encode_text_to_token_vectors(
      text: &str,
      embedder: &Embedder,
  ) -> Result<Vec<Vec<f32>>, String>;

  pub fn maxsim(
      query_vectors: &[Vec<f32>],
      candidate_vectors: &[Vec<f32>],
  ) -> f32;

Token-level max-similarity: for each query token, compute max cosine similarity against all candidate tokens, sum results.
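That definition translates directly into code. The sketch below assumes both sides are L2-normalized, so the dot product equals cosine similarity:

```rust
/// MaxSim over token vectors: for each query token, take the maximum
/// similarity against all candidate tokens, then sum those maxima.
/// Assumes vectors are L2-normalized (dot product == cosine similarity).
fn maxsim(query_vectors: &[Vec<f32>], candidate_vectors: &[Vec<f32>]) -> f32 {
    query_vectors
        .iter()
        .map(|q| {
            candidate_vectors
                .iter()
                .map(|c| q.iter().zip(c).map(|(a, b)| a * b).sum::<f32>())
                .fold(f32::MIN, f32::max)
        })
        .sum()
}
```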

Performance Profile (Measured)

  • Index generation: ~1-2s (3,261 files / 628 directories)
  • Path resolution (INT8): 9.07ms average (including re-ranking)
  • Accuracy: Full monorepo benchmark 87.3% (48 correct out of 55 cases). On the ambiguous bare filename subset only, 85.0% with history correlation (17 out of 20 cases)
  • Retry success: 1-2 attempts average (90%+ success within Top-5)

Production Considerations

Embedding Cache Design

Cache embedding results to accelerate repeated queries:

  emb_cache: HashMap<String, Vec<Vec<f32>>>
  

Challenge: invalidate cache on filesystem changes. Solution: MCP notification mechanism to refresh on root updates.
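A minimal sketch of that design, assuming the simplest invalidation policy (clear everything on a root update) rather than per-file staleness tracking; method names are illustrative:

```rust
use std::collections::HashMap;

/// Illustrative embedding cache keyed by input text.
struct EmbCache {
    emb_cache: HashMap<String, Vec<Vec<f32>>>,
}

impl EmbCache {
    fn new() -> Self {
        Self { emb_cache: HashMap::new() }
    }

    /// Return cached token vectors for `text`, computing them once on a miss.
    fn get_or_insert_with<F>(&mut self, text: &str, compute: F) -> &Vec<Vec<f32>>
    where
        F: FnOnce() -> Vec<Vec<f32>>,
    {
        self.emb_cache.entry(text.to_string()).or_insert_with(compute)
    }

    /// Triggered by the root-update notification in the real server:
    /// drop everything rather than track which entries went stale.
    fn invalidate(&mut self) {
        self.emb_cache.clear();
    }
}
```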

History Correlation Usage

Inject recent editing context to boost precision:

  tool_retry_with_resolve(
      failed_path: "client.go",
      op: "read",
      intent_text: "modifying reranker client configuration",
      root_hint: "internal/infra/reranker"  # recent edits
  )

This context, combined with semantic ranking, disambiguates among multiple same-score candidates.

Structured Error Responses

Return actionable error messages:

  {
    "error_kind": "Ambiguous",
    "path": "client.go",
    "candidates": [
      { "path": "/Users/.../internal/infra/reranker/client.go", "score": 95.2 },
      { "path": "/Users/.../internal/infra/vllm/client.go", "score": 94.8 },
      ...
    ],
    "next_question": "Which client.go: reranker or vllm?"
  }

When multiple candidates tie, the LLM can resolve the ambiguity via a follow-up question rather than guessing.

Post-Implementation Verification and Improvement

Based on this initial design, 7 improvement iterations were conducted. Details recorded in pathfinder Optimization Journey.

Key improvements:

  • Model selection: Evaluated 3 ColBERT models, all produced identical accuracy (87.3%). Adopted answerai-colbert-small INT8 (9ms)
  • History correlation: 5-entry ring buffer improved ambiguous bare filename resolution from 35% → 85% (+50pp)
  • Filesystem monitoring: Real-time index updates via notify crate
  • Orphan process detection: Automatic shutdown when parent MCP client disappears

Remaining Challenges

  • Wrong extension (e.g., matcher.py → matcher.rs): accuracy at 57%
  • Ambiguous bare filenames: 75% without history context
  • Edit-distance scoring for improved typo correction
  • Tool execution proxy safety: tool_retry_with_resolve automatically re-executes tools after path resolution. While convenient, if the resolved path points to an unintended file, write operations could have undesired effects — further safety considerations are needed
  • Pluggable model entry point: Currently fixed to answerai-colbert-small. A pluggable interface for swapping in alternative re-ranking models would allow tuning per use case and project scale

CLI Help Output

  ~/Development/loftllc-web % pathfinder-mcp -h
  pathfinder-mcp — deterministic path resolution MCP server

  USAGE
      pathfinder [OPTIONS]

  OPTIONS
      --root <PATH>     Add a project root directory to watch and index.
                        May be specified multiple times.  Defaults to $PWD.
      -h, --help        Print this help message and exit.

  DESCRIPTION
      An MCP (Model Context Protocol) server that resolves ENOENT / NotFound
      path errors for AI coding agents.  It builds an in-memory path index of
      configured root directories and uses fuzzy matching combined with ColBERT
      MaxSim re-ranking (when an ONNX model is available) to resolve incorrect
      paths to their most likely existing counterparts.

      Communication is via JSON-RPC over stdin/stdout.  Evaluation metrics are
      written to stderr as JSON lines (redirect with 2>metrics.jsonl).

      The server runs a stdin reader thread with periodic orphan detection and
      exits automatically when the parent MCP client process disappears.

  ENVIRONMENT VARIABLES
      RESOLVE_MODEL_PRECISION  Model precision: "int8" (default), "fp16", or "fp32".
      RESOLVE_MODEL_DIR        Model directory containing model_*.onnx + tokenizer.json.
      RESOLVE_TOPK             Minimum topk value (default 10).
      INCLUDE_DIRS             Comma-separated directory names to force-include.

  MCP TOOLS
      path_resolve             Resolve a failed file path to the best match.
      tool_retry_with_resolve  Resolve and automatically retry the operation.
      roots_list               Return configured root directories.
      reindex_paths            Force a full rebuild of the path index.

  MCP CLIENT CONFIGURATION (Claude Code)
      "pathfinder-mcp": {
        "command": "pathfinder",
        "args": ["--root", "${workspaceFolder}"]
      }

Conclusion

pathfinder compensates for the LLM's weakness at path generation through filesystem indexing + semantic re-ranking + automatic retry. The result is largely automated error recovery, freeing developers from manual path correction and allowing focus on higher-level directives.

The Rust + ONNX implementation achieves both low latency and high precision, making production deployment viable. For the detailed optimization process, see pathfinder Optimization Journey.