grepai for local semantic search
yoanbernabeu/grepai indexes your Git repository with a local embedding model, analyzes the call tree and defined symbols, and provides a CLI and an MCP interface for searching. The idea is very cool.
I tested it in a CPU-only virtualized system [1] with the English-only nomic-embed-text model: indexing a repo of ~6k LoC took ~30 min, and each search takes ~0.1 s, all through the local model. Pretty fair.
Analysis
Below is a brief overview of my analysis.
grepai search
The search results aren't really useful in the CLI because:
- The default text output mode has no syntax highlighting and is not clean (it prints line numbers, its own header format and the file name).
- There is no equivalent of grep's --after-context and --before-context options.
- The JSON output is not really readable.
- It is not really integrated in the editor (I use ripgrep via projectile.el, wrapped by Doom Emacs, so a search is one keystroke away: SPC /).
$ grepai search 'api caller' | head -n21
Found 10 results for: "api caller"
─── Result 1 (score: 0.5165) ───
File: src/apiClient.ts:1-36
1 │ import type { Event } from "./event.types";
2 │
3 │ /**
4 │ * The base URL of the API to which events are sent.
5 │ */
6 │ const API_URL = process.env.API_URL || "";
7 │
8 │ /**
9 │ * The API token used for authentication.
10 │ */
11 │ const API_TOKEN = process.env.API_TOKEN || "";
12 │
13 │ /**
14 │ * Sends an Event object to a remote API.
15 │ *
│ ... (21 more lines)
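One possible workaround for the noisy text output is a small filter that keeps only the location and score of each hit, in a grep-like `file:line:` form that editors tend to recognize. The following is a minimal sketch, assuming the result header and `File:` line look exactly like the sample above (the format may differ between grepai versions). The names `parse_results` and `format_grep_style` are hypothetical helpers, not part of grepai.

```python
import re

# Patterns copied from the sample output above (an assumption:
# other grepai versions may print a different format).
RESULT_RE = re.compile(r"^─── Result (\d+) \(score: ([0-9.]+)\) ───$")
FILE_RE = re.compile(r"^File: (.+):(\d+)-(\d+)$")

# Shortened copy of the sample output, used only for demonstration.
SAMPLE = """Found 10 results for: "api caller"
─── Result 1 (score: 0.5165) ───
File: src/apiClient.ts:1-36
1 │ import type { Event } from "./event.types";
2 │
"""


def parse_results(text):
    """Return a list of (rank, score, path, start, end) tuples."""
    results = []
    pending = None  # (rank, score) waiting for its "File:" line
    for raw in text.splitlines():
        line = raw.strip()
        header = RESULT_RE.match(line)
        if header:
            pending = (int(header.group(1)), float(header.group(2)))
            continue
        file_line = FILE_RE.match(line)
        # Only accept a "File:" line directly tied to a result header,
        # so snippet lines can never be mistaken for one.
        if file_line and pending is not None:
            path, start, end = file_line.groups()
            results.append((pending[0], pending[1], path, int(start), int(end)))
            pending = None
    return results


def format_grep_style(results):
    """Render each hit as 'path:start: score=... lines start-end'."""
    return [
        f"{path}:{start}: score={score:.4f} lines {start}-{end}"
        for _rank, score, path, start, end in results
    ]


for row in format_grep_style(parse_results(SAMPLE)):
    print(row)
```

To try it on real output, replace `SAMPLE` with `sys.stdin.read()` (after `import sys`) and pipe `grepai search '...'` into the script. Parsing the human-oriented text output is fragile, so this is only a stopgap for quick terminal use.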
grepai mcp-serve
The MCP server is another story. The LLM was really able to find the results it expected, and both search speed and token usage improved.
It doesn't fit well when the LLM is exploring the repo for the first time, but it is nice for finding specific known symbols, logic or implementations.
I need more tests to draw deeper conclusions.
Overall
I'll test it over a longer period and see whether running an Ollama server in my qube for it is worth it.
Sorry, no conclusions yet, but I encourage you to try it yourself and draw your own.
Footnotes:
[1] Running it in a Qubes OS VM (qube).