Open source retrieval engine

Get a grip on your codebase.

A retrieval engine that learns your data's vocabulary, remembers what works, and tells the AI when it doesn't have a good answer.

$ pip install getgrip
0 embedding models needed
0 vector databases needed
0 API keys needed
97.5% accuracy (3,000 queries)

Three commands. Real results.

Install GRIP, ingest FastAPI from GitHub, search it. Under 5ms. No config. No model downloads.

terminal — grip demo
# Start GRIP
$ docker run -d -p 7878:8000 -v grip-data:/data griphub/grip:free
# Ingest FastAPI from GitHub — 2,424 files
$ curl -X POST localhost:7878/ingest -H "Content-Type: application/json" \
  -d '{"source": "https://github.com/tiangolo/fastapi"}'

{"status": "ok", "chunks": 4195, "files": 2424, "time_sec": 16.2}
# Search — watch co-occurrence expand your query automatically
$ curl "localhost:7878/search?q=OAuth+bearer+token&top_k=3"

query:      "OAuth bearer token"
expanded:   "OAuth bearer token authent secur valid header"
confidence: HIGH
latency:    2.1ms

───────────────────────────────────────────────────
① [32.4] fastapi/security/oauth2.py:370
  class OAuth2PasswordBearer(OAuth2):
    def __init__(self, tokenUrl: str, scheme_name: str ...

② [28.1] docs/en/docs/tutorial/security/oauth2-jwt.md:1
  OAuth2 with Password (and hashing), Bearer with JWT tokens ...

③ [24.7] fastapi/security/http.py:42
  class HTTPBearer(HTTPBase): ...
# GRIP learned "OAuth" → "authent secur valid header" from YOUR data.
# No embeddings. No vectors. Just learned vocabulary.
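
Prefer Python to curl? Here is the same pair of calls as a minimal sketch, using the endpoints and response fields shown in the demo above:

import requests

GRIP = "http://localhost:7878"

# Ingest a repo (same payload as the curl call above)
r = requests.post(f"{GRIP}/ingest",
                  json={"source": "https://github.com/tiangolo/fastapi"})
print(r.json())  # e.g. {"status": "ok", "chunks": 4195, ...}

# Search; GRIP expands the query from vocabulary learned during ingest
hits = requests.get(f"{GRIP}/search",
                    params={"q": "OAuth bearer token", "top_k": 3}).json()
for c in hits["results"]:
  print(c["text"][:80])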

Retrieval that actually learns.

Most retrieval systems are stateless. Query in, results out, everything forgotten. GRIP remembers.

🔗 Co-occurrence Expansion

Search "auth" and GRIP finds "authentication", "OAuth", "middleware" — because it learned those terms co-occur in your data. No external model. No API call.

🧠 Auto-Remember

Queries that return good results are reinforced. The system boosts what works. Persistent across restarts. Gets better every time you use it.
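
Again as a conceptual sketch, not GRIP's actual mechanism: reinforcement can be as simple as persisted per-term weights that grow when a query pays off.

import json, pathlib

WEIGHTS = pathlib.Path("weights.json")  # persists across restarts

def reinforce(query_terms, lr=0.1):
  # Boost terms from queries that returned good results
  w = json.loads(WEIGHTS.read_text()) if WEIGHTS.exists() else {}
  for t in query_terms:
    w[t] = w.get(t, 1.0) + lr
  WEIGHTS.write_text(json.dumps(w))

reinforce(["oauth", "bearer", "token"])  # these terms now rank higher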

💬 Session Context

Say "tell me more" and GRIP knows what you were just searching for. Retrieval with conversational memory. Not stateless per-query.

🎯 Confidence Scoring

Results scored HIGH / MEDIUM / LOW / NONE. The LLM knows when to say "I don't know" instead of hallucinating over bad results.
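
On the client side that becomes a simple gate before the LLM ever sees the context. A minimal sketch; field names mirror the demo output and the LangChain snippet below:

import requests

resp = requests.get("http://localhost:7878/search",
                    params={"q": "OAuth bearer token", "top_k": 3}).json()

if resp.get("confidence") in ("HIGH", "MEDIUM"):
  context = "\n\n".join(c["text"] for c in resp["results"])
  # ...build the prompt with `context` and call your LLM...
else:
  print("I don't have a good answer for that.")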

🔌 Plugin Architecture

Ingest from GitHub, local files, any source. Chunk code or prose. Query via CLI, API, or web UI. Ollama, OpenAI, and Anthropic LLM plugins built in.
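
Local ingestion looks like the GitHub example. A sketch, assuming the "source" field also accepts a filesystem path (check the plugin docs for your version):

import requests

# Assumption: /ingest's "source" accepts a local path as well as a URL
requests.post("http://localhost:7878/ingest",
              json={"source": "/home/me/projects/myapp"})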

✈️ Fully Offline

No cloud dependency. No network requirement. No telemetry. Works air-gapped. Your data stays on your machine. Always.

Drop into your existing stack.

GRIP is a JSON API on localhost. Swap it into any RAG pipeline in 5 minutes. No SDK. No client library. Just HTTP.

🦜 LangChain
🦙 LlamaIndex
🐍 Python
⬡ Node.js
🐳 Docker
🔧 Any HTTP client

Replace Pinecone, Chroma, or any vector store. Your existing prompts, chains, and agents work unchanged — GRIP just replaces the retrieval backend.

Drop your embedding API costs. Drop your vector database bill. Get vocabulary learning and confidence scoring that vector search doesn't have.

See integration examples →
# LangChain — replace Pinecone with a small custom retriever
from typing import List

import requests
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever

class GRIPRetriever(BaseRetriever):
  grip_url: str = "http://localhost:7878"

  def _get_relevant_documents(self, query: str, *, run_manager=None) -> List[Document]:
    # Fetch the top 5 chunks for the query from GRIP
    r = requests.post(
      f"{self.grip_url}/search",
      json={"q": query, "top_k": 5}
    )
    return [
      Document(page_content=c["text"])
      for c in r.json()["results"]
    ]
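
LlamaIndex is the same trick. A minimal sketch of a custom retriever; the "score" field is assumed from the numeric scores shown in the demo:

from typing import List

import requests
from llama_index.core.retrievers import BaseRetriever
from llama_index.core.schema import NodeWithScore, QueryBundle, TextNode

class GRIPLlamaRetriever(BaseRetriever):
  def __init__(self, grip_url: str = "http://localhost:7878"):
    self.grip_url = grip_url
    super().__init__()

  def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
    r = requests.get(f"{self.grip_url}/search",
                     params={"q": query_bundle.query_str, "top_k": 5})
    return [
      # "score" is an assumed field name; adjust to your API version
      NodeWithScore(node=TextNode(text=c["text"]), score=c.get("score"))
      for c in r.json()["results"]
    ]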

Tested on real data.

3,000 auto-generated queries across three domains. No hand-curation. No cherry-picking. Industry-standard BEIR evaluation on 6 datasets.

0.58 NDCG@10 across 6 BEIR datasets, 2,771 queries. Beats BM25 on all 6.
Dataset     Corpus (docs)   BM25    GRIP    Delta
FEVER           5,416,568   0.509   0.808   +0.299
HotpotQA        5,233,329   0.595   0.741   +0.146
SciFact             5,183   0.665   0.682   +0.017
NQ              2,681,468   0.276   0.542   +0.266
FiQA               57,638   0.232   0.347   +0.116
NFCorpus            3,633   0.311   0.344   +0.034

Accuracy at scale

1,000 queries per domain. Queries derived from corpus structure.

Domain                      Corpus           Queries   Correct   Accuracy
Linux Kernel (code)         188,209 chunks     1,000       987      98.7%
Wikipedia (encyclopedia)    11.2M chunks       1,000       985      98.5%
Project Gutenberg (prose)   173,817 chunks     1,000       954      95.4%
Combined                                       3,000     2,926      97.5%

What's in the box (and what's not)

                        Typical RAG stack                     GRIP
Embedding model         Required (OpenAI, Cohere, etc.)       Not needed
Vector database         Required (Pinecone, Weaviate, etc.)   Not needed
API keys                Required for embeddings + LLM         LLM only (optional)
Learns vocabulary       No                                    Yes — from your data
Remembers what works    No                                    Yes — persistent
Session context         No (stateless)                        Yes
Confidence scoring      No                                    HIGH / MED / LOW / NONE
Works offline           Rarely                                Fully air-gapped
Gets better with use    No (static index)                     Yes

One license. Unlimited users.

No per-seat fees. No per-query metering. Chunk count is the only variable. Start free — upgrade when you need more.

Free
$0
Try GRIP. No credit card. No time limit.
  • 10,000 chunks (~3,500 files)
  • All features included
  • Co-occurrence & remember
  • Sessions & confidence
  • CLI + API + Web UI
  • Learning resets on deletion
pip install getgrip
Personal
$499/year
For individual developers and small projects.
  • 100,000 chunks (~35K files)
  • Unlimited users
  • Learning persists permanently
  • CPU engine
  • CLI + API + Web UI
  • All plugins included
Buy Personal
Professional
$4,999/year
For organizations with serious retrieval needs.
  • 5,000,000 chunks (~1.8M files)
  • Unlimited users
  • Learning persists permanently
  • CPU engine
  • Priority ingestion
  • All plugins included
Buy Professional
How many chunks do I need? Figure roughly 2–3 chunks per text file: FastAPI (2,424 files) = 4,195 chunks; the Linux kernel (70K+ files) = 188K chunks. Most projects fit comfortably in the free tier. See the sizing guide for details.
1.2ms at 28 million records. Single GPU.

An accelerated engine is available for organizations with large-scale retrieval needs.

Contact Enterprise