Open Source · v0.8.0 · MIT License

Search your
knowledge.
Instantly.

A local-first knowledge search CLI built for humans and AI agents. BM25 full-text search, vector embeddings, and LLM-powered Q&A — all in a single binary. Your data never leaves your machine.

28 stars · Written in Go · Works offline
qi — ~/projects
$ qi index ~/notes
✓ Indexed 312 documents in 0.8s
SHA-256 dedup: 18 duplicates removed
$ qi ask "how does chunking work?" -c notes
mode: hybrid · retrieving 8 chunks…
Chunking splits documents at semantic boundaries (headings, code fences, and paragraph breaks), using breakpoint scoring to keep chunks meaningful.
Sources: [notes/architecture.md:12] [notes/rag.md:7]
$ qi query "BM25 vector fusion" --mode hybrid --explain
RRF score: 0.94 · BM25: 0.81 · vec: 0.88
architecture.md abc123
rag-pipeline.md d4e5f6

Three layers.
One binary.

qi combines BM25 full-text search with optional vector embeddings — fused through Reciprocal Rank Fusion for results that are both precise and semantically aware. A single SQLite file holds everything. No servers, no sync, no subscriptions.

[Architecture diagram] INPUT: Documents (.md, .txt, .pdf); qi CLI (human interface); AI Agent (Claude Code plugin). PROCESSING: Chunker (breakpoint scoring); BM25 (SQLite FTS5); Embedder (Ollama / OpenAI); RRF Fusion (hybrid ranking); LLM Q&A (RAG + citations); SQLite. OUTPUT: Search Results; Hybrid Results; Answer.
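
The chunker stage can be pictured with a rough sketch like the one below. It is not qi's actual implementation: the weights, the function names `breakScores` and `chunk`, and the line-based size limit are all assumptions made for illustration. The idea is only that headings, code fences, and paragraph breaks each earn a score, and each chunk is cut at the best-scoring boundary within its window.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical weights: a higher score marks a better place to start a new chunk.
const (
	scoreHeading = 100
	scoreFence   = 80
	scoreBlank   = 20
)

// fenceMarker is the three-backtick Markdown code fence.
var fenceMarker = strings.Repeat("`", 3)

// breakScores returns, for each line, how attractive it is to cut just before it.
// Lines inside a code fence keep a score of 0 so fenced code is not split
// unless a hard cut is forced.
func breakScores(lines []string) []int {
	scores := make([]int, len(lines))
	inFence := false
	for i, line := range lines {
		trimmed := strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(trimmed, fenceMarker):
			if !inFence {
				scores[i] = scoreFence // cutting before an opening fence is fine
			}
			inFence = !inFence
		case inFence:
			// inside a fence: leave the score at 0
		case strings.HasPrefix(trimmed, "#"):
			scores[i] = scoreHeading
		case i > 0 && strings.TrimSpace(lines[i-1]) == "":
			scores[i] = scoreBlank // first line after a paragraph break
		}
	}
	return scores
}

// chunk greedily packs lines into chunks of at most maxLines lines, cutting
// each chunk at the highest-scoring breakpoint inside the current window.
func chunk(text string, maxLines int) []string {
	if maxLines < 1 {
		maxLines = 1
	}
	lines := strings.Split(text, "\n")
	scores := breakScores(lines)
	var chunks []string
	start := 0
	for start < len(lines) {
		end := start + maxLines
		if end >= len(lines) {
			chunks = append(chunks, strings.Join(lines[start:], "\n"))
			break
		}
		// Pick the best cut in (start, end]; fall back to a hard cut at end.
		cut, best := end, 0
		for i := start + 1; i <= end; i++ {
			if scores[i] > 0 && scores[i] >= best {
				best, cut = scores[i], i
			}
		}
		chunks = append(chunks, strings.Join(lines[start:cut], "\n"))
		start = cut
	}
	return chunks
}

func main() {
	text := "# Intro\nqi indexes notes.\n\n# Chunking\nSplit at headings.\nKeep fences whole."
	for i, c := range chunk(text, 4) {
		fmt.Printf("--- chunk %d ---\n%s\n", i, c)
	}
}
```

A real chunker would more likely budget by tokens or bytes than by lines; lines are used here only to keep the sketch short.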

Built different.

BM25

Blazing-fast full-text

BM25 via SQLite FTS5. No external search engine, no daemon, no configuration. Works out of the box the moment you run qi index.
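
For readers unfamiliar with BM25, here is a small self-contained sketch of the textbook scoring formula. It is not FTS5's code and not qi's: the function names `bm25` and `count`, the in-memory corpus, and the non-negative IDF variant are assumptions chosen for illustration. The constants 1.2 and 0.75 are the conventional defaults for term saturation and length normalization.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

const (
	bm25K1 = 1.2  // term-frequency saturation
	bm25B  = 0.75 // document-length normalization
)

// count returns how many times term occurs in tokens.
func count(tokens []string, term string) int {
	n := 0
	for _, t := range tokens {
		if t == term {
			n++
		}
	}
	return n
}

// bm25 scores one tokenized document against a query over a small corpus.
// Higher is better. The IDF uses the "+1 inside the log" variant so that
// scores never go negative.
func bm25(query string, doc []string, corpus [][]string) float64 {
	if len(corpus) == 0 {
		return 0
	}
	totalLen := 0
	for _, d := range corpus {
		totalLen += len(d)
	}
	avgLen := float64(totalLen) / float64(len(corpus))
	n := float64(len(corpus))

	score := 0.0
	for _, term := range strings.Fields(strings.ToLower(query)) {
		tf := float64(count(doc, term))
		if tf == 0 {
			continue
		}
		df := 0.0 // number of documents containing the term
		for _, d := range corpus {
			if count(d, term) > 0 {
				df++
			}
		}
		idf := math.Log((n-df+0.5)/(df+0.5) + 1)
		norm := bm25K1 * (1 - bm25B + bm25B*float64(len(doc))/avgLen)
		score += idf * tf * (bm25K1 + 1) / (tf + norm)
	}
	return score
}

func main() {
	corpus := [][]string{
		{"bm25", "ranks", "text"},
		{"vectors", "rank", "meaning"},
		{"bm25", "bm25", "fusion"},
	}
	for i, doc := range corpus {
		fmt.Printf("doc %d: %.3f\n", i, bm25("bm25", doc, corpus))
	}
}
```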

vec

Flexible vector search

Embeddings stored and queried locally. Works with Ollama, LM Studio, llama.cpp, MLX, or OpenAI — swap providers in a single config line.
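
As a rough picture of what querying embeddings locally involves, the sketch below ranks stored vectors by cosine similarity to a query vector. It assumes a brute-force scan over an in-memory map; how qi actually stores and searches embeddings is not described here, and the names `cosine`, `nearest`, and `hit` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// hit pairs a document ID with its similarity to the query.
type hit struct {
	ID    string
	Score float64
}

// cosine returns the cosine similarity of two equal-length vectors,
// or 0 if the lengths differ or either vector is all zeros.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// nearest returns the k stored vectors most similar to the query, best first.
func nearest(query []float64, index map[string][]float64, k int) []hit {
	hits := make([]hit, 0, len(index))
	for id, vec := range index {
		hits = append(hits, hit{ID: id, Score: cosine(query, vec)})
	}
	// Highest similarity first; break ties by ID for deterministic output.
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].Score != hits[j].Score {
			return hits[i].Score > hits[j].Score
		}
		return hits[i].ID < hits[j].ID
	})
	if k < 0 {
		k = 0
	}
	if k < len(hits) {
		hits = hits[:k]
	}
	return hits
}

func main() {
	// Toy 2-dimensional "embeddings"; real ones have hundreds of dimensions.
	index := map[string][]float64{
		"a": {1, 0},
		"b": {0, 1},
		"c": {1, 1},
	}
	for _, h := range nearest([]float64{1, 0.1}, index, 2) {
		fmt.Printf("%s %.3f\n", h.ID, h.Score)
	}
}
```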

RRF

Hybrid fusion

Reciprocal Rank Fusion combines BM25 and vector rankings for results that are both lexically precise and semantically rich.
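
Reciprocal Rank Fusion itself is simple enough to sketch in a few lines. In the version below, each document earns 1/(k + rank) from every ranked list it appears in, and the sums decide the final order. The constant k = 60 is the commonly cited default, not a value confirmed for qi; the function name `fuseRRF` is hypothetical, and the IDs beyond the two shown in the demo above are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// rrfK is the usual smoothing constant for RRF. Whether qi uses 60 is an assumption.
const rrfK = 60.0

// fuseRRF merges ranked lists of document IDs (best first) into one ranking.
// A document's fused score is the sum of 1/(rrfK + rank) over the lists that
// contain it, with rank starting at 1.
func fuseRRF(rankings ...[]string) []string {
	scores := make(map[string]float64)
	for _, ranking := range rankings {
		for i, id := range ranking {
			scores[id] += 1.0 / (rrfK + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	// Highest fused score first; break ties by ID for deterministic output.
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	bm25 := []string{"abc123", "d4e5f6", "0a1b2c"} // lexical ranking
	vec := []string{"d4e5f6", "9f8e7d", "abc123"}  // vector ranking
	fmt.Println(fuseRRF(bm25, vec))
}
```

Because only ranks are used, the two retrievers never need their raw scores to be on a comparable scale, which is the main reason RRF is a popular choice for hybrid search.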

ask

LLM-powered Q&A

Ask questions in plain English. Get grounded answers with citations pointing back to your actual documents, so sources are never hallucinated.

SHA

Zero-dependency storage

A single SQLite file holds your entire index. Content-addressable blobs via SHA-256 eliminate duplicates automatically — no extra tooling.
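
Content-addressable deduplication can be illustrated with a minimal sketch: hash each document's bytes with SHA-256 and use the digest as the storage key, so identical content maps to the same key and is stored once. The `blobStore` type and its in-memory map are hypothetical stand-ins for whatever table qi keeps in SQLite.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// blobStore is a hypothetical in-memory stand-in for a content-addressed table.
type blobStore struct {
	blobs map[string][]byte // hex SHA-256 digest -> content
}

func newBlobStore() *blobStore {
	return &blobStore{blobs: make(map[string][]byte)}
}

// put stores content under its SHA-256 digest and reports whether it was new.
// Identical content always hashes to the same key, so duplicates are skipped.
func (s *blobStore) put(content []byte) (key string, added bool) {
	sum := sha256.Sum256(content)
	key = hex.EncodeToString(sum[:])
	if _, exists := s.blobs[key]; exists {
		return key, false
	}
	// Copy the bytes so the caller can safely reuse its buffer.
	s.blobs[key] = append([]byte(nil), content...)
	return key, true
}

func main() {
	store := newBlobStore()
	docs := []string{"# Notes\nhello", "# Notes\nhello", "# Other"}
	for _, doc := range docs {
		key, added := store.put([]byte(doc))
		fmt.Printf("%s added=%v\n", key[:8], added)
	}
	fmt.Println("unique blobs:", len(store.blobs))
}
```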

✈ offline

Works offline, always

Vector search and Q&A are optional enhancements. BM25 search works entirely offline — no API keys, no internet, no surprises.

Every command.

qi init             Create config and database
qi index [path]     Index a directory or named collection
qi search <q>       BM25 full-text search
qi query <q>        Hybrid search (BM25 + vector)
qi ask <question>   RAG-powered answer with citations
qi get <id>         Retrieve document by 6-char hash ID
qi list             List all named collections
qi delete <col>     Delete a collection and its index
qi stats            Show index statistics
qi doctor           Health check

Search modes (--mode)

lexical

BM25 full-text only. No embeddings required. Fastest option, great for keyword-heavy corpora.

hybrid

BM25 + vector search fused with RRF. Default. Best balance of precision and semantic recall.

deep

Hybrid + optional reranking pass. Highest quality. Use --explain to see scoring breakdown.

Up in thirty seconds.

Install via Homebrew, go install, or as a Claude Code plugin. Zero dependencies — qi ships as a single static binary.

Homebrew — macOS & Linux
$ brew tap itsmostafa/qi && brew install qi
go install
$ go install github.com/itsmostafa/qi@latest
Quickstart
$ qi init
$ qi index ~/notes
$ qi ask "how does X work?" -c notes

Claude Code Plugin · MCP

Use qi directly from Claude Code — index your codebase and query it as an agent skill.

# Add the marketplace
/plugin marketplace add itsmostafa/qi

# Install the plugin
/plugin install qi

Your knowledge.
Your machine.

qi is MIT licensed, written in Go, and built for developers who believe their data belongs to them — not a vendor's cloud.