RAG & Hybrid Search

AllCodex uses a hybrid search engine to query your lore. Keyword matching alone misses paraphrased ideas, and vector search alone can miss specific proper nouns, so AllCodex fuses both methods.

The Search Pipeline

When you search for something in the Portal or when the AI retrieves lore context, AllKnower executes a four-stage search pipeline:

  Search Query (e.g., "dwarf king weapon")
             │
      ┌──────┴──────────────────────┐
      ▼                             ▼
  Vector Search (LanceDB)       FTS Search (SQLite BM25)
  Computes semantic matches    Finds literal text matches
      │                             │
      └──────┬──────────────────────┘
             ▼
  Reciprocal Rank Fusion (RRF)
  Merges & scores both result pools
             │
             ▼
  OpenRouter Reranker (Optional)
  Coarse-to-fine LLM reranking (Cohere Rerank 4)
             │
             ▼
      Final Search Results

1. Vector Search (LanceDB)

AllKnower maintains an embedded, in-process LanceDB database.

Embeddings: When you create or update a note, AllKnower splits the note content into chunks and generates a 4096-dimensional vector for each chunk using qwen/qwen3-embedding-8b via OpenRouter.
Semantic Retrieval: LanceDB performs vector similarity searches, letting you find relevant notes even if they use completely different words. For example, searching for “rulers of the elven wood” will retrieve notes mentioning “leaders of the sylvan canopy” without needing exact keyword matches.
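As a rough illustration of what semantic retrieval does under the hood, the sketch below ranks notes by cosine similarity between a query vector and stored note vectors. It does not use LanceDB or real embeddings: the 3-dimensional vectors, the note IDs, and the function names are all hypothetical, and cosine similarity is an assumption, since the distance metric AllKnower configures is not stated here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)

def vector_search(query_vec, index, limit=3):
    """Return (note_id, similarity) pairs, most similar first."""
    scored = [(note_id, cosine_similarity(query_vec, vec))
              for note_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]

# Toy stand-ins for real 4096-dimensional embeddings (hypothetical values).
toy_index = {
    "elven-council": [0.9, 0.1, 0.0],
    "dwarf-forge": [0.1, 0.9, 0.1],
    "sea-trade": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # pretend embedding of "rulers of the elven wood"

hits = vector_search(query_vec, toy_index, limit=2)
for note_id, score in hits:
    print(f"{note_id}: {score:.3f}")
```

A production vector database avoids this brute-force scan by using an approximate nearest-neighbor index, but the ranking idea is the same: closeness in embedding space, not shared words.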

2. Full-Text Keyword Search (SQLite BM25)

To ensure proper nouns (like character names, unique spells, or specific item titles) are never lost due to vector approximations, AllKnower queries AllCodex Core’s SQLite Full-Text Search (FTS5) engine. This uses the BM25 algorithm to score notes based on literal term frequency and inverse document frequency.
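The following is a minimal sketch of a BM25-ranked FTS5 query using Python's built-in sqlite3 module. The table name, columns, and sample notes are hypothetical and do not reflect AllCodex Core's actual schema. It also assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3

def build_index(notes):
    """Create an in-memory FTS5 table and load (title, body) rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE notes_fts USING fts5(title, body)")
    conn.executemany(
        "INSERT INTO notes_fts (title, body) VALUES (?, ?)", notes
    )
    return conn

def fts_search(conn, query, limit=10):
    """Return (title, bm25_score) rows, best match first.

    FTS5's bm25() returns lower (more negative) values for better matches,
    so an ascending sort puts the best hit at the top.
    """
    rows = conn.execute(
        "SELECT title, bm25(notes_fts) FROM notes_fts "
        "WHERE notes_fts MATCH ? ORDER BY bm25(notes_fts) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()

# Hypothetical sample lore.
sample_notes = [
    ("Thrain Ironhand", "Thrain Ironhand is the dwarf king who wields the axe Stonecleaver."),
    ("Elven Wood", "The rulers of the elven wood meet under the sylvan canopy."),
    ("Stonecleaver", "Stonecleaver is a rune-etched axe forged in the deep halls."),
]

conn = build_index(sample_notes)
results = fts_search(conn, "Stonecleaver")
for title, score in results:
    print(title, round(score, 3))
```

Only the two notes that literally contain "Stonecleaver" come back, which is exactly the behavior that protects proper nouns. Note that raw user input can contain FTS5 query syntax characters, so a real caller would need to quote or sanitize the query string first.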

3. Reciprocal Rank Fusion (RRF)

The candidate sets from both LanceDB and SQLite are merged using Reciprocal Rank Fusion (RRF). RRF scores candidates based on their rank in each separate search list rather than their raw scores:

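RRF\_Score(d) = \sum_{m \in M} \frac{1}{k + r_m(d)}

Where r_m(d) is the rank of document d in system m, M is the set of search systems (here, the vector and full-text searches), and k is a constant (typically 60) that keeps a single top rank in one list from dominating the fused score. A document missing from a system's list contributes nothing for that system. Because RRF rewards documents that rank well in both lists, the fused ranking tends to be more robust than either search method alone.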
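The formula is easy to see in a small worked example. The sketch below fuses two ranked lists of note IDs with k = 60 and 1-based ranks. The function name and the note IDs are hypothetical, and this is an illustration of RRF in general rather than AllKnower's actual code.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse ranked lists of IDs into one list of (id, score), best first.

    Each list contributes 1 / (k + rank) for every ID it contains, where
    rank is 1-based. IDs absent from a list get no contribution from it.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

# Hypothetical result pools for one query.
vector_hits = ["a", "b", "c"]  # semantic matches, best first
fts_hits = ["c", "a", "d"]     # literal matches, best first

fused = reciprocal_rank_fusion([vector_hits, fts_hits])
for doc_id, score in fused:
    print(doc_id, round(score, 6))
```

Here "a" scores 1/61 + 1/62 and "c" scores 1/63 + 1/61, so both outrank "b" and "d", which each appear in only one list. Raw similarity and BM25 scores never enter the calculation, which is why the two systems need no score normalization before fusion.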

4. LLM Reranking (Optional)

For RAG context generation (feeding the AI grimoire history during Consistency scans), the fused RRF candidate list is sent to OpenRouter’s native /rerank endpoint using cohere/rerank-4-pro. The reranker scores how closely each of the top 20 notes answers the specific AI prompt and returns the 5 most relevant documents, which are then injected into the LLM context window. This lowers token costs and reduces the risk of “lost in the middle” retrieval failures.
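The coarse-to-fine shape of this stage (shortlist 20, keep 5) can be sketched without calling any external service. In the example below, score_fn is a hypothetical stand-in for the reranker call, and word_overlap is a toy scorer used only so the sketch runs offline. Neither reflects the real request or response format of the /rerank endpoint.

```python
from typing import Callable, List, Sequence

def rerank_for_context(
    query: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
    candidate_limit: int = 20,
    top_n: int = 5,
) -> List[str]:
    """Score the first candidate_limit fused results and keep the best top_n."""
    shortlist = list(candidates[:candidate_limit])
    scored = [(doc, score_fn(query, doc)) for doc in shortlist]
    # sort() is stable, so ties keep their original fused (RRF) order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in scored[:top_n]]

def word_overlap(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words present in the document."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0

# Hypothetical fused candidate list, already ordered by RRF score.
candidates = [f"note {i} about trade routes" for i in range(25)]
candidates[3] = "the dwarf king carries a weapon"
candidates[22] = "dwarf king weapon"  # relevant, but outside the top 20

context_docs = rerank_for_context("dwarf king weapon", candidates, word_overlap)
print(context_docs)
```

The note at position 22 never reaches the scorer, which shows the trade-off of the shortlist: reranking cost stays bounded, but a document that fusion ranked too low cannot be rescued at this stage.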
