Hybrid Retrieval with Graph Filters: FalkorDB + Qdrant
A production blueprint for combining graph traversal and vector search to improve recall and precision in document retrieval.
When I first started building retrieval pipelines, pure vector search seemed like it could handle everything. That illusion shattered the day a user asked, “Which vendor has sent us the most invoices this quarter?” Vector search dutifully returned chunks about individual invoices: semantically relevant, technically correct, and utterly useless for answering the actual question. Answering it required traversing entity relationships: linking invoice documents to vendor entities, then counting across the graph. That was my aha moment. Semantic similarity and structural connectivity are fundamentally different retrieval axes, and the best systems need both.
Vector search finds semantically similar documents. Graph traversal finds structurally connected ones. Combining both in a hybrid retrieval pipeline gives you the precision of graph relationships with the flexibility of dense embeddings.
The Limitation of Vector-Only Search
Vector search excels at “find documents about X.” But it struggles with relational queries:
- “Show me all lab results for the same patient as this report”
- “What documents reference the same organization?”
- “Find related documents within two hops of this entity”
These require structural knowledge — which entities appear in which documents, and how they relate. That’s where a knowledge graph comes in.
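To make “structural knowledge” concrete, here is a toy in-memory sketch of the document-entity lookup behind the first two queries above. The `CoOccurrenceIndex` class and its method names are hypothetical and stand in for a real graph store; the sketch only illustrates the hop from a document to its entities and back to other documents.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-in for the document-entity graph (illustrative only).
public sealed class CoOccurrenceIndex
{
    private readonly Dictionary<string, HashSet<string>> _entitiesByDoc = new();
    private readonly Dictionary<string, HashSet<string>> _docsByEntity = new();

    // Records that the given document mentions the given entity.
    public void Add(string docId, string entity)
    {
        if (!_entitiesByDoc.TryGetValue(docId, out var entities))
            _entitiesByDoc[docId] = entities = new HashSet<string>();
        entities.Add(entity);

        if (!_docsByEntity.TryGetValue(entity, out var docs))
            _docsByEntity[entity] = docs = new HashSet<string>();
        docs.Add(docId);
    }

    // Documents that share at least one entity with docId, sorted by ID.
    public IReadOnlyList<string> RelatedDocuments(string docId)
    {
        if (!_entitiesByDoc.TryGetValue(docId, out var entities))
            return Array.Empty<string>();

        return entities
            .SelectMany(e => _docsByEntity[e])
            .Where(d => d != docId)
            .Distinct()
            .OrderBy(d => d, StringComparer.Ordinal)
            .ToList();
    }
}
```

No embedding comparison is involved here: two documents are related purely because they mention the same entity, which is exactly the signal vector similarity does not guarantee.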
Architecture
```text
Query
  |--> Vector Search (Qdrant) ......... Top-K by similarity
  |
  |--> Entity Extraction (NER)
  |        |
  |        v
  |    Graph Traversal (FalkorDB) ..... Connected documents
  |
  v
RRF Fusion --> Final ranked results
```
The architecture follows a pattern well-documented in graph-based reasoning literature, where graph structure augments the retrieval process rather than replacing it.
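The fusion step in the diagram is reciprocal rank fusion (RRF). As a rough sketch of how a weighted variant could work, assuming each branch returns document IDs in ranked order with no duplicates: every document earns `weight / (k + rank)` from each list it appears in, and the summed scores decide the final order. The `RrfSketch` class, its signature, and the default `k = 60` (a commonly used constant) are assumptions for illustration, not the `RrfEnsembleSearch` implementation used later in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RrfSketch
{
    // Weighted reciprocal rank fusion:
    // score(d) = sum over lists of weight / (k + rank), with rank starting at 1.
    public static List<(string Id, double Score)> Fuse(
        IEnumerable<(IReadOnlyList<string> Ids, double Weight)> lists, int k = 60)
    {
        var scores = new Dictionary<string, double>();

        foreach (var (ids, weight) in lists)
        {
            for (var i = 0; i < ids.Count; i++)
            {
                // 'current' is 0.0 when the document has not been seen yet.
                scores.TryGetValue(ids[i], out var current);
                scores[ids[i]] = current + weight / (k + i + 1);
            }
        }

        return scores
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key, StringComparer.Ordinal) // deterministic tie-break
            .Select(p => (Id: p.Key, Score: p.Value))
            .ToList();
    }
}
```

With a vector list `[a, b, c]` at weight 1.0 and a graph list `[c, a]` at weight 0.7, this ranks `a` first, then `c`, then `b`: documents found by both branches rise above those found by only one. Because only ranks are used, the raw scores of the two branches never need to be on the same scale.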
Graph Schema
A minimal schema for document-entity relationships in FalkorDB:
```cypher
// Nodes
(:Document {id, title, created_at})
(:Entity {text, label, normalized_text})
(:User {id})

// Relationships
(:Document)-[:CONTAINS]->(:Entity)
(:Entity)-[:SAME_AS]->(:Entity)
(:User)-[:OWNS]->(:Document)
```
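The `normalized_text` property and the `SAME_AS` relationship both imply some canonicalization step for entity names. A minimal sketch of what a rule-based version might look like follows; the `EntityKey` class and the suffix list are assumptions for illustration and are not part of the pipeline in this post. It could be used to propose `SAME_AS` candidates, not to decide them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EntityKey
{
    // Illustrative suffix list; a real deployment would tune this per domain.
    private static readonly HashSet<string> LegalSuffixes =
        new(StringComparer.Ordinal) { "corp", "corporation", "inc", "llc", "ltd", "co" };

    // Lowercases, replaces punctuation with spaces, collapses whitespace,
    // and drops trailing legal suffixes (always keeping at least one token).
    public static string Normalize(string text)
    {
        var cleaned = new string(text
            .ToLowerInvariant()
            .Select(c => char.IsLetterOrDigit(c) ? c : ' ')
            .ToArray());

        var tokens = cleaned
            .Split(' ', StringSplitOptions.RemoveEmptyEntries)
            .ToList();

        while (tokens.Count > 1 && LegalSuffixes.Contains(tokens[tokens.Count - 1]))
            tokens.RemoveAt(tokens.Count - 1);

        return string.Join(" ", tokens);
    }
}
```

Under these rules “Acme” and “ACME Corp.” share the key `acme`. Unrelated entities that happen to share a name would collide too, so matching keys are best treated as candidates for review or further disambiguation.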
Step 1: Extract Entities and Build the Graph
When a document is processed, NER extracts entities and creates graph relationships:
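```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public class GraphSyncService
{
    private readonly IGraphClient _graph;

    public GraphSyncService(IGraphClient graph) => _graph = graph;

    // Note: (:User)-[:OWNS]->(:Document) edges are assumed to be written elsewhere
    // (for example at upload time); the Step 2 query depends on them.
    public async Task SyncDocumentAsync(
        string docId, IReadOnlyList<ExtractedEntity> entities,
        CancellationToken ct)
    {
        // Upsert document node
        await _graph.ExecuteAsync(
            "MERGE (d:Document {id: $id})",
            new { id = docId }, ct);

        foreach (var entity in entities)
        {
            // Case-insensitive, whitespace-trimmed key used to deduplicate entities
            var normalized = entity.Text.ToLowerInvariant().Trim();

            // Upsert entity and link to document
            await _graph.ExecuteAsync(@"
                MERGE (e:Entity {normalized_text: $normalized})
                ON CREATE SET e.text = $text, e.label = $label
                WITH e
                MATCH (d:Document {id: $docId})
                MERGE (d)-[:CONTAINS]->(e)",
                new { normalized, text = entity.Text,
                      label = entity.Label, docId }, ct);
        }
    }
}
```

Step 2: Hybrid Query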
For a search query, run vector search and graph traversal in parallel, then fuse results:
```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class HybridSearchService
{
    private readonly IVectorStore _vectors;
    private readonly IGraphClient _graph;
    private readonly INerService _ner; // used by GraphSearchAsync; interface name assumed
    private readonly RrfEnsembleSearch _rrf;

    public HybridSearchService(
        IVectorStore vectors, IGraphClient graph,
        INerService ner, RrfEnsembleSearch rrf)
    {
        _vectors = vectors;
        _graph = graph;
        _ner = ner;
        _rrf = rrf;
    }

    public async Task<List<ScoredDocument>> SearchAsync(
        string query, string userId, CancellationToken ct)
    {
        // Vector search
        var vectorTask = _vectors.SearchAsync(query, topK: 20, ct);

        // Graph: find entities in query, traverse to documents
        var graphTask = GraphSearchAsync(query, userId, ct);

        // Both branches run concurrently; wait for both before fusing
        await Task.WhenAll(vectorTask, graphTask);

        return _rrf.Fuse(new[]
        {
            new RankedList(await vectorTask, Weight: 1.0),
            new RankedList(await graphTask, Weight: 0.7),
        });
    }

    private async Task<List<RetrievedDocument>> GraphSearchAsync(
        string query, string userId, CancellationToken ct)
    {
        // Extract entities from query text
        var queryEntities = await _ner.ExtractAsync(query, ct);
        if (queryEntities.Count == 0) return new();

        // Same normalization as at indexing time, so keys match
        var normalized = queryEntities
            .Select(e => e.Text.ToLowerInvariant().Trim())
            .ToList();

        // Find the user's documents that directly contain these entities
        // (one hop; SAME_AS links are not followed here)
        var result = await _graph.ExecuteAsync<GraphDoc>(@"
            MATCH (e:Entity)
            WHERE e.normalized_text IN $entities
            MATCH (d:Document)-[:CONTAINS]->(e)
            MATCH (u:User {id: $userId})-[:OWNS]->(d)
            RETURN d.id AS id, d.title AS title,
                   count(DISTINCT e) AS entity_overlap
            ORDER BY entity_overlap DESC
            LIMIT 20",
            new { entities = normalized, userId }, ct);

        return result.Select(r => new RetrievedDocument(
            r.Id, r.Title, Score: r.EntityOverlap
        )).ToList();
    }
}
```

When to Use Hybrid vs Vector-Only
| Scenario | Best approach |
|---|---|
| Free-form semantic search | Vector-only |
| Entity-specific queries | Graph + vector |
| “Related documents” features | Graph traversal |
| Cold-start (no graph data yet) | Vector with graceful fallback |
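
The cold-start row implies that the graph branch should never block or break a search. One way to get that behavior is sketched here, under the assumption that the graph search is exposed as a cancellable delegate returning document IDs. The `GraphFallback` helper and its signature are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class GraphFallback
{
    // Runs the graph branch and returns an empty list if it fails or exceeds
    // the timeout, so fusion can proceed with vector results only.
    public static async Task<IReadOnlyList<string>> TryGraphAsync(
        Func<CancellationToken, Task<IReadOnlyList<string>>> graphSearch,
        TimeSpan timeout,
        CancellationToken ct)
    {
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
        cts.CancelAfter(timeout);

        try
        {
            return await graphSearch(cts.Token);
        }
        catch (OperationCanceledException) when (!ct.IsCancellationRequested)
        {
            // The graph branch timed out; the caller did not cancel.
            return Array.Empty<string>();
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // Graph unavailable or query failed.
            return Array.Empty<string>();
        }
    }
}
```

An empty list contributes nothing to rank fusion, so the caller ends up with vector-only results, which is also what an empty graph produces naturally. The timeout only takes effect if the delegate honors the token it is given, and a cancellation requested by the caller still propagates as usual.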
Conclusion
Building this hybrid pipeline reinforced a core belief I keep coming back to: the best retrieval systems don’t pick one approach; they layer complementary strengths. Vector search gives you the semantic flexibility to handle vague, exploratory queries. Graph traversal gives you the structural precision to answer relational questions that embedding similarity alone does not capture. Fused through RRF, the combination covers entity-specific and relational queries that neither approach handles well in isolation.
The biggest lesson was that building the graph itself is the real investment. The retrieval code is straightforward; the months of work went into entity extraction quality, disambiguation (is “Acme” the same entity as “ACME Corp”?), and keeping the graph in sync with document updates. If you’re starting a GraphRAG project, spend your first sprint on entity extraction quality — everything downstream depends on it.
Next Steps
- Agentic RRF Ensembling for Production Search — how we weight and fuse multiple retrieval signals including graph results
- GraphRAG Routing with Fallback Strategies — what happens when the graph database goes down