Qdrant Vector Database on Kubernetes

Deploy and configure Qdrant for production semantic search with collection management, filtering, and high availability on Kubernetes.

By Victor Robin

Introduction

When I first deployed Qdrant on Kubernetes, I spent an entire week debugging why vector search returned inconsistent results between pod restarts. Searches that worked perfectly on Monday would return zero results on Tuesday after a routine rolling update. I checked embedding dimensions, collection configurations, and client connection strings — everything looked correct. Then I finally inspected the storage and realized the initial deployment used emptyDir volumes, so all vectors were wiped every time a pod restarted. The fix was straightforward — migrating to a proper StatefulSet with PersistentVolumeClaims — but the lesson was hard-won: in Kubernetes, if you do not explicitly make storage persistent, it is not.

Qdrant is a high-performance vector database written in Rust, designed for the next generation of AI applications. Unlike traditional search engines that rely on keywords, Qdrant enables semantic search—finding documents based on meaning rather than exact text matches.

Why Vector Databases Matter:

  • Semantic Understanding: Find “vacation policy” when the user searches “time off rules”
  • AI-Native: Store embeddings produced by models served through OpenAI, Ollama, or sentence-transformers
  • Hybrid Search: Combine vector similarity with metadata filters for precise results
  • Scale: Handle millions of vectors with query latency measured in milliseconds
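
To make "finding documents based on meaning" concrete, here is a minimal sketch of the ranking step that a vector search performs. The three-dimensional vectors and document titles are made up for illustration (real embedding models emit hundreds of dimensions), and the brute-force loop stands in for Qdrant's index, so treat it as an illustration of the idea rather than of how Qdrant works internally.

```csharp
using System;
using System.Linq;

// Cosine similarity: dot(a, b) / (|a| * |b|). The result lies between -1 and 1.
static double Cosine(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same dimension.");

    double dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Hypothetical toy "embeddings"; the values are invented for this example.
var docs = new (string Title, float[] Vector)[]
{
    ("Vacation policy", new[] { 0.9f, 0.1f, 0.0f }),
    ("Expense reports", new[] { 0.1f, 0.9f, 0.1f }),
    ("Office floor plan", new[] { 0.0f, 0.2f, 0.9f }),
};
var query = new[] { 0.8f, 0.2f, 0.1f }; // stands in for "time off rules"

// Score every document against the query and sort by similarity, best first.
var ranked = docs
    .Select(d => (d.Title, Score: Cosine(query, d.Vector)))
    .OrderByDescending(r => r.Score)
    .ToList();

foreach (var (title, score) in ranked)
    Console.WriteLine($"{title}: {score:F3}");
```

In this toy data the query shares no words with "Vacation policy", yet that document ranks first because its vector points in nearly the same direction as the query vector.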

Qdrant is the backbone of search in our document intelligence system. Users can ask natural-language questions and find relevant documents even when the exact words don’t match.

Why Qdrant Matters:

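  • Performance: Built in Rust, it keeps latency low even with millions of vectors. Its HNSW (Hierarchical Navigable Small World) index answers approximate nearest-neighbor queries by walking a layered proximity graph instead of comparing the query against every stored vector, and it builds on a long line of research into efficient approximate nearest neighbor search.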
  • Filtering: First-class support for “Payload Filtering”, allowing you to combine semantic search with strict conditions (e.g., “Find similar docs owned by User X”).
  • Kubernetes Ready: Cloud-native architecture that scales horizontally.
[Billion-scale similarity search with GPUs (FAISS)] — Johnson, J., Douze, M. & Jegou, H. , 2019

What We’ll Build

In this guide, we will set up the search engine for our document platform. You will learn how to:

  1. Deploy Qdrant: Configure a StatefulSet with persistent storage in Kubernetes.
  2. Manage Collections: Define optimized vector schemas (distance metric, dimensions).
  3. Integrate the .NET client: Perform hybrid search (metadata filters + vector similarity).

Deployment Architecture

flowchart TB
    subgraph Cluster["🔮 Qdrant Cluster"]
        subgraph Nodes["⚡ Qdrant Nodes"]
            Q0["👑 Qdrant-0\n(Leader)"]
            Q1["🔄 Qdrant-1\n(Replica)"]
            Q2["🔄 Qdrant-2\n(Replica)"]
            Q0 <--> Q1
            Q1 <--> Q2
            Q0 <--> Q2
        end
        
        Nodes --> Storage
        
        subgraph Storage["💾 Collections"]
            direction LR
            C1["🧪 dev-documents\n(384 dims)"]
            C2["🎭 staging-documents\n(384 dims)"]
            C3["🚀 prod-documents\n(384 dims)"]
        end
    end

    classDef primary fill:#7c3aed,color:#fff
    classDef secondary fill:#06b6d4,color:#fff
    classDef db fill:#f43f5e,color:#fff
    classDef warning fill:#fbbf24,color:#000
    class Cluster,Nodes,Storage db

Kubernetes Manifests

StatefulSet

Qdrant requires a StatefulSet rather than a Deployment because each node maintains its own state (vector data, WAL segments) that must survive pod restarts. The Kubernetes StatefulSet provides stable network identities and persistent storage per pod, which is exactly what a clustered database needs.

[Kubernetes StatefulSets Documentation] — Kubernetes , 2024
# infrastructure/data-layer/qdrant/statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: qdrant
  namespace: data-layer
spec:
  serviceName: qdrant
  replicas: 3
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      app: qdrant
  template:
    metadata:
      labels:
        app: qdrant
    spec:
      containers:
        - name: qdrant
          image: qdrant/qdrant:v1.7.4
          ports:
            - containerPort: 6333
              name: http
            - containerPort: 6334
              name: grpc
            - containerPort: 6335
              name: cluster
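          # Enabling cluster mode alone does not join the peers: the first node
          # must be started with --uri and the others with --bootstrap.
          # Sketch: assumes the image's working directory is /qdrant and that
          # a POSIX shell is available in the image.
          command: ["/bin/sh", "-c"]
          args:
            - |
              DOMAIN="qdrant.data-layer.svc.cluster.local"
              if [ "${HOSTNAME##*-}" = "0" ]; then
                exec ./qdrant --uri "http://${HOSTNAME}.${DOMAIN}:6335"
              else
                exec ./qdrant --bootstrap "http://qdrant-0.${DOMAIN}:6335" --uri "http://${HOSTNAME}.${DOMAIN}:6335"
              fi
          env:
            - name: QDRANT__CLUSTER__ENABLED
              value: "true"
            - name: QDRANT__CLUSTER__P2P__PORT
              value: "6335"
            - name: QDRANT__SERVICE__GRPC_PORT
              value: "6334"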
          volumeMounts:
            - name: data
              mountPath: /qdrant/storage
            - name: config
              mountPath: /qdrant/config/production.yaml
              subPath: production.yaml
          resources:
            requests:
              memory: "512Mi"
              cpu: "200m"
            limits:
              memory: "2Gi"
              cpu: "2000m"
          readinessProbe:
            httpGet:
              path: /healthz
              port: 6333
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /healthz
              port: 6333
            initialDelaySeconds: 30
            periodSeconds: 30
      volumes:
        - name: config
          configMap:
            name: qdrant-config
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: local-path
        resources:
          requests:
            storage: 20Gi
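
As a quick check on what "persistent storage per pod" means in practice, the sketch below lists the PersistentVolumeClaims this manifest should produce. Kubernetes names each claim after the volume claim template, the StatefulSet, and the pod ordinal; the helper name ClaimNames is hypothetical and exists only for this example.

```csharp
using System;
using System.Linq;

// Kubernetes names each claim <template>-<statefulset>-<ordinal>.
static string[] ClaimNames(string template, string statefulSet, int replicas) =>
    Enumerable.Range(0, replicas)
        .Select(i => $"{template}-{statefulSet}-{i}")
        .ToArray();

// Values taken from the manifest above: template "data", StatefulSet "qdrant", 3 replicas.
var claims = ClaimNames("data", "qdrant", 3);
foreach (var claim in claims)
    Console.WriteLine(claim);

// Each claim requests 20Gi, so the cluster as a whole requests 3 x 20Gi.
Console.WriteLine($"Total requested storage: {claims.Length * 20}Gi");
```

Because a claim is bound to a pod's ordinal rather than to a particular pod instance, a restarted qdrant-1 reattaches to data-qdrant-1 and finds its vectors where it left them.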

Configuration

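The on_disk_payload: true setting in the configuration is one of the most impactful optimizations for production deployments. It stores payloads (the metadata attached to each vector) on disk rather than in RAM, leaving memory for vectors and indexes; the cost is slower reads of payload fields.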

[Qdrant Configuration and Optimization] — Qdrant , 2024
# infrastructure/data-layer/qdrant/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: qdrant-config
  namespace: data-layer
data:
  production.yaml: |
    log_level: INFO
    
    storage:
      storage_path: /qdrant/storage
      snapshots_path: /qdrant/snapshots  # outside the mounted volume; mount storage here if snapshots must survive pod restarts
      on_disk_payload: true
      
      performance:
        max_search_threads: 0  # Auto-detect
        max_optimization_threads: 2
        
      optimizers:
        deleted_threshold: 0.2
        vacuum_min_vector_number: 1000
        default_segment_number: 4
        max_segment_size_kb: 200000
        memmap_threshold_kb: 50000
        indexing_threshold_kb: 20000
    
    service:
      max_request_size_mb: 32
      enable_tls: false
      
    cluster:
      enabled: true
      p2p:
        port: 6335
      consensus:
        tick_period_ms: 100
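
To see why memory-related settings matter, it helps to estimate how much space the vectors themselves occupy. The sketch below uses the 384 dimensions from this guide and the roughly 500K vectors mentioned in the summary. It counts only the raw float32 values, so the HNSW graph, payloads, and payload indexes come on top; the helper name RawVectorMiB is hypothetical.

```csharp
using System;

// Raw size of the vectors alone: count x dimensions x 4 bytes (float32), in MiB.
static double RawVectorMiB(long vectorCount, int dimensions) =>
    vectorCount * dimensions * sizeof(float) / (1024.0 * 1024.0);

// 500,000 vectors x 384 dimensions x 4 bytes = 768,000,000 bytes.
var mib = RawVectorMiB(500_000, 384);
Console.WriteLine($"{mib:F0} MiB of raw float32 vectors");
```

Under these assumptions the raw vectors alone take up a large share of the 2Gi memory limit set in the StatefulSet, which is why keeping payloads on disk leaves useful headroom.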

Services

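The service configuration is critical for Qdrant clustering. The first service sets clusterIP: None, which makes it headless: instead of load-balancing behind one virtual IP, it gives every pod a stable DNS name such as qdrant-0.qdrant.data-layer.svc.cluster.local, which peers use to find each other. The second service, qdrant-api, is a regular ClusterIP service that load-balances application traffic across the pods.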

[Qdrant Distributed Deployment (Clustering Guide)] — Qdrant , 2024
# infrastructure/data-layer/qdrant/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: qdrant
  namespace: data-layer
spec:
  selector:
    app: qdrant
  ports:
    - name: http
      port: 6333
      targetPort: 6333
    - name: grpc
      port: 6334
      targetPort: 6334
  clusterIP: None
---
apiVersion: v1
kind: Service
metadata:
  name: qdrant-api
  namespace: data-layer
spec:
  selector:
    app: qdrant
  ports:
    - name: http
      port: 6333
      targetPort: 6333
    - name: grpc
      port: 6334
      targetPort: 6334
  type: ClusterIP
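
The peer addresses that the headless service makes resolvable follow a fixed pattern, sketched below. The code assumes the default cluster domain cluster.local and uses the names from the manifests above; PeerUris is a hypothetical helper, not part of any Qdrant or Kubernetes client library.

```csharp
using System;
using System.Linq;

// A headless Service gives each StatefulSet pod the DNS name
// <pod>.<service>.<namespace>.svc.cluster.local
static string[] PeerUris(string statefulSet, string service, string ns, int replicas, int p2pPort) =>
    Enumerable.Range(0, replicas)
        .Select(i => $"http://{statefulSet}-{i}.{service}.{ns}.svc.cluster.local:{p2pPort}")
        .ToArray();

// StatefulSet "qdrant", headless Service "qdrant", namespace "data-layer", p2p port 6335.
var peers = PeerUris("qdrant", "qdrant", "data-layer", 3, 6335);
foreach (var peer in peers)
    Console.WriteLine(peer);
```

Application code should keep using qdrant-api for queries; these per-pod names are meant for node-to-node traffic on the cluster port.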

.NET Integration

Qdrant Client Setup

// Infrastructure/DependencyInjection.cs
public static IServiceCollection AddQdrantServices(
    this IServiceCollection services,
    IConfiguration configuration)
{
    services.AddSingleton(sp =>
    {
        var host = configuration["Qdrant:Host"] ?? "qdrant-api.data-layer.svc.cluster.local";
        var port = configuration.GetValue<int>("Qdrant:Port", 6334);
        
        return new QdrantClient(host, port);
    });
    
    services.AddScoped<IVectorStore, QdrantVectorStore>();
    services.AddScoped<ICollectionManager, QdrantCollectionManager>();
    
    return services;
}

Collection Manager

// Infrastructure/VectorDb/QdrantCollectionManager.cs
public sealed class QdrantCollectionManager : ICollectionManager
{
    private readonly QdrantClient _client;
    private readonly IConfiguration _config;
    private readonly ILogger<QdrantCollectionManager> _logger;
    
    private const int VectorDimension = 384; // must equal the embedding model's output size (all-MiniLM-L6-v2 emits 384; nomic-embed-text emits 768)
    
    public QdrantCollectionManager(
        QdrantClient client,
        IConfiguration config,
        ILogger<QdrantCollectionManager> logger)
    {
        _client = client;
        _config = config;
        _logger = logger;
    }
    
    public async Task EnsureCollectionExistsAsync(CancellationToken ct = default)
    {
        var collectionName = GetCollectionName();
        
        // List collections instead of matching on an exception message,
        // which is brittle across client and server versions.
        var existing = await _client.ListCollectionsAsync(ct);
        if (existing.Contains(collectionName))
        {
            _logger.LogDebug("Collection {Collection} already exists", collectionName);
            return;
        }
        
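        // Note: the replication factor defaults to 1, so each shard lives on a
        // single node unless a higher value is requested here.
        await _client.CreateCollectionAsync(
            collectionName,
            new VectorParams
            {
                Size = VectorDimension,
                Distance = Distance.Cosine,
                OnDisk = true
            },
            cancellationToken: ct);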
        
        // Create payload indexes for filtering
        await _client.CreatePayloadIndexAsync(
            collectionName,
            "owner_id",
            PayloadSchemaType.Keyword,
            cancellationToken: ct);
        
        await _client.CreatePayloadIndexAsync(
            collectionName,
            "document_id",
            PayloadSchemaType.Keyword,
            cancellationToken: ct);
        
        await _client.CreatePayloadIndexAsync(
            collectionName,
            "tags",
            PayloadSchemaType.Keyword,
            cancellationToken: ct);
        
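        // Datetime payload indexes need Qdrant server 1.8 or later; the v1.7.4
        // image pinned in the StatefulSet predates them.
        await _client.CreatePayloadIndexAsync(
            collectionName,
            "created_at",
            PayloadSchemaType.Datetime,
            cancellationToken: ct);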
        
        _logger.LogInformation(
            "Created collection {Collection} with {Dimension} dimensions",
            collectionName,
            VectorDimension);
    }
    
    public async Task<CollectionInfo> GetCollectionInfoAsync(CancellationToken ct = default)
    {
        var collectionName = GetCollectionName();
        var info = await _client.GetCollectionInfoAsync(collectionName, ct);
        
        return new CollectionInfo
        {
            Name = collectionName,
            VectorCount = (long)info.PointsCount,
            Status = info.Status.ToString()
        };
    }
    
    private string GetCollectionName()
    {
        var env = _config["Environment"] ?? "dev";
        return $"{env}-documents";
    }
}

public sealed record CollectionInfo
{
    public required string Name { get; init; }
    public required long VectorCount { get; init; }
    public required string Status { get; init; }
}

Vector Store

// Infrastructure/VectorDb/QdrantVectorStore.cs
public sealed class QdrantVectorStore : IVectorStore
{
    private readonly QdrantClient _client;
    private readonly IConfiguration _config;
    private readonly ILogger<QdrantVectorStore> _logger;
    
    public QdrantVectorStore(
        QdrantClient client,
        IConfiguration config,
        ILogger<QdrantVectorStore> logger)
    {
        _client = client;
        _config = config;
        _logger = logger;
    }
    
    public async Task UpsertAsync(
        IReadOnlyList<VectorPoint> points,
        CancellationToken ct = default)
    {
        if (points.Count == 0) return;
        
        var collectionName = GetCollectionName();
        
        var qdrantPoints = points.Select(p => new PointStruct
        {
            Id = new PointId { Uuid = p.Id.ToString() },
            Vectors = p.Vector,
            Payload = 
            {
                ["owner_id"] = p.OwnerId,
                ["document_id"] = p.DocumentId,
                ["chunk_index"] = p.ChunkIndex,
                ["content"] = p.Content,
                ["tags"] = (p.Tags ?? []).ToArray(), // Value converts from string[], not from IReadOnlyList<string>
                ["created_at"] = p.CreatedAt.ToString("O")
            }
        }).ToList();
        
        await _client.UpsertAsync(
            collectionName,
            qdrantPoints,
            cancellationToken: ct);
        
        _logger.LogDebug(
            "Upserted {Count} points to {Collection}",
            points.Count,
            collectionName);
    }
    
    public async Task<IReadOnlyList<SearchResult>> SearchAsync(
        float[] queryVector,
        CustomId ownerId,
        int limit = 10,
        float scoreThreshold = 0.5f,
        SearchFilters? filters = null,
        CancellationToken ct = default)
    {
        var collectionName = GetCollectionName();
        
        // Build filter
        var mustConditions = new List<Condition>
        {
            new()
            {
                Field = new FieldCondition
                {
                    Key = "owner_id",
                    Match = new Match { Keyword = ownerId.Value }
                }
            }
        };
        
        if (filters?.DocumentIds?.Count > 0)
        {
            mustConditions.Add(new Condition
            {
                Field = new FieldCondition
                {
                    Key = "document_id",
                    Match = new Match
                    {
                        Keywords = new RepeatedStrings // gRPC field for "match any of these values"
                        {
                            Strings = { filters.DocumentIds.Select(d => d.Value) }
                        }
                    }
                }
            });
        }
        
        if (filters?.Tags?.Count > 0)
        {
            mustConditions.Add(new Condition
            {
                Field = new FieldCondition
                {
                    Key = "tags",
                    Match = new Match
                    {
                        Keywords = new RepeatedStrings // gRPC field for "match any of these values"
                        {
                            Strings = { filters.Tags }
                        }
                    }
                }
            });
        }
        
        var results = await _client.SearchAsync(
            collectionName,
            queryVector,
            filter: new Filter { Must = { mustConditions } },
            limit: (ulong)limit,
            scoreThreshold: scoreThreshold,
            cancellationToken: ct);
        
        return results.Select(r => new SearchResult
        {
            Id = r.Id.Uuid,
            DocumentId = CustomId.From(r.Payload["document_id"].StringValue),
            ChunkIndex = (int)r.Payload["chunk_index"].IntegerValue,
            Content = r.Payload["content"].StringValue,
            Score = r.Score
        }).ToList();
    }
    
    public async Task DeleteByDocumentAsync(
        CustomId documentId,
        CustomId ownerId,
        CancellationToken ct = default)
    {
        var collectionName = GetCollectionName();
        
        await _client.DeleteAsync(
            collectionName,
            new Filter
            {
                Must =
                {
                    new Condition
                    {
                        Field = new FieldCondition
                        {
                            Key = "owner_id",
                            Match = new Match { Keyword = ownerId.Value }
                        }
                    },
                    new Condition
                    {
                        Field = new FieldCondition
                        {
                            Key = "document_id",
                            Match = new Match { Keyword = documentId.Value }
                        }
                    }
                }
            },
            cancellationToken: ct);
        
        _logger.LogInformation(
            "Deleted vectors for document {DocumentId}",
            documentId);
    }
    
    public async Task<long> CountByOwnerAsync(
        CustomId ownerId,
        CancellationToken ct = default)
    {
        var collectionName = GetCollectionName();
        
        var result = await _client.CountAsync(
            collectionName,
            filter: new Filter
            {
                Must =
                {
                    new Condition
                    {
                        Field = new FieldCondition
                        {
                            Key = "owner_id",
                            Match = new Match { Keyword = ownerId.Value }
                        }
                    }
                }
            },
            cancellationToken: ct);
        
        return (long)result;
    }
    
    private string GetCollectionName()
    {
        var env = _config["Environment"] ?? "dev";
        return $"{env}-documents";
    }
}

public sealed record VectorPoint
{
    public required Guid Id { get; init; }
    public required string OwnerId { get; init; }
    public required string DocumentId { get; init; }
    public required int ChunkIndex { get; init; }
    public required string Content { get; init; }
    public required float[] Vector { get; init; }
    public IReadOnlyList<string>? Tags { get; init; }
    public DateTimeOffset CreatedAt { get; init; } = DateTimeOffset.UtcNow;
}

public sealed record SearchFilters
{
    public IReadOnlyList<CustomId>? DocumentIds { get; init; }
    public IReadOnlyList<string>? Tags { get; init; }
    public DateTimeOffset? CreatedAfter { get; init; }
    public DateTimeOffset? CreatedBefore { get; init; }
}

public sealed record SearchResult
{
    public required string Id { get; init; }
    public required CustomId DocumentId { get; init; }
    public required int ChunkIndex { get; init; }
    public required string Content { get; init; }
    public required float Score { get; init; }
}
[Qdrant Filtering Documentation] — Qdrant , 2024
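
The filter built in SearchAsync combines two kinds of logic, which the sketch below imitates in memory: every condition under Must has to hold (AND), while an any-of match on an array field such as tags holds as soon as one value overlaps (OR). The sample points and the Matches helper are hypothetical; real filtering happens inside Qdrant against the payload indexes.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory points carrying the same payload fields as the collection.
var points = new (string OwnerId, string DocumentId, string[] Tags)[]
{
    ("user-1", "doc-a", new[] { "hr", "policy" }),
    ("user-1", "doc-b", new[] { "finance" }),
    ("user-2", "doc-c", new[] { "hr" }),
};

// All "must" conditions are ANDed. The tag condition is an any-of match:
// it passes when the point shares at least one tag with the requested list.
// An empty tag list means "no tag condition", as in SearchAsync.
static bool Matches(
    (string OwnerId, string DocumentId, string[] Tags) point,
    string ownerId,
    string[] tags) =>
    point.OwnerId == ownerId
    && (tags.Length == 0 || point.Tags.Intersect(tags).Any());

var hits = points
    .Where(p => Matches(p, "user-1", new[] { "hr", "legal" }))
    .Select(p => p.DocumentId)
    .ToList();

Console.WriteLine(string.Join(", ", hits)); // prints "doc-a"
```

The owner condition is what keeps tenants apart: doc-c carries a matching tag but belongs to another owner, so it is never returned.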

Health Checks

// Infrastructure/HealthChecks/QdrantHealthCheck.cs
public sealed class QdrantHealthCheck : IHealthCheck
{
    private readonly QdrantClient _client;
    private readonly ICollectionManager _collections;
    
    public QdrantHealthCheck(
        QdrantClient client,
        ICollectionManager collections)
    {
        _client = client;
        _collections = collections;
    }
    
    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            var info = await _collections.GetCollectionInfoAsync(cancellationToken);
            
            return HealthCheckResult.Healthy(
                $"Qdrant healthy: {info.VectorCount} vectors in {info.Name}");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy(
                "Qdrant health check failed",
                exception: ex);
        }
    }
}
[gRPC Performance Best Practices] — gRPC Authors , 2024

Summary

Deploying Qdrant on Kubernetes has been a journey of hard-won lessons. From the initial emptyDir disaster to the headless service debugging session, each production incident taught me something about the intersection of distributed databases and container orchestration. The system we ended up with — a 3-node StatefulSet with proper PVCs, on-disk payloads, payload indexes, and gRPC communication — handles our 500K+ vector workload reliably with sub-50ms query latency.

If I were starting this deployment from scratch, the three things I would get right on day one are: (1) StatefulSet with PersistentVolumeClaims, not Deployments with emptyDir; (2) headless service for peer discovery, separate from the ClusterIP service for application traffic; and (3) payload indexes on every field you plan to filter by in production.

Feature     | Configuration
Storage     | On-disk payload for memory efficiency
Filtering   | Payload indexes on owner_id, document_id
Clustering  | 3-node cluster for HA
Distance    | Cosine similarity for semantic search
Protocol    | gRPC (6334) for production, REST (6333) for debugging

Combined with proper collection management and filtering, Qdrant enables efficient multi-tenant semantic search at scale.

Further Reading

  • [Qdrant Documentation], Qdrant, 2024
  • [Billion-scale similarity search with GPUs (FAISS)], Johnson, J., Douze, M. & Jegou, H., 2019
  • [Qdrant Filtering Documentation], Qdrant, 2024