Kubernetes Cluster Setup for Agentic AI Workloads
A practical cluster setup guide for running BlueRobin-style agentic services with reliable data, messaging, and observability foundations.
By Victor Robin
Introduction
Agentic and LLM-heavy systems are infrastructure-heavy: they lean on databases, object storage, vector search, messaging, and model serving all at once. To run them reliably, your cluster needs clear boundaries between app workloads, data services, messaging, and platform controls.
Baseline Topology
- `archives-*` namespaces for API, workers, web, and AI services.
- `data-layer` for Postgres, MinIO, Qdrant, and NATS.
- `ai` for model-serving components like Ollama.
- `monitoring` for observability and telemetry pipelines.
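The layout above can also be written down as data, which makes it easier to generate or audit Namespace manifests. The following is a minimal sketch, not the actual BlueRobin tooling: `archives-staging` is a hypothetical instance of the `archives-*` pattern, and the `tier` label and its values are assumptions made for illustration.

```python
import json

# Namespace -> role, following the baseline topology.
# "archives-staging" is a hypothetical instance of the archives-* pattern,
# and the role names are illustrative assumptions.
TOPOLOGY = {
    "archives-staging": "apps",
    "data-layer": "data",
    "ai": "model-serving",
    "monitoring": "observability",
}


def namespace_manifest(name: str, tier: str) -> dict:
    """Build a plain Kubernetes Namespace object as a dict."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": name, "labels": {"tier": tier}},
    }


def render_all(topology: dict) -> dict:
    """Wrap one Namespace per topology entry in a single v1 List object."""
    items = [namespace_manifest(name, tier) for name, tier in topology.items()]
    return {"apiVersion": "v1", "kind": "List", "items": items}


if __name__ == "__main__":
    # kubectl accepts JSON input, so the standard library is enough here.
    print(json.dumps(render_all(TOPOLOGY), indent=2))
```

In a GitOps setup the output would normally be committed to the repository rather than applied by hand, so that the namespace list stays reviewable.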
Critical Setup Decisions
- Use GitOps (base + overlays) as the only deployment path.
- Keep secrets in ExternalSecrets/Infisical flows.
- Use FQDN service addressing across namespaces.
- Validate health checks and endpoint ports per environment.
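To make the FQDN rule concrete: Kubernetes DNS resolves a Service at `<service>.<namespace>.svc.<cluster-domain>`, where the cluster domain defaults to `cluster.local`. The helper below is a small sketch of building such addresses in one place. The service names (`qdrant`, `nats`) and ports are assumptions based on those tools' defaults, not values taken from the actual cluster.

```python
def service_fqdn(service: str, namespace: str,
                 cluster_domain: str = "cluster.local") -> str:
    """Return the in-cluster DNS name for a Service."""
    if not service or not namespace:
        raise ValueError("service and namespace must be non-empty")
    return f"{service}.{namespace}.svc.{cluster_domain}"


def service_url(service: str, namespace: str, port: int,
                scheme: str = "http") -> str:
    """Return a full URL for a Service, rejecting out-of-range ports."""
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    return f"{scheme}://{service_fqdn(service, namespace)}:{port}"


if __name__ == "__main__":
    # Hypothetical service names; ports are the upstream defaults.
    print(service_url("qdrant", "data-layer", 6333))
    print(service_url("nats", "data-layer", 4222, scheme="nats"))
```

Short names such as `qdrant` only resolve from inside the same namespace, which is why the fully qualified form is the safer default for cross-namespace calls.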
Operational Lessons from Recent Changes
- GraphRAG service deployment required overlay and secret fixes before stabilization.
- Incorrect service ports and OTEL endpoints caused avoidable runtime failures.
- CI integration for new services is required from day one to avoid drift.
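Port and endpoint mistakes like the ones above can often be caught before deployment with a small check in CI. The sketch below is one possible shape for such a check, under stated assumptions: the endpoint names, URLs, and expected ports are hypothetical examples (4317 is the conventional OTLP gRPC port, 4222 the NATS default), and the check is purely static, so it does not replace real health probes.

```python
from urllib.parse import urlsplit

CLUSTER_SUFFIX = ".svc.cluster.local"  # default cluster domain assumed


def check_endpoint(name: str, url: str, expected_port: int) -> list:
    """Return a list of problems found in one configured endpoint URL."""
    problems = []
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.endswith(CLUSTER_SUFFIX):
        problems.append(f"{name}: host '{host}' is not a cluster FQDN")
    try:
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        port = None
    if port != expected_port:
        problems.append(f"{name}: port {port} != expected {expected_port}")
    return problems


def check_all(endpoints: dict, expected_ports: dict) -> list:
    """Check every endpoint and collect all problems."""
    problems = []
    for name, url in endpoints.items():
        problems.extend(check_endpoint(name, url, expected_ports[name]))
    return problems


if __name__ == "__main__":
    # Hypothetical per-environment configuration.
    endpoints = {
        "otel": "http://otel-collector.monitoring:4318",
        "nats": "nats://nats.data-layer.svc.cluster.local:4222",
    }
    expected = {"otel": 4317, "nats": 4222}
    for problem in check_all(endpoints, expected):
        print(problem)
```

With the sample configuration, the `otel` entry is flagged twice (short hostname and unexpected port) while the `nats` entry passes. Running a check like this per overlay is one way to keep new services from drifting.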
Conclusion
Cluster setup is the main determinant of reliability for agentic systems. Start with strong namespace boundaries, declarative deployment, and explicit secrets/runtime wiring.
Related reading:
- /gitops-flux-cd-introduction/
- /graphrag-gitops-kustomize-externalsecrets/