Optimizing Kubernetes Images for Homelab Resources
Running a complex microservices stack on limited hardware. How we use .NET 10 Chiseled Ubuntu images and Native AOT to slash memory usage.
When I first configured my k3s cluster to run the full BlueRobin stack — API gateway, three worker services, PostgreSQL, NATS, and a monitoring suite — I ran straight into OOMKilled events within the first hour. My nodes had 8 GB of RAM each and the default .NET images were consuming over 250 MB per pod before the application even processed its first request. I spent that first weekend frantically reading container runtime metrics and wondering why kubectl top pods showed memory climbing relentlessly. The breakthrough came when I discovered Microsoft’s Chiseled Ubuntu images and Native AOT compilation, which together slashed my per-pod footprint by nearly 90%. This article captures that entire optimization journey.
Introduction
Our system runs on a modest Kubernetes cluster (K3s). We don’t have infinite cloud RAM. When running 15+ pods (API, Workers, Databases, Monitoring), every megabyte counts.
Standard .NET container images are “safe” but bloated. They contain shells, package managers, and binaries we never use.
Why Optimization Matters:
- Density: Run more services on the same hardware.
- Security: “Chiseled” images have no shell (/bin/sh), minimizing attack surface.
- Startup Time: Native AOT starts in milliseconds, critical for scaling.
What We’ll Build
We will transform our Dockerfile from a standard ~250 MB image to a highly optimized ~25 MB image using multi-stage builds, Chiseled Ubuntu, and Native AOT.
Architecture Overview
We rely on Microsoft’s “Chiseled” images—stripped-down versions of Ubuntu designed solely for running an app, no administration tools included.
flowchart LR
SRC["Source Code"] --> SDK["SDK Build Stage\n(mcr.microsoft.com/dotnet/sdk:10.0)"]
SDK --> A["Chiseled Runtime\n~90 MB image\n~80 MB RAM"]
SDK --> B["Chiseled + Trimmed\n~50 MB image\n~55 MB RAM"]
SDK --> C["Native AOT\n~25 MB image\n~18 MB RAM"]
style SRC fill:#1a2744,stroke:#94a3b8,color:#e2e8f0
style SDK fill:#1a2744,stroke:#94a3b8,color:#e2e8f0
style A fill:#1a2744,stroke:#f59e0b,color:#e2e8f0
style B fill:#1a2744,stroke:#6366f1,color:#e2e8f0
style C fill:#1a2744,stroke:#22c55e,color:#e2e8f0
Section 1: The Multi-Stage Build
We compile in a full SDK container, but publish to a minimal runtime.
# Build Stage
FROM mcr.microsoft.com/dotnet/sdk:10.0 AS build
WORKDIR /src
COPY . .
RUN dotnet publish "src/MyApp.Api" -c Release -o /app/publish /p:UseAppHost=false
# Runtime Stage (Chiseled)
FROM mcr.microsoft.com/dotnet/aspnet:10.0-noble-chiseled
WORKDIR /app
COPY --from=build /app/publish .
ENTRYPOINT ["dotnet", "MyApp.Api.dll"]
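A common refinement, in line with Docker’s own Dockerfile best practices, is to copy the project file and restore packages before copying the rest of the source, so the restore layer is cached between builds. This is a sketch — the project file path is assumed from the publish command above:

```dockerfile
# Build Stage with layer caching: restore only re-runs when the
# project file changes, not on every source edit.
FROM mcr.microsoft.com/dotnet/sdk:10.0 AS build
WORKDIR /src

# Copy only the project file first (path assumed from the publish step above)
COPY src/MyApp.Api/MyApp.Api.csproj src/MyApp.Api/
RUN dotnet restore "src/MyApp.Api/MyApp.Api.csproj"

# Copy the remaining sources; this layer invalidates on code changes,
# but the cached restore layer above is reused.
COPY . .
RUN dotnet publish "src/MyApp.Api" -c Release -o /app/publish /p:UseAppHost=false --no-restore
```

On a homelab box doing frequent rebuilds, this ordering saves far more wall-clock time than it saves bytes.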
This simple change drops the image size from ~250MB (Debian default) to ~90MB.
Section 2: Native AOT (Ahead-of-Time)
For our Worker services (which process queues), we can go further. Native AOT compiles the C# code directly to machine code, removing the need for the JIT compiler and part of the runtime.
[Native AOT Deployment] — Microsoft, 2024-11-12
Project File Changes:
<PropertyGroup>
<PublishAot>true</PublishAot>
<InvariantGlobalization>true</InvariantGlobalization>
</PropertyGroup>
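Two optional MSBuild switches can shrink the native binary further. Both are standard Native AOT publish properties; the savings quoted in our results were measured without them, so treat this as an optional extra rather than part of the baseline:

```xml
<PropertyGroup>
  <!-- Prefer smaller native code over faster code where the compiler has a choice -->
  <OptimizationPreference>Size</OptimizationPreference>
  <!-- Strip native debug symbols from the published binary (Linux) -->
  <StripSymbols>true</StripSymbols>
</PropertyGroup>
```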
Dockerfile for AOT:
# Runtime Stage (Deps only)
FROM mcr.microsoft.com/dotnet/runtime-deps:10.0-noble-chiseled-aot
WORKDIR /app
COPY --from=build /app/publish .
ENTRYPOINT ["./MyApp.Workers"]
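The runtime stage above assumes a build stage that already produced a native executable. Native AOT needs a platform C toolchain in the SDK image — the IL compiler shells out to it for linking — so the build stage must install one. A sketch, with the worker project path assumed:

```dockerfile
# AOT Build Stage: the SDK image alone is not enough for Native AOT --
# the IL compiler invokes the platform C toolchain to link the binary.
FROM mcr.microsoft.com/dotnet/sdk:10.0 AS build
RUN apt-get update && apt-get install -y --no-install-recommends clang zlib1g-dev
WORKDIR /src
COPY . .
# PublishAot is read from the project file; output is a single native executable
RUN dotnet publish "src/MyApp.Workers" -c Release -o /app/publish
```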
Result:
- Image Size: ~25MB
- Startup Time: 15ms
- Memory Footprint: ~18MB RAM (idle)
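An idle footprint this small only pays off if the pod spec reflects it — tighter requests are what actually buy density on the cluster. The names and numbers below are illustrative, not taken from our manifests; leave headroom above the ~18 MB idle figure for traffic spikes:

```yaml
# Illustrative resource settings for an AOT worker pod (hypothetical names).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-workers
spec:
  template:
    spec:
      containers:
        - name: worker
          image: registry.local/myapp-workers:latest  # hypothetical registry
          resources:
            requests:
              memory: "32Mi"
              cpu: "50m"
            limits:
              memory: "64Mi"
```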
Section 3: Tree Shaking
Even without AOT, the .NET 10 SDK can perform “trimming” (Tree Shaking) during publish when it is enabled. It analyzes your code and removes unused classes from the System libraries.
[Trim self-contained deployments and executables] — Microsoft, 2024-11-12
We ensure this is enabled in our Directory.Build.props:
<PropertyGroup>
<PublishTrimmed>true</PublishTrimmed>
<TrimMode>partial</TrimMode>
</PropertyGroup>
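Partial trim mode only trims assemblies that opt in, and code reached purely via reflection can still break at runtime. The standard MSBuild trimming switches below surface those cases at build time and let you exempt an assembly entirely; the assembly name is illustrative:

```xml
<PropertyGroup>
  <!-- Report individual trim-analysis warnings instead of one summary per assembly -->
  <TrimmerSingleWarn>false</TrimmerSingleWarn>
  <!-- Run the trim-compatibility analyzer during normal builds too -->
  <EnableTrimAnalyzer>true</EnableTrimAnalyzer>
</PropertyGroup>
<ItemGroup>
  <!-- Keep an assembly whole if it is only reached via reflection (name is illustrative) -->
  <TrimmerRootAssembly Include="MyApp.Plugins" />
</ItemGroup>
```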
Conclusion
By caring about our artifacts, we reduced the total memory footprint of our application layer by 60%. This allows us to allocate more RAM to where it’s actually needed: the Database and Vector Index.
[Best practices for writing Dockerfiles] — Docker, 2024-10-01
The optimization work described here turned my cluster from a fragile, OOMKill-prone system into something genuinely stable. Before these changes, I could barely run 10 pods on a single 8 GB node; after Chiseled images and AOT, I comfortably run 20+ pods with headroom to spare. The most surprising benefit was not the memory savings alone but the startup speed: AOT workers restart in under 20ms, which makes rolling deployments essentially invisible to clients. If you are running a homelab or any resource-constrained Kubernetes environment, image optimization is not a nice-to-have — it is the difference between a cluster that works and one that constantly fights you.
Next Steps
- Verify performance gains with Benchmarking.
- See how this enables faster Contract Testing pipelines.
- Profile memory usage over time with Prometheus and Grafana to validate real-world savings.
- Experiment with Alpine-based images for non-.NET services to achieve similar size reductions.