Sovereign Bare-Metal Homelab¶
Production-grade Kubernetes on NixOS — fully declarative, GitOps-driven, zero-trust secured. Every node, every certificate, every secret, and every deployment is managed entirely through code.
Architecture¶
graph TD
User((User)) -->|HTTPS| CF[Cloudflare DNS + Tunnel]
CF -->|Encrypted tunnel| Traefik
Repo[(GitHub — NixOS configs · K8s manifests · encrypted secrets)] -->|reconcile| ArgoCD
subgraph Cluster[K3s HA Cluster — embedded etcd]
direction TB
subgraph Infra[Infrastructure Layer]
direction LR
Traefik[Traefik v3 — Ingress]
ArgoCD[ArgoCD v3 — GitOps]
Longhorn[Longhorn — Storage]
Authelia[Authelia — SSO/2FA]
end
subgraph CP[Control Plane × 3]
direction LR
N1[node1 — etcd leader] --- N2[node2] --- N3[node3]
end
Traefik --> Authelia
Authelia --> CP
ArgoCD -.->|sync| CP
CP --> Apps
Longhorn -.->|volumes| Apps
subgraph Apps[30+ Self-hosted Services]
direction LR
Vault[Vaultwarden]
Ghost[Ghost]
Jelly[Jellyfin]
More[+ 27 more]
end
end
NAS[Synology NAS — NFS Backup + Media] -.->|backup target| Longhorn
What Makes This Different¶
Infrastructure as Code¶
Entire OS + cluster defined in NixOS flakes — zero manual configuration, full rollback on any change
Self-Healing GitOps¶
ArgoCD auto-syncs 30+ apps from git. Manual changes get reverted automatically — cluster always matches code
Zero-Trust Security¶
Authelia SSO/2FA on every service, Cloudflare Tunnel (no open ports), sops-encrypted secrets in git
Full Observability¶
Prometheus + Grafana + Alertmanager + Loki: metrics and logs from every node, with alerts routed to Discord
Disaster Recovery¶
Automated etcd snapshots, nightly DB backups to NAS, documented recovery runbooks
78–90% Cost Savings¶
The full platform runs for ~€17/month versus €80–200/month for a cloud equivalent; the hardware pays for itself in 6–12 months
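As a rough check of the savings figure, the sketch below recomputes the percentage from the monthly costs quoted above. The hardware cost used for the break-even estimate is a hypothetical placeholder, since the real figure is not stated on this page.

```python
def savings_percent(self_hosted: float, cloud: float) -> float:
    """Percentage saved per month by self-hosting instead of paying `cloud`."""
    return (cloud - self_hosted) / cloud * 100


def break_even_months(hardware_cost: float, self_hosted: float, cloud: float) -> float:
    """Months until the monthly saving has repaid a one-off hardware cost."""
    return hardware_cost / (cloud - self_hosted)


SELF_HOSTED = 17.0                   # ~EUR/month, quoted above
CLOUD_LOW, CLOUD_HIGH = 80.0, 200.0  # EUR/month, quoted above
HARDWARE_COST = 630.0                # hypothetical value, NOT from this README

low = savings_percent(SELF_HOSTED, CLOUD_LOW)
high = savings_percent(SELF_HOSTED, CLOUD_HIGH)
print(f"savings: {low:.1f}% to {high:.1f}%")

months = break_even_months(HARDWARE_COST, SELF_HOSTED, CLOUD_LOW)
print(f"break-even against the cheaper cloud option: {months:.1f} months")
```

With the quoted costs this gives roughly 79% to 92%, close to the headline range; the exact break-even depends on the actual hardware spend.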
Tech Stack¶
| Layer | Technology | Purpose |
|---|---|---|
| OS | NixOS | Declarative, reproducible — entire OS defined in code |
| Cluster | K3s (embedded etcd) | Lightweight HA Kubernetes, no external etcd |
| GitOps | ArgoCD v3 | Self-healing — cluster always matches git |
| Auth | Authelia | SSO/2FA gateway — TOTP + WebAuthn via Traefik ForwardAuth |
| Secrets | sops-nix + age | Secrets encrypted in git, zero secret server |
| Ingress | Traefik v3 | Dynamic routing via Kubernetes CRDs |
| Edge | Cloudflare Tunnel | Zero exposed ports — encrypted tunnel ingress |
| LB | MetalLB | Bare-metal LoadBalancer IPs via L2 |
| Storage | Longhorn + NFS | Replicated block storage + NAS for media/backups |
| TLS | cert-manager | Automatic wildcard certs via DNS-01 |
| Policy | Kyverno | Mutation policies (NixOS PATH fix for Longhorn) |
| Metrics | kube-prometheus-stack | Prometheus + Grafana + Alertmanager |
| Logs | Loki + Promtail | Centralized log aggregation (7-day retention) |
| Alerts | Alertmanager + Apprise | Multi-channel alerts (Discord, extensible) |
| Uptime | Uptime Kuma | HTTP health checks + public status page |
| VPN | Gluetun | WireGuard VPN tunnel + kill switch for downloads |
Self-hosted Services (30+)¶
Platform & Security¶
| App | Purpose |
|---|---|
| ArgoCD | GitOps deployment controller |
| Traefik | Reverse proxy / ingress + TLS termination |
| Authelia | SSO/2FA gateway for all internal services |
| Cloudflare Tunnel | Encrypted ingress — no open ports |
| Longhorn UI | Distributed storage management |
Monitoring & Observability¶
| App | Purpose |
|---|---|
| Grafana | Metrics dashboards + log exploration |
| Prometheus | Metrics collection (nodes, pods, etcd, apps) |
| Alertmanager + Apprise | Alert routing to Discord |
| Loki + Promtail | Centralized log aggregation |
| Uptime Kuma | HTTP monitoring + public status page |
Media Automation¶
| App | Purpose |
|---|---|
| Jellyfin | Media server (movies, TV, music) |
| Jellyseerr | User-facing media request portal |
| Radarr | Automated movie management |
| Sonarr | Automated TV show management |
| Bazarr | Automated subtitle downloads |
| Prowlarr | Indexer management for *arr stack |
| Jackett | Indexer proxy for Cloudflare-protected sources |
| qBittorrent + Gluetun | VPN-routed downloads (WireGuard + kill switch) |
| Recyclarr | TRaSH quality profile sync (CronJob) |
| SuggestArr | Netflix-style auto-discovery from trending |
Productivity & Personal¶
| App | Purpose |
|---|---|
| Ghost | Personal blog / publishing platform |
| Vaultwarden | Password manager (Bitwarden-compatible) |
| Actual Budget | Personal finance / budgeting |
| Obsidian LiveSync | Offline-first notes with real-time sync |
| Jelu | Book library with ISBN barcode scanning |
| SilverBullet | Personal knowledge base |
| Homepage | Unified dashboard for all services |
Gaming & Entertainment¶
| App | Purpose |
|---|---|
| Pelican | Game server management panel |
| RoMM | ROM / retro game library manager |
Data Protection¶
| Service | Schedule | Target |
|---|---|---|
| DB Backups (Ghost, RoMM, Pelican) | Nightly 02:00 UTC | Synology NAS (7-day retention) |
| Longhorn Snapshots | Continuous | Cross-node 2-replica replication |
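The 7-day retention on the NAS can be illustrated with a small pruning sketch. This is not the repository's backup job; the file names and the `expired_backups` helper are hypothetical and only show how a retention window selects backups for deletion.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7)  # 7-day retention, as in the table above


def expired_backups(backups: dict[str, datetime], now: datetime) -> list[str]:
    """Return names of backups older than the retention window, oldest first."""
    cutoff = now - RETENTION
    old = [(taken, name) for name, taken in backups.items() if taken < cutoff]
    return [name for taken, name in sorted(old)]


# Hypothetical nightly dumps taken at 02:00 UTC.
now = datetime(2025, 1, 10, 2, 0, tzinfo=timezone.utc)
backups = {
    "ghost-2025-01-01.sql.gz": datetime(2025, 1, 1, 2, 0, tzinfo=timezone.utc),
    "ghost-2025-01-03.sql.gz": datetime(2025, 1, 3, 2, 0, tzinfo=timezone.utc),
    "ghost-2025-01-09.sql.gz": datetime(2025, 1, 9, 2, 0, tzinfo=timezone.utc),
}
print(expired_backups(backups, now))  # ['ghost-2025-01-01.sql.gz']
```

A backup taken exactly seven days ago sits on the cutoff and is kept, so the window always holds at least seven nightly dumps.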
Engineering Highlights¶
Fully declarative infrastructure¶
Every node is defined in a NixOS flake. Provisioning a new bare-metal machine is a single command — the script discovers hardware, generates cryptographic keys, encrypts secrets for that specific node, installs NixOS, and verifies cluster membership, all without touching the machine manually.
Secrets encrypted in git¶
All secrets (database passwords, API tokens, TLS credentials) are encrypted with age using sops. They live in the repo as encrypted blobs that can be decrypted only by holders of the private keys whose age recipients are listed in .sops.yaml. There is no secret server to run, no secrets to inject through CI environment variables, and plaintext never has to be committed.
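One way to confirm that only ciphertext reaches the repo is to walk a parsed secrets file and flag any value that lacks the `ENC[AES256_GCM,...]` wrapper sops puts around encrypted values. The sketch below assumes the file has already been parsed into nested dicts and lists; `plaintext_paths` is a hypothetical helper, not the repository's actual check.

```python
import re

# sops stores each encrypted leaf as a string such as ENC[AES256_GCM,data:...,type:str]
ENCRYPTED = re.compile(r"^ENC\[AES256_GCM,data:.*\]$", re.DOTALL)


def plaintext_paths(doc, prefix=""):
    """Yield dotted paths of leaf values that do not look sops-encrypted."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            if prefix == "" and key == "sops":
                continue  # top-level sops metadata is not encrypted, by design
            yield from plaintext_paths(value, f"{prefix}{key}.")
    elif isinstance(doc, list):
        for index, value in enumerate(doc):
            yield from plaintext_paths(value, f"{prefix}{index}.")
    elif not (isinstance(doc, str) and ENCRYPTED.match(doc)):
        yield prefix.rstrip(".")


# Made-up sample data: one encrypted value, one plaintext value that slipped through.
secrets = {
    "db": {"password": "ENC[AES256_GCM,data:abc=,iv:def=,tag:ghi=,type:str]"},
    "api_token": "hunter2",
    "sops": {"version": "3.9.0"},
}
print(list(plaintext_paths(secrets)))  # ['api_token']
```

A pattern check like this only catches values that were never encrypted; it does not verify that the ciphertext is valid or that the right recipients can decrypt it.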
Self-healing GitOps¶
ArgoCD monitors the git repo and continuously reconciles the cluster to match it. If someone manually edits a resource, ArgoCD reverts it within minutes. New apps are added by committing an ArgoCD Application manifest for the app's Helm chart; no manual kubectl apply is needed.
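The reconcile idea can be shown with a toy model: compare the desired state from git with the live state, then create, revert, or prune until they match. This is a conceptual sketch with made-up resource names, not ArgoCD's implementation.

```python
def reconcile(desired: dict, live: dict) -> list[str]:
    """Make `live` match `desired` in place and return the actions taken."""
    actions = []
    for name, spec in desired.items():
        if name not in live:
            actions.append(f"create {name}")
        elif live[name] != spec:
            actions.append(f"revert {name}")  # manual drift is overwritten
        else:
            continue  # already in sync
        live[name] = spec
    # Anything running that git no longer declares gets removed.
    for name in sorted(set(live) - set(desired)):
        actions.append(f"prune {name}")
        del live[name]
    return actions


desired = {"ghost": {"replicas": 1}, "jellyfin": {"replicas": 1}}
live = {"ghost": {"replicas": 3}, "debug-pod": {"replicas": 1}}

print(reconcile(desired, live))
# ['revert ghost', 'create jellyfin', 'prune debug-pod']
print(live == desired)  # True
```

In ArgoCD, reverting drift and removing undeclared resources correspond to the `selfHeal` and `prune` options of an automated sync policy.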
Zero-trust access¶
Authelia SSO/2FA sits in front of every internal service via Traefik ForwardAuth. Cloudflare Tunnel means zero ports are open to the internet. All remote admin goes through an encrypted mesh VPN with Ed25519 key-only SSH.
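The ForwardAuth pattern reduces to one rule: the proxy asks the auth service about each request, forwards it to the app only on a 2xx answer, and otherwise returns the auth service's response to the client. The sketch below models that decision with stand-in functions; `fake_authelia`, `app`, and the URL are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)
    body: str = ""


def forward_auth(request_headers: dict, auth_check, backend) -> Response:
    """Ask the auth service first; only a 2xx verdict lets the request through."""
    verdict = auth_check(request_headers)
    if 200 <= verdict.status < 300:
        return backend(request_headers)
    return verdict  # e.g. a redirect to the login portal, passed back unchanged


def fake_authelia(headers: dict) -> Response:
    """Hypothetical stand-in for the real auth endpoint."""
    if headers.get("Cookie") == "session=valid":
        return Response(200, {"Remote-User": "alice"})
    return Response(302, {"Location": "https://auth.example.com/"})


def app(headers: dict) -> Response:
    """Hypothetical protected service."""
    return Response(200, body="hello from the app")


print(forward_auth({"Cookie": "session=valid"}, fake_authelia, app).body)  # hello from the app
print(forward_auth({}, fake_authelia, app).status)  # 302
```

Because the decision happens at the proxy, individual apps need no login code of their own.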
CI/CD Pipeline¶
Every push runs 6 automated checks:
| Check | What it validates |
|---|---|
| nix flake check | All NixOS configurations build without errors |
| yamllint | All YAML manifests pass linting |
| kubeconform | All Kubernetes manifests validate against current K8s API schemas |
| sops-check | No plaintext secrets committed |
| trufflehog | Scans for accidentally committed API keys or credentials |
| line-endings | LF only (CRLF breaks the Nix parser) |
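A line-endings check of this kind can be very small. The sketch below is one possible version, not the pipeline's actual script; the set of file suffixes it scans is an assumption.

```python
from pathlib import Path

# Suffixes to scan are an assumption; adjust to whatever the repo tracks.
TEXT_SUFFIXES = (".nix", ".yaml", ".yml")


def has_crlf(data: bytes) -> bool:
    """True if the byte string contains a Windows-style CRLF line ending."""
    return b"\r\n" in data


def files_with_crlf(root: str) -> list[str]:
    """Return sorted paths under `root` whose contents include CRLF."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in TEXT_SUFFIXES and has_crlf(path.read_bytes()):
            hits.append(str(path))
    return hits


print(has_crlf(b"{ }\r\n"))  # True
print(has_crlf(b"{ }\n"))    # False
```

A CI step could call `files_with_crlf(".")` and fail the build whenever the returned list is not empty.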
Documentation¶
| Guide | What it covers |
|---|---|
| Hardware & Cost | Full bill of materials, running costs, sourcing guide |
| Gotchas | 14 hard-won lessons from building this |
| Fork This Setup | Step-by-step guide to deploy this stack on your own hardware |
Full operational documentation (deployment guides, recovery runbooks, TLS setup) is maintained in the private working repository.