Deployment Guide¶
Step-by-step guide to bootstrap the cluster from bare metal to fully running.
What is manual vs automated?¶
YOU do: AUTOMATED:
─────────────────────────────────── ────────────────────────────────────────
1. Boot nodes from Ventoy USB → NixOS live environment
2. Run smart-deploy.sh per node → Hardware discovered (MAC + disk)
→ SSH host key generated
→ Age key derived + added to .sops.yaml
→ Secrets re-encrypted for the node
→ NixOS installed + K3s started
3. Run bootstrap-argocd.sh → ArgoCD installed
→ App of Apps synced
→ Kyverno, MetalLB, Traefik, cert-manager,
Longhorn, all apps deployed in order
Phase 0 — Workstation setup (once)¶
# Clone repo and enter dev shell (provides all tools)
git clone https://github.com/Yasuke2000/Homelab.git
cd Homelab
nix develop
# Now you have: kubectl, helm, sops, age, ssh-to-age, nixos-anywhere, k9s
Phase 1 — Provision nodes¶
Boot each HP EliteDesk from the Ventoy USB (see Node Provisioning for USB setup).
Once the NixOS installer is running and you have the DHCP IP, run:
# Node 1 — bootstraps the etcd cluster (always first)
bash scripts/smart-deploy.sh 192.168.1.50 node1 server-init
# Node 2 — joins the cluster
bash scripts/smart-deploy.sh 192.168.1.51 node2 server-join
# Node 3 — joins the cluster
bash scripts/smart-deploy.sh 192.168.1.52 node3 server-join
Each run automatically:
- SSHes to the temporary IP and detects the MAC address and primary disk
- Patches
hosts/<node>/default.nixwith real hardware values - Generates an SSH host key and derives the node age key
- Adds the node age key to
.sops.yamland re-encrypts all secrets - Commits the changes to git
- Deploys NixOS via nixos-anywhere (wipes disk, fully unattended)
- Waits for the node to come up on its static IP
- Verifies K3s cluster membership
Verify after each deploy:
# Check all nodes are Ready
ssh root@10.0.20.11 kubectl get nodes
# Expected output after all 3:
# NAME STATUS ROLES AGE
# homelab-node1 Ready control-plane,etcd,master 5m
# homelab-node2 Ready control-plane,etcd,master 3m
# homelab-node3 Ready control-plane,etcd,master 1m
Phase 2 — Bootstrap ArgoCD¶
# Copy kubeconfig from node1
scp root@10.0.20.11:/etc/rancher/k3s/k3s.yaml ./kubeconfig
sed -i 's/127.0.0.1/10.0.20.11/' kubeconfig
export KUBECONFIG=$(pwd)/kubeconfig
# Bootstrap
bash scripts/bootstrap-argocd.sh
ArgoCD then syncs everything automatically in order (sync-waves):
wave -5 → Kyverno installed
wave -4 → MetalLB + Kyverno Longhorn PATH fix loaded
wave -3 → Traefik installed → gets IP 10.0.20.100 from MetalLB
wave -2 → cert-manager installed
wave -1 → Longhorn installed (works correctly with Kyverno fix in place)
wave 0 → All apps: Vaultwarden, Jellyfin, Ghost, Silverbullet, ...
Monitor progress:
kubectl port-forward svc/argocd-server -n argocd 8080:443
# Open: https://localhost:8080
# Username: admin | Password: printed by bootstrap-argocd.sh
Phase 3 — TrueNAS NFS setup¶
Before Longhorn and media apps are fully functional, configure NFS on the TrueNAS node (10.0.20.14):
- Open TrueNAS web UI at
http://10.0.20.14 - Create datasets:
longhorn-backup,media,roms - Enable NFS shares for each with network
10.0.20.0/24
Phase 4 — Switch TLS to production¶
Once all apps are running with staging certificates:
# Verify staging certs are all issued
kubectl get certificates -A
# Switch everything to production (after verifying staging works)
grep -rl "letsencrypt-staging" apps/ | xargs sed -i 's/letsencrypt-staging/letsencrypt-prod/g'
git add apps/ && git commit -m "feat: switch to letsencrypt-prod" && git push
See Staging to Production for the full checklist.
Phase summary¶
| Phase | You run | Result |
|---|---|---|
| 1a | smart-deploy.sh <ip> node1 server-init |
NixOS + K3s cluster initialized |
| 1b | smart-deploy.sh <ip> node2/3 server-join |
HA cluster with 3 nodes |
| 2 | bootstrap-argocd.sh |
All 11 apps deployed via GitOps |
| 3 | TrueNAS web UI | NFS storage available |
| 4 | One sed + git push | Production TLS everywhere |