Fork This Setup: Deployment Checklist
This guide is for people adapting this project to their own hardware.
The original homelab is fully deployed and running — 3-node K3s HA cluster, 30+ apps, all managed via GitOps. If you want to run the same stack on your own machines, follow these steps to adapt the repo to your infrastructure.
This checklist covers every action required to go from a fresh fork to a fully running cluster,
in the order they must be performed.
Each item is a single, concrete action you can complete independently.
Phase 0: Prerequisites (workstation)
- 0.1 Install required tools on your workstation:
- 0.2 Fork / push this repo to your own GitHub account and update the repo URL:
- 0.3 Choose your domain (e.g. home.example.com) and replace every daviddelporte.com placeholder:
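The command blocks for steps 0.1 and 0.3 are not reproduced in this checklist, so the following is only a minimal sketch of one way to approach them. The helper names missing_tools and replace_domain are hypothetical, and the tool list in the usage comment is an assumption inferred from commands used elsewhere in this guide, not the project's official list.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of the original repo.

# Print the name of every given tool that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Replace every literal occurrence of OLD with NEW under DIR, skipping .git.
# Assumes NEW contains no "/" or "&" characters (true for plain domain names).
replace_domain() {
  local old="$1" new="$2" dir="${3:-.}"
  local old_re
  # Escape dots so sed matches the old domain literally.
  old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
  grep -rlF --exclude-dir=.git -- "$old" "$dir" | while IFS= read -r file; do
    sed -i.bak "s/${old_re}/${new}/g" "$file" && rm -f "$file.bak"
  done
}

# Example usage (tool list is an assumption; adjust to your setup):
#   missing_tools git nix sops age ssh-to-age kubectl openssl
#   replace_domain daviddelporte.com home.example.com .
```

Review the resulting diff with git before committing, since a blanket replacement can also touch documentation and comments.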
Phase 1: Secrets & Encryption
- 1.1 Generate your workstation age key:
- 1.2 Copy your workstation age public key into .sops.yaml:
- 1.3 Boot each node from the NixOS minimal ISO and collect its age public key (derived with ssh-to-age from the SSH host key created on first boot, which sops-nix uses for decryption):
    # After first boot of each node:
    ssh root@<redacted> "cat /etc/ssh/ssh_host_ed25519_key.pub | ssh-to-age"
    ssh root@<redacted> "cat /etc/ssh/ssh_host_ed25519_key.pub | ssh-to-age"
    ssh root@<redacted> "cat /etc/ssh/ssh_host_ed25519_key.pub | ssh-to-age"
    # Paste each result into .sops.yaml under the matching node comment
- 1.4 Fill in all secret values in secrets/secrets.yaml and encrypt it:
    # Edit the file and replace every REPLACE_WITH_... value:
    sops secrets/secrets.yaml
    # sops opens $EDITOR; fill in real values, save and quit; the file is encrypted automatically
  Required secrets to fill:
  - k3s.token: generate with openssl rand -hex 32
  - vaultwarden.adminToken: generate with openssl rand -base64 48
  - grafana.adminPassword: strong password
  - cloudflare.apiToken: Cloudflare API token (DNS-01 cert-manager)
  - All app DB passwords: generate with openssl rand -hex 16
  Note: Renovate runs as a GitHub App, so no PAT is needed.
- 1.5 Verify the encrypted file looks correct (all values show ENC[...):
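The command for step 1.5 is not shown in this checklist. As a rough sketch of the check it describes, the hypothetical helper below flags any key in the secrets file whose inline value does not start with ENC[. It is a line-based heuristic that assumes simple "key: value" YAML and a top-level sops: metadata block at the end of the file; it does not parse YAML and ignores list items.

```shell
#!/usr/bin/env bash
# Hypothetical helper; a heuristic, not a YAML parser.

# Print "LINE_NUMBER: line" for every scalar value that is not sops-encrypted.
# Empty output means every inline value before the sops: block starts with ENC[.
unencrypted_lines() {
  awk '
    /^sops:/ { exit }                 # stop at the sops metadata block
    /^[[:space:]]*#/ { next }         # skip comments
    /:[[:space:]]*[^[:space:]]/ {     # a key followed by an inline value
      value = $0
      sub(/^[^:]*:[[:space:]]*/, "", value)
      if (value !~ /^ENC\[/) print FNR ": " $0
    }
  ' "$1"
}

# Example usage:
#   unencrypted_lines secrets/secrets.yaml
```

A complementary check is to confirm that your workstation key can decrypt the file, for example with sops -d secrets/secrets.yaml, taking care not to write the plaintext anywhere persistent.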
Phase 2: Hardware Inventory
Note: smart-deploy.sh (used during deployment) automates MAC/disk discovery, NixOS config patching, and age key generation. This manual phase is the alternative if you prefer to collect hardware info separately before deploying.
- 2.1 Boot each node from the NixOS minimal ISO and collect hardware info:
  Record the output for each node.
- 2.2 Update NIC interface names in host configs:
- 2.3 Update MAC addresses in each host config:
- 2.4 Verify disk device name in modules/disk-config.nix (default: /dev/sda):
- 2.5 Add your SSH public key(s) to each host config:
- 2.6 Update the Flannel interface name in modules/k3s-server-init.nix:
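The hardware-collection commands for step 2.1 are not included in this checklist. One possible way to gather the interface names and MAC addresses needed by steps 2.2, 2.3, and 2.6 is sketched below; list_nics is a hypothetical helper that reads the standard Linux sysfs layout and is not taken from the repo or from smart-deploy.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper; reads the Linux sysfs network tree.

# Print "INTERFACE MAC" for every network interface except loopback.
# The optional argument overrides the sysfs root (useful for testing).
list_nics() {
  local root="${1:-/sys/class/net}" dev name
  for dev in "$root"/*; do
    [ -r "$dev/address" ] || continue
    name=${dev##*/}
    [ "$name" = "lo" ] && continue
    printf '%s %s\n' "$name" "$(cat "$dev/address")"
  done
}

# Example usage on a node booted from the installer ISO:
#   list_nics
#   lsblk -d -o NAME,SIZE,MODEL    # whole disks only, to confirm the device name for step 2.4
```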
Phase 3: Node Deployment
Complete Phases 0–2 before deploying any nodes.
- 3.1 Validate the Nix flake evaluates cleanly:
- 3.2 Deploy node1 (cluster-init, etcd leader):
  Wait for node1 to reboot and the K3s API to be reachable:
- 3.3 Copy kubeconfig from node1 to your workstation:
- 3.4 Deploy node2:
- 3.5 Deploy node3:
- 3.6 Verify all 3 nodes are Ready and etcd is healthy:
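The wait and verification commands for steps 3.2 and 3.6 are not reproduced here. The sketch below shows one way they could be scripted, assuming kubectl on your workstation already points at the new cluster. Both function names are hypothetical, and the attempt counts in the usage comment are arbitrary.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of the original repo.

# Re-run a command every INTERVAL seconds until it succeeds.
# Gives up and returns 1 after ATTEMPTS failed tries.
retry_until() {
  local attempts="$1" interval="$2"
  shift 2
  local i=1
  while ! "$@" >/dev/null 2>&1; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$interval"
  done
}

# Count nodes whose STATUS column is exactly "Ready".
ready_nodes() {
  kubectl get nodes --no-headers | awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Example usage:
#   retry_until 60 5 kubectl get --raw /readyz   # wait up to ~5 minutes for the API
#   [ "$(ready_nodes)" -eq 3 ] && echo "all nodes Ready"
```

This only covers node readiness; it does not check etcd health, which step 3.6 also asks for.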
Phase 4: Bootstrap ArgoCD
- 4.1 Run the ArgoCD bootstrap script:
- 4.2 Apply the app-of-apps root Application:
- 4.3 Monitor the ArgoCD sync waves in order:
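The monitoring command for step 4.3 is not shown in this checklist. As a minimal sketch, the hypothetical helper below builds on the kubectl -n argocd get applications command from the Quick Reference and prints only the Applications that have not yet settled. It assumes ArgoCD is installed in the argocd namespace, as that command implies.

```shell
#!/usr/bin/env bash
# Hypothetical helper; not part of the original repo.

# Print "NAME SYNC HEALTH" for Applications that are not both Synced and Healthy.
# Empty output means every Application has converged.
unsynced_apps() {
  kubectl -n argocd get applications --no-headers \
    -o custom-columns=NAME:.metadata.name,SYNC:.status.sync.status,HEALTH:.status.health.status |
    awk '$2 != "Synced" || $3 != "Healthy"'
}

# Example usage: poll until nothing is left.
#   while [ -n "$(unsynced_apps)" ]; do unsynced_apps; sleep 15; done
```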
Phase 5: TrueNAS Storage
Required before Longhorn backups and RomM work.
- 5.1 On TrueNAS SCALE, create an NFS dataset for Longhorn backups:
  - Dataset: datapool/longhorn-backup
  - NFS share path: /mnt/datapool/longhorn-backup
  - Allow hosts: <redacted>, <redacted>, <redacted>
- 5.2 Enable Longhorn backup target in apps/longhorn/application.yaml:
- 5.3 For RomM, create an NFS dataset for ROM files:
  - Dataset: datapool/roms
  - NFS share path: /mnt/datapool/roms
Phase 6: DNS & TLS
- 6.1 Create a wildcard DNS record pointing to Traefik's LoadBalancer IP (<redacted> by default):
  Alternatively, add individual A records for each service.
- 6.2 Verify Let's Encrypt staging certificates issue correctly (no rate-limit risk):
  If using staging certificates, your browser will show an untrusted cert warning; this is normal.
- 6.3 Switch to Let's Encrypt production issuer once staging works:
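The verification command for step 6.2 is not reproduced here. A minimal sketch, building on the kubectl get certificate -A command from the Quick Reference, is shown below. The helper name pending_certs is hypothetical, and the column positions assume cert-manager's default output with -A (NAMESPACE, NAME, READY, SECRET, AGE).

```shell
#!/usr/bin/env bash
# Hypothetical helper; assumes cert-manager's default printer columns.

# Print "namespace/name" for every Certificate whose READY column is not "True".
# Empty output means all certificates have been issued.
pending_certs() {
  kubectl get certificate -A --no-headers | awk '$3 != "True" { print $1 "/" $2 }'
}

# Example usage:
#   pending_certs
#   kubectl -n <namespace> describe certificate <name>   # inspect one that is stuck
```

Running this once against the staging issuer and again after switching to production gives a quick before/after comparison.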
Phase 7: Alertmanager
- 7.1 Add an Alertmanager receiver to apps/monitoring/application.yaml. Example using a Discord webhook:
    alertmanager:
      config:
        global:
          resolve_timeout: 5m
        route:
          group_by: ['alertname', 'namespace']
          group_wait: 30s
          group_interval: 5m
          repeat_interval: 4h
          receiver: discord
        receivers:
          - name: discord
            discord_configs:
              - webhook_url: https://discord.com/api/webhooks/YOUR_WEBHOOK_ID/TOKEN
                title: '{{ .GroupLabels.alertname }}'
                message: '{{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'
- 7.2 Commit the change and verify Alertmanager is correctly configured:
Phase 8: Post-Deployment Hardening
- 8.1 Disable Vaultwarden admin token after initial setup:
- 8.2 Restrict Actual Budget to the internal network only. Verify that the Traefik middleware actual-budget-ipallow exists (or create it):
- 8.3 Enable Longhorn recurring snapshots via the Longhorn UI or a RecurringJob CRD:
- 8.4 Configure Renovate so it can open automatic dependency update PRs. When Renovate runs as a GitHub App (see the note in step 1.4), no PAT is needed; add a GitHub PAT as RENOVATE_TOKEN in GitHub Actions secrets only if you run Renovate from GitHub Actions instead.
- 8.5 Set up Uptime Kuma monitors for all services via its web UI at status.example.com after deployment.
- 8.6 Review and tighten K8s RBAC. The repo currently uses the default ArgoCD project (unrestricted). Consider creating per-team ArgoCD projects with resource restrictions.
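For step 8.3, the manifest itself is not included in this checklist. The sketch below prints an illustrative RecurringJob so you can review it before applying. The function name, job name, schedule, and retention count are all assumptions, and the longhorn.io/v1beta2 API version should be checked against the Longhorn release you actually run.

```shell
#!/usr/bin/env bash
# Hypothetical helper; values are illustrative defaults, not the repo's settings.

# Print a Longhorn RecurringJob manifest for periodic snapshots.
# Arguments: NAME (default daily-snapshot), CRON (default 03:00 daily), RETAIN (default 7).
recurring_snapshot_manifest() {
  local name="${1:-daily-snapshot}" cron="${2:-0 3 * * *}" retain="${3:-7}"
  cat <<EOF
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: ${name}
  namespace: longhorn-system
spec:
  cron: "${cron}"
  task: snapshot
  groups:
    - default
  retain: ${retain}
  concurrency: 1
EOF
}

# Example usage:
#   recurring_snapshot_manifest                  # review the output first
#   recurring_snapshot_manifest | kubectl apply -f -
```

Because this cluster is managed via GitOps, committing the rendered manifest to the repo is likely a better fit than applying it by hand, since ArgoCD would then track it alongside the other apps.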
Quick Reference: Status Check Commands
# Cluster overview
kubectl get nodes -o wide
kubectl get pods -A | grep -v Running | grep -v Completed
# ArgoCD sync status
kubectl -n argocd get applications
# Certificate status
kubectl get certificate -A
# Longhorn volumes
kubectl -n longhorn-system get volumes
# Alerts firing
kubectl -n monitoring exec -it \
$(kubectl -n monitoring get pod -l app.kubernetes.io/name=alertmanager -o name | head -1) \
-- amtool alert query