Provisioning
From bare hardware to a fully running platform in three phases. Each phase hands off to the next — once the OS is reachable over SSH, a single Ansible run installs K3s and seeds ArgoCD, which then reconciles the entire platform automatically.
flowchart TD
subgraph P1["① OS Install"]
direction LR
DietPi["DietPi\nSBCs · OrangePi"]
Debian["Debian USB\nx86 mini-PCs"]
end
subgraph P2["② Ansible — datahub-local-bootstrap"]
direction LR
K3s["K3s Cluster\nsite.yml"]
Argo["ArgoCD seed\nbootstrap.yml"]
K3s --> Argo
end
subgraph P3["③ ArgoCD — auto-sync"]
direction LR
Secrets["datahub-local-secrets\nSealed Secrets"]
Core["datahub-local-core\nPlatform services"]
Workflows["datahub-local-workflows\nn8n · Airflow · SQLMesh"]
Secrets --> Core --> Workflows
end
DietPi -->|"SSH ready"| K3s
Debian -->|"SSH ready"| K3s
Argo -->|"auto-sync"| Secrets
Phase 1 — OS Installation
Install a base Linux OS on each node. Two supported paths:
DietPi: recommended for OrangePi and other SBC nodes. It offers a minimal footprint, pre-configured SSH, and hardware optimisations out of the box.
- Download the DietPi image for your board from dietpi.com
- Flash to SD card or eMMC with Balena Etcher or dd
- Before first boot, edit dietpi.txt on the boot partition
- Boot the node. DietPi completes first-run setup automatically and enables SSH.
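The dietpi.txt settings themselves are not reproduced here. As a minimal sketch, assuming the boot partition is mounted on your dev machine and that you want to preset a hostname and static IP, a small helper can rewrite individual keys in place. The function name `set_dietpi_key`, the mount point, and the values are hypothetical; the `AUTO_SETUP_*` keys come from the stock dietpi.txt and should be checked against the file shipped with your image.

```shell
# Set KEY=VALUE in a dietpi.txt-style file.
# Replaces the line that starts with "KEY=" if present, otherwise appends it.
# Assumption: values are plain strings without backslashes.
set_dietpi_key() {
  dk_file=$1 dk_key=$2 dk_value=$3
  awk -v k="$dk_key" -v v="$dk_value" '
    index($0, k "=") == 1 { print k "=" v; found = 1; next }
    { print }
    END { if (!found) print k "=" v }
  ' "$dk_file" > "$dk_file.tmp" && mv "$dk_file.tmp" "$dk_file"
}

# Example usage (hypothetical mount point and addresses):
#   BOOT=/media/$USER/boot
#   set_dietpi_key "$BOOT/dietpi.txt" AUTO_SETUP_NET_HOSTNAME node-01
#   set_dietpi_key "$BOOT/dietpi.txt" AUTO_SETUP_NET_USESTATIC 1
#   set_dietpi_key "$BOOT/dietpi.txt" AUTO_SETUP_NET_STATIC_IP 192.168.1.11
```

Writing through a temporary file rather than `sed -i` keeps the helper portable between GNU and BSD userlands.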
Debian: recommended for the CHUWI UBox and CWWK NAS, using a standard Debian netinstall via USB.
- Download Debian netinstall ISO
- Flash to USB with Etcher or dd
- Boot from USB and follow the installer (minimal install, no desktop)
- Enable SSH: systemctl enable --now ssh
After OS install on all nodes:
- Assign static IPs at the OS level (edit /etc/network/interfaces or dietpi-config)
- Confirm SSH access from your dev machine
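One way to confirm SSH access for every node at once is a small loop over their addresses. This is a sketch only: the function name `check_ssh`, the login user, and the IPs in the usage lines are hypothetical, and it assumes key-based login is already set up (`BatchMode=yes` makes ssh fail instead of prompting for a password).

```shell
# Try a no-op command on each host and report the result.
# Returns non-zero if at least one host is unreachable.
check_ssh() {
  cs_user=$1
  shift
  cs_failed=0
  for cs_host in "$@"; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "${cs_user}@${cs_host}" true </dev/null; then
      echo "ok   ${cs_host}"
    else
      echo "FAIL ${cs_host}"
      cs_failed=1
    fi
  done
  return "$cs_failed"
}

# Example usage (hypothetical user and addresses):
#   check_ssh root 192.168.1.11 192.168.1.12 192.168.1.13
```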
Phase 2 — Ansible
The datahub-local-bootstrap repository provisions the cluster in two sequential playbooks:
flowchart LR
Inv["inventory.yml\n(your nodes)"]
Inv --> Site
Inv --> Bootstrap
Site["site.yml\nOS hardening\nK3s install\nkubeconfig"]
Bootstrap["bootstrap.yml\nSealed Secrets controller\nArgoCD install\nArgoCD Applications seed"]
Site -->|"cluster Ready"| Bootstrap
Setup
- Clone the repository on your dev machine
- Create your inventory file from the sample
- Edit inventory.yml: define your nodes, roles, and connection details.
Run
- Ensure Python prerequisites are available on all nodes
- Provision the OS and install K3s with site.yml
- Install ArgoCD and seed the initial Applications with bootstrap.yml
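The two playbook steps have to run in order, and the second only makes sense if the first succeeded. A minimal sketch of that sequencing, assuming both playbooks sit at the repository root and take the inventory through the standard `-i` option. The wrapper name `run_provisioning` is hypothetical, the repository may document extra flags, and the Python prerequisites step is not covered here.

```shell
# Run site.yml, then bootstrap.yml, stopping at the first failure.
# $1: inventory file (defaults to inventory.yml)
run_provisioning() {
  rp_inventory=${1:-inventory.yml}
  # OS hardening, K3s install, kubeconfig
  ansible-playbook -i "$rp_inventory" site.yml || return 1
  # Sealed Secrets controller, ArgoCD install, Applications seed
  ansible-playbook -i "$rp_inventory" bootstrap.yml
}

# Example usage, from the repository root:
#   run_provisioning inventory.yml
```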
After this phase completes, kubectl get nodes shows all nodes Ready and ArgoCD is running and watching its application repositories.
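That end state can be checked from the dev machine rather than by eye. A sketch, assuming the kubeconfig produced by site.yml is the active context and that ArgoCD lives in the `argocd` namespace used elsewhere on this page. The function name `verify_cluster` and the default timeout are hypothetical.

```shell
# Block until every node reports Ready, then list the ArgoCD pods.
# $1: timeout passed to kubectl wait (defaults to 300s)
verify_cluster() {
  kubectl wait --for=condition=Ready nodes --all --timeout="${1:-300s}" || return 1
  kubectl -n argocd get pods
}

# Example usage:
#   verify_cluster 120s
```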
Phase 3 — ArgoCD GitOps
From this point ArgoCD takes over entirely. It reconciles three repositories in dependency order — no manual helm install or kubectl apply needed.
flowchart LR
ArgoCD(["ArgoCD"])
ArgoCD -->|"sync 1"| Secrets
ArgoCD -->|"sync 2 — after secrets healthy"| Core
ArgoCD -->|"sync 3 — after core healthy"| Workflows
Secrets["datahub-local-secrets\n───────────────\nDB passwords\nAPI keys\nTLS certs\nOAuth credentials"]
Core["datahub-local-core\n───────────────\nTraefik · cert-manager\nLonghorn · Prometheus\nLoki · Trino · Airflow\nRedpanda · Superset …"]
Workflows["datahub-local-workflows\n───────────────\nn8n flows\nAirflow DAGs\nSQLMesh models"]
| Stage | Repository | What it deploys |
|---|---|---|
| 1 — Secrets | datahub-local-secrets | Sealed Secrets — credentials needed by all platform services |
| 2 — Core | datahub-local-core | All platform services via Helmfile ApplicationSets |
| 3 — Workflows | datahub-local-workflows | n8n flows, Airflow DAGs, SQLMesh models |
Monitoring the rollout
# watch all ArgoCD applications
kubectl -n argocd get applications -w
# or via the ArgoCD UI (available after core sync):
#   https://argocd.<your-domain>
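For a one-shot check instead of a live watch, the sync and health fields of each Application can be filtered so that only the ones still converging are printed. A sketch, assuming the standard ArgoCD Application status fields `.status.sync.status` and `.status.health.status`. The function name `pending_apps` is hypothetical.

```shell
# Print every ArgoCD Application that is not yet Synced and Healthy.
# Prints nothing once the rollout has fully converged.
pending_apps() {
  kubectl -n argocd get applications \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.sync.status}{"\t"}{.status.health.status}{"\n"}{end}' |
    awk -F '\t' 'NF && ($2 != "Synced" || $3 != "Healthy") { print $1 ": " $2 "/" $3 }'
}

# Example usage: poll until nothing is pending
#   while [ -n "$(pending_apps)" ]; do sleep 10; done
```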