Stretched Clusters
A stretched kMetal under cluster spans more than one site so tenant control planes stay available when a zone, data center, or region goes offline. Three deployment patterns are supported.
Pattern A — 3-zone distributed under cluster
The three under-cluster CP nodes sit in three separate failure zones, one node per zone (rooms, racks, buildings, or short-haul DC pairs). Each zone also carries worker nodes for tenant VMs.
    Zone 1            Zone 2            Zone 3
    [ CP-1, W-1 ]     [ CP-2, W-2 ]     [ CP-3, W-3 ]
          \                 |                 /
           \                |                /
            '--------- ≤ 10 ms RTT ---------'
- Sites needed: 3 on-prem zones.
- RTT budget: ≤ 10 ms between zones (every etcd write waits for a synchronous Raft quorum round trip).
- Failure tolerance: any single zone can fail; the two remaining CP nodes still form a quorum.
- Where it fits: campus-scale, multi-room data centers, co-located DC pairs with low-latency inter-site links.
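The failure-tolerance claim follows from Raft's majority rule: a cluster of n voting members needs floor(n/2) + 1 of them to commit a write. The sketch below checks that rule against a zone layout. The function names and zone labels are illustrative only and are not part of kMetal.

```python
def quorum(members: int) -> int:
    """Smallest majority of a Raft cluster with `members` voting members."""
    return members // 2 + 1


def survives_zone_loss(members_per_zone: dict[str, int], lost_zone: str) -> bool:
    """True if the zones that remain still hold a majority of voting members."""
    total = sum(members_per_zone.values())
    remaining = total - members_per_zone[lost_zone]
    return remaining >= quorum(total)


# Pattern A layout: one CP (etcd) member per zone. Zone names are illustrative.
pattern_a = {"zone-1": 1, "zone-2": 1, "zone-3": 1}
single_zone_safe = all(survives_zone_loss(pattern_a, z) for z in pattern_a)

# Counter-example: two members in one zone means losing that zone loses quorum.
lopsided = {"zone-1": 2, "zone-2": 1}
lopsided_safe = all(survives_zone_loss(lopsided, z) for z in lopsided)
```

With one member per zone, any single zone loss leaves two of three members, which meets the quorum of two. The lopsided layout fails the check, which is why the three CP nodes are spread across three zones rather than two.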
Pattern B — 2 data zones + cloud arbiter
Two on-prem zones plus a cloud VM as the etcd arbiter.
- Sites needed: 2 on-prem + 1 cloud VM.
- RTT budget: ≤ 10 ms between the two on-prem zones; the cloud arbiter tolerates higher RTT because it carries quorum traffic only.
- Failure tolerance: any one of the three sites (either on-prem zone or the cloud arbiter) can fail.
- Where it fits: organizations with two real data centers but not three. The cloud node hosts no tenant VMs.
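Pattern B applies two different RTT budgets: the 10 ms figure covers only the link between the two on-prem zones, while links to the arbiter get a looser budget that this page does not quantify. A minimal sketch of a pre-flight check under that reading follows. The site names, the measurements, and the 100 ms arbiter budget are placeholders chosen for illustration, not kMetal values.

```python
ONPREM_BUDGET_MS = 10.0  # the on-prem RTT budget stated above


def check_pattern_b_links(rtt_ms: dict[tuple[str, str], float],
                          arbiter: str,
                          arbiter_budget_ms: float) -> list[str]:
    """Return one human-readable violation for every link over its budget."""
    violations = []
    for (a, b), rtt in sorted(rtt_ms.items()):
        # Links touching the arbiter get the looser, caller-supplied budget.
        budget = arbiter_budget_ms if arbiter in (a, b) else ONPREM_BUDGET_MS
        if rtt > budget:
            violations.append(f"{a}<->{b}: {rtt} ms exceeds {budget} ms")
    return violations


# Illustrative measurements; the 100 ms arbiter budget is an assumption.
measured = {
    ("dc-east", "dc-west"): 4.2,
    ("dc-east", "cloud-arbiter"): 38.0,
    ("dc-west", "cloud-arbiter"): 41.5,
}
ok = check_pattern_b_links(measured, arbiter="cloud-arbiter", arbiter_budget_ms=100.0)
bad = check_pattern_b_links({("dc-east", "dc-west"): 14.0}, "cloud-arbiter", 100.0)
```

The arbiter budget is a required argument rather than a default so that the caller has to supply a value appropriate to their own deployment.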
Pattern C — Under-cluster CP on cloud, workers on-prem
The under-cluster CP nodes run in a cloud region. Compute hosts running tenant VMs stay on-prem. Workers connect outbound to the cloud control plane via Konnectivity.
- Sites needed: 1 on-prem site + 1 cloud account.
- RTT budget: WAN-tolerant (worker → CP only; no synchronous Raft across the WAN).
- Firewall: no inbound openings are needed on the on-prem side, because the worker initiates the tunnel.
- Where it fits: edge, branch, or factory deployments where the on-prem footprint is small and the management plane runs centrally in a cloud account.
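The reason no inbound firewall opening is needed can be shown with plain TCP sockets: once the worker has dialed out, the control plane can send requests back over that same connection. The sketch below demonstrates only the connection direction on localhost. It is not Konnectivity's protocol, and the request and reply strings are made up for the example.

```python
import socket
import threading


def control_plane(listener: socket.socket, replies: list) -> None:
    """Cloud side: accept the worker's connection, then send a request over it."""
    conn, _ = listener.accept()
    with conn:
        # CP -> worker traffic travels over the connection the worker opened.
        conn.sendall(b"GET /healthz\n")
        replies.append(conn.recv(1024).decode().strip())


def worker(cp_addr) -> None:
    """On-prem side: dial out, then answer requests arriving on that connection."""
    # Outbound only: the worker never opens a listening socket.
    with socket.create_connection(cp_addr) as tunnel:
        request = tunnel.recv(1024).decode().strip()
        if request == "GET /healthz":
            tunnel.sendall(b"ok\n")


listener = socket.create_server(("127.0.0.1", 0))  # port 0: the OS picks a free port
replies = []
cp = threading.Thread(target=control_plane, args=(listener, replies))
cp.start()
worker(listener.getsockname())
cp.join()
listener.close()
```

Only the cloud side listens. The on-prem side makes a single outbound connection, which is the property that lets the site firewall stay closed to inbound traffic.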
Picking a pattern
| Concern | Pattern A | Pattern B | Pattern C |
|---|---|---|---|
| Sites | 3 | 2 + cloud VM | 1 + cloud |
| RTT requirement (worst link) | ≤ 10 ms | ≤ 10 ms between on-prem zones | WAN-tolerant |
| Tenant data plane crosses WAN? | No | No | No |
| Fit | Metro / campus | Two-DC orgs | Edge / branch |
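
The table can be read as a small decision procedure. The sketch below encodes one such reading: prefer Pattern A, then B, then C. That ordering, the function name, and the fallback to Pattern C whenever a cloud account is available are assumptions made for illustration, not product rules.

```python
from typing import Optional


def pick_pattern(onprem_sites: int,
                 inter_site_rtt_ms: Optional[float],
                 cloud_available: bool) -> Optional[str]:
    """Map site count, worst on-prem RTT, and cloud access to pattern A, B, or C.

    `inter_site_rtt_ms` is None when there is only one on-prem site.
    Returns None when no pattern fits the inputs.
    """
    low_latency = inter_site_rtt_ms is not None and inter_site_rtt_ms <= 10.0
    if onprem_sites >= 3 and low_latency:
        return "A"  # three zones within the 10 ms budget
    if onprem_sites >= 2 and low_latency and cloud_available:
        return "B"  # two zones plus a cloud arbiter
    if onprem_sites >= 1 and cloud_available:
        return "C"  # control plane in the cloud, workers on-prem
    return None
```

For example, `pick_pattern(2, 8.0, True)` returns `"B"`, while the same two sites without any cloud access return `None` because neither an arbiter nor a cloud control plane is possible.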
In all three, the under cluster's etcd is synchronously replicated across its CP nodes: across the participating zones in Patterns A and B, and among the cloud-hosted CP nodes in Pattern C. Committed writes to tenant control planes therefore survive the loss of a single site with zero RPO. RTO is bounded by Raft re-election plus pod rescheduling.