
Platform Validation

After installing kMetal, validate that the under cluster's platform components are healthy before provisioning tenant clusters.

Helm release

helm status kmetal -n kmetal-flux
helm history kmetal -n kmetal-flux

The release status should be deployed. Confirm that the revision and chart version match what you intended to install.
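
The status check above can be scripted. The following is a minimal sketch, assuming the plain-text output of helm status contains a "STATUS:" line; the function names release_status and check_kmetal_release are hypothetical helpers, not part of kMetal or Helm.

```shell
# Hypothetical helper: print the STATUS field reported by `helm status`.
# Usage: release_status <release> <namespace>
release_status() {
  helm status "$1" -n "$2" 2>/dev/null | awk '/^STATUS:/ { print $2; exit }'
}

# Hypothetical helper: succeed only when the kmetal release reports "deployed".
check_kmetal_release() {
  status=$(release_status kmetal kmetal-flux)
  if [ "$status" = "deployed" ]; then
    echo "ok: kmetal release is deployed"
  else
    echo "FAIL: kmetal release status is '${status:-unknown}'" >&2
    return 1
  fi
}
```

Running check_kmetal_release after sourcing these definitions returns a non-zero exit code for any status other than deployed, which makes it usable as a gate in a post-install script.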

Component pods

The chart deploys components into the following namespaces. Every pod in each namespace should be in the Running or Completed state:

kubectl get pods -n kmetal-cert-manager
kubectl get pods -n kube-flannel                # flannel
kubectl get pods -n kube-system                 # kube-ovn + multus
kubectl get pods -n kmetal-metallb
kubectl get pods -n kmetal-kamaji               # kamaji + kamaji-addon-ovn
kubectl get pods -n system-kubevirt
kubectl get pods -n system-cdi
kubectl get pods -n kmetal-capi-providers       # core + kubeadm + CAPK + CACPK
kubectl get pods -n kmetal-local-path-storage

# Anything not Running or Completed (no output means nothing is unhealthy)
kubectl get pods -A --no-headers | grep -Ev 'Running|Completed'
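
A pod can be Running while some of its containers are not yet ready, which a status-only filter misses. The sketch below loops over the namespaces listed above and also compares the READY column. The names KMETAL_NAMESPACES, filter_unhealthy, and check_component_pods are hypothetical, and the column positions assume the default output of kubectl get pods --no-headers within a single namespace.

```shell
# Namespaces from the list above (hypothetical variable name).
KMETAL_NAMESPACES="kmetal-cert-manager kube-flannel kube-system kmetal-metallb kmetal-kamaji system-kubevirt system-cdi kmetal-capi-providers kmetal-local-path-storage"

# Read `kubectl get pods --no-headers` lines on stdin (NAME READY STATUS ...)
# and print every pod that is neither Completed nor Running with all
# containers ready.
filter_unhealthy() {
  awk '{
    split($2, ready, "/")
    if ($3 == "Completed") next
    if ($3 == "Running" && ready[1] == ready[2]) next
    print
  }'
}

# Print "<namespace>: <pod line>" for each unhealthy pod.
# Returns 1 if any unhealthy pod was found. An empty namespace passes silently.
check_component_pods() {
  rc=0
  for ns in $KMETAL_NAMESPACES; do
    bad=$(kubectl get pods -n "$ns" --no-headers 2>/dev/null | filter_unhealthy)
    if [ -n "$bad" ]; then
      echo "$bad" | sed "s|^|$ns: |"
      rc=1
    fi
  done
  return $rc
}
```

Because an empty namespace produces no output, this sketch does not detect a component that was never deployed; the per-namespace commands above remain the way to confirm that pods exist at all.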

MetalLB

# Pool(s) should match the overlay you applied
kubectl get ipaddresspool -n kmetal-metallb
kubectl get l2advertisement -n kmetal-metallb

# Smoke-test allocation
kubectl create service loadbalancer test-lb --tcp=80:80
kubectl get service test-lb -w     # should get EXTERNAL-IP from the pool
kubectl delete service test-lb
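
Watching with -w requires someone to interrupt the command by hand. For unattended runs, a polling loop with a timeout is one alternative. This is a sketch only: wait_for_lb_ip is a hypothetical helper, and it assumes the assigned address is published in .status.loadBalancer.ingress[0].ip of the Service.

```shell
# Hypothetical helper: poll a Service until it has a load balancer IP
# or the timeout expires.
# Usage: wait_for_lb_ip <service> [timeout-seconds]
wait_for_lb_ip() {
  svc=$1
  timeout=${2:-60}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    ip=$(kubectl get service "$svc" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    sleep 2
    elapsed=$((elapsed + 2))
  done
  echo "no external IP for $svc after ${timeout}s" >&2
  return 1
}
```

Calling wait_for_lb_ip test-lb 60 between the create and delete commands prints the allocated address, which you can then compare against the range in your IPAddressPool.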

Default StorageClass

kubectl get storageclass
# Exactly one StorageClass should be marked (default); the chart default is local-path.
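
Having zero or several default StorageClasses can cause PersistentVolumeClaims without an explicit class to behave unexpectedly, so it can be worth asserting the count. The sketch below relies on kubectl appending "(default)" to the name column; the function names are hypothetical.

```shell
# Hypothetical helper: count StorageClasses that kubectl marks "(default)".
default_storageclass_count() {
  # grep -c exits non-zero on zero matches but still prints 0; ignore that status.
  kubectl get storageclass --no-headers 2>/dev/null | grep -c '(default)' || true
}

# Hypothetical helper: succeed only when exactly one default exists.
check_default_storageclass() {
  count=$(default_storageclass_count)
  if [ "$count" -eq 1 ]; then
    echo "ok: exactly one default StorageClass"
  else
    echo "FAIL: found $count default StorageClasses, expected 1" >&2
    return 1
  fi
}
```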

CAPI providers

kubectl get coreproviders,bootstrapproviders,infrastructureproviders,controlplaneproviders -n kmetal-capi-providers
# Expect one of each: core (cluster-api), bootstrap-kubeadm, infrastructure-kubevirt (CAPK), control-plane-kamaji (CACPK)
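
The "one of each" expectation can be checked per provider kind. This sketch counts objects only and does not inspect their readiness conditions; CAPI_PROVIDER_KINDS, count_providers, and check_capi_providers are hypothetical names.

```shell
# Provider kinds from the command above (hypothetical variable name).
CAPI_PROVIDER_KINDS="coreproviders bootstrapproviders infrastructureproviders controlplaneproviders"

# Hypothetical helper: count objects of one provider kind in the namespace.
count_providers() {
  kubectl get "$1" -n kmetal-capi-providers --no-headers 2>/dev/null | wc -l | tr -d ' '
}

# Hypothetical helper: expect exactly one object per provider kind.
check_capi_providers() {
  rc=0
  for kind in $CAPI_PROVIDER_KINDS; do
    n=$(count_providers "$kind")
    if [ "$n" -eq 1 ]; then
      echo "ok: $kind"
    else
      echo "FAIL: expected 1 $kind, found $n" >&2
      rc=1
    fi
  done
  return $rc
}
```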

KubeVirt readiness

kubectl get kubevirt -n system-kubevirt
# .status.phase should be Deployed
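
To read the phase directly rather than scanning the table output, a JSONPath query is one option. This sketch assumes a single KubeVirt resource exists in system-kubevirt; kubevirt_phase and check_kubevirt are hypothetical helper names.

```shell
# Hypothetical helper: print .status.phase of the KubeVirt resource(s).
kubevirt_phase() {
  kubectl get kubevirt -n system-kubevirt \
    -o jsonpath='{.items[*].status.phase}' 2>/dev/null
}

# Hypothetical helper: succeed only when the phase is exactly "Deployed".
check_kubevirt() {
  phase=$(kubevirt_phase)
  if [ "$phase" = "Deployed" ]; then
    echo "ok: KubeVirt is Deployed"
  else
    echo "FAIL: KubeVirt phase is '${phase:-unknown}'" >&2
    return 1
  fi
}
```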

CRDs

Spot-check that a few platform CRDs are installed:

kubectl get crd tenantcontrolplanes.kamaji.clastix.io
kubectl get crd clusters.cluster.x-k8s.io
kubectl get crd virtualmachines.kubevirt.io
kubectl get crd subnets.kubeovn.io
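
The same spot-check can be written as a loop that reports only the CRDs that are missing. A minimal sketch, using the four CRD names above; PLATFORM_CRDS and missing_crds are hypothetical names.

```shell
# CRD names from the spot-check above (hypothetical variable name).
PLATFORM_CRDS="tenantcontrolplanes.kamaji.clastix.io clusters.cluster.x-k8s.io virtualmachines.kubevirt.io subnets.kubeovn.io"

# Hypothetical helper: print the name of each CRD that kubectl cannot find.
missing_crds() {
  for crd in $PLATFORM_CRDS; do
    kubectl get crd "$crd" >/dev/null 2>&1 || echo "$crd"
  done
}
```

Empty output from missing_crds means all four CRDs are present.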

Troubleshooting

If any component fails validation, start with these checks:

# Pod status in the affected namespace
kubectl get pods -n <namespace> --sort-by=.metadata.creationTimestamp

# Events
kubectl get events -A --sort-by='.lastTimestamp' | tail -50

# Resource pressure
kubectl top nodes
kubectl top pods -A --sort-by=cpu | head -20
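
On a busy cluster the last 50 events are mostly routine. Filtering on Warning events narrows the output. This sketch uses the type field selector of kubectl get events; recent_warnings is a hypothetical helper, and the default limit of 50 simply mirrors the command above.

```shell
# Hypothetical helper: show the most recent Warning events.
# Usage: recent_warnings [namespace] [limit]
# With no namespace (or an empty one), all namespaces are searched.
recent_warnings() {
  ns=${1:-}
  limit=${2:-50}
  if [ -n "$ns" ]; then
    kubectl get events -n "$ns" --field-selector type=Warning \
      --sort-by=.lastTimestamp | tail -n "$limit"
  else
    kubectl get events -A --field-selector type=Warning \
      --sort-by=.lastTimestamp | tail -n "$limit"
  fi
}
```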

See Troubleshooting for component-specific guidance.

Next Steps