Platform Validation¶
After installing kMetal, validate that the under cluster's platform components are healthy before provisioning tenant clusters.
Helm release¶
The release status should be deployed. Check that the revision and chart version match what you intended to install.
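A minimal sketch of that status check, assuming Helm 3. The release name and namespace are not stated here, so the helper scans every namespace; helm_not_deployed is a hypothetical name introduced for illustration.

```shell
# Print every Helm release whose row does not show the "deployed" status.
# -A scans all namespaces; -a includes releases in any state (failed, pending, ...).
# Assumption: matching the word "deployed" between whitespace is enough to
# identify the STATUS column in the tabular output of `helm list`.
helm_not_deployed() {
  helm list -A -a | awk 'NR > 1 && $0 !~ /[ \t]deployed[ \t]/'
}
```

On a healthy install, calling helm_not_deployed should print nothing. Running helm list -A -a on its own shows the REVISION and CHART columns to compare against the intended install.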
Component pods¶
The chart deploys components into the following namespaces. Each should have all pods in Running or Completed:
kubectl get pods -n kmetal-cert-manager
kubectl get pods -n kube-flannel # flannel
kubectl get pods -n kube-system # kube-ovn + multus
kubectl get pods -n kmetal-metallb
kubectl get pods -n kmetal-kamaji # kamaji + kamaji-addon-ovn
kubectl get pods -n system-kubevirt
kubectl get pods -n system-cdi
kubectl get pods -n kmetal-capi-providers # core + kubeadm + CAPK + CACPK
kubectl get pods -n kmetal-local-path-storage
# List anything that is not Running or Completed (the header row is also printed)
kubectl get pods -A | grep -v Running | grep -v Completed
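The grep filter above only looks at the STATUS text, so a pod that is Running with unready containers (for example 0/1) passes unnoticed. A sketch of a stricter filter follows; unhealthy_pods is a hypothetical helper name, and it assumes the default column order of kubectl get pods -A (NAMESPACE, NAME, READY, STATUS).

```shell
# Print namespace, name, READY and STATUS for pods that are neither
# Completed nor fully ready while Running.
unhealthy_pods() {
  kubectl get pods -A --no-headers | awk '
    {
      if ($4 == "Completed") next      # finished Jobs are fine
      split($3, ready, "/")            # READY column, e.g. "1/2"
      if ($4 != "Running" || ready[1] != ready[2])
        print $1, $2, $3, $4
    }'
}
```

An empty result from unhealthy_pods suggests every pod is either Completed or Running with all containers ready.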
MetalLB¶
# Pool(s) should match the overlay you applied
kubectl get ipaddresspool -n kmetal-metallb
kubectl get l2advertisement -n kmetal-metallb
# Smoke-test allocation
kubectl create service loadbalancer test-lb --tcp=80:80
kubectl get service test-lb -w # EXTERNAL-IP should be assigned from the pool; press Ctrl-C to stop watching
kubectl delete service test-lb
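For scripted runs, watching with -w is awkward because it never exits. The sketch below polls the Service status instead; wait_for_lb_ip is a hypothetical helper, and the 2-second interval and default of 30 attempts are arbitrary assumptions.

```shell
# Poll a LoadBalancer Service until an external IP is assigned.
# Usage: wait_for_lb_ip <service> [attempts]
wait_for_lb_ip() {
  svc=$1
  tries=${2:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # Empty until the load balancer controller fills in the status
    ip=$(kubectl get service "$svc" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "no EXTERNAL-IP for $svc after $tries attempts" >&2
  return 1
}
```

Run wait_for_lb_ip test-lb between creating and deleting the test Service, then compare the printed address with the range shown by kubectl get ipaddresspool.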
Default StorageClass¶
kubectl get storageclass
# Exactly one StorageClass should be marked (default); the chart default is local-path.
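The same check can be scripted. This sketch relies on kubectl printing the default class as "name (default)" in the NAME column; both function names are hypothetical.

```shell
# Print the names of StorageClasses that kubectl marks as "(default)".
default_storageclasses() {
  kubectl get storageclass --no-headers | awk '$2 == "(default)" { print $1 }'
}

# Succeed only when exactly one default StorageClass exists.
check_single_default_sc() {
  n=$(default_storageclasses | wc -l | tr -d ' ')
  [ "$n" -eq 1 ]
}
```

With the chart defaults, default_storageclasses would be expected to print local-path, and check_single_default_sc to return success.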
CAPI providers¶
kubectl get coreproviders,bootstrapproviders,infrastructureproviders,controlplaneproviders -n kmetal-capi-providers
# Expect (one of each): core (cluster-api), bootstrap-kubeadm, infrastructure-kubevirt (CAPK), control-plane-kamaji (CACPK)
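The "one of each" expectation can be checked with a small loop. This is a sketch only: check_capi_providers is a hypothetical helper, and it counts objects per provider kind without inspecting their conditions.

```shell
# Report any provider kind that does not have exactly one object.
# Usage: check_capi_providers [namespace]
check_capi_providers() {
  ns=${1:-kmetal-capi-providers}
  rc=0
  for kind in coreproviders bootstrapproviders \
              infrastructureproviders controlplaneproviders; do
    n=$(kubectl get "$kind" -n "$ns" --no-headers 2>/dev/null | wc -l | tr -d ' ')
    if [ "$n" -ne 1 ]; then
      echo "$kind: expected 1, found $n"
      rc=1
    fi
  done
  return "$rc"
}
```

A silent, successful return means each of the four kinds has exactly one provider in the namespace; it does not confirm that the providers finished installing.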
KubeVirt readiness¶
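One possible readiness check, offered as a sketch rather than the documented procedure: the KubeVirt and CDI operators report a status.phase of Deployed on their custom resources once installation completes. The sketch assumes the KubeVirt resource lives in system-kubevirt (the namespace listed under Component pods) and that CDI is cluster-scoped, as it is upstream; the function names are hypothetical.

```shell
# Print status.phase for every object of a kind, optionally within a namespace.
cr_phases() {
  kind=$1
  ns=${2:-}
  if [ -n "$ns" ]; then
    kubectl get "$kind" -n "$ns" -o jsonpath='{.items[*].status.phase}'
  else
    kubectl get "$kind" -o jsonpath='{.items[*].status.phase}'
  fi
}

# Succeed when the single KubeVirt / CDI resource reports "Deployed".
kubevirt_ready() { [ "$(cr_phases kubevirt system-kubevirt)" = "Deployed" ]; }
cdi_ready()      { [ "$(cr_phases cdi)" = "Deployed" ]; }
```

If either helper fails, cr_phases shows the phase actually reported, for example Deploying while the operator is still rolling out.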
CRDs¶
Spot-check that a few platform CRDs are installed:
kubectl get crd tenantcontrolplanes.kamaji.clastix.io
kubectl get crd clusters.cluster.x-k8s.io
kubectl get crd virtualmachines.kubevirt.io
kubectl get crd subnets.kubeovn.io
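The spot-check above can be folded into one pass that names any missing CRD. check_crds is a hypothetical helper; it only tests that each CRD exists, not that it is established.

```shell
# Report each CRD name that kubectl cannot find; return non-zero if any is missing.
# Usage: check_crds <crd-name>...
check_crds() {
  rc=0
  for crd in "$@"; do
    if ! kubectl get crd "$crd" >/dev/null 2>&1; then
      echo "missing CRD: $crd"
      rc=1
    fi
  done
  return "$rc"
}
```

For example: check_crds tenantcontrolplanes.kamaji.clastix.io clusters.cluster.x-k8s.io virtualmachines.kubevirt.io subnets.kubeovn.io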
Troubleshooting¶
If any component fails:
# Pod status in the affected namespace
kubectl get pods -n <namespace> --sort-by=.metadata.creationTimestamp
# Events
kubectl get events -A --sort-by='.lastTimestamp' | tail -50
# Resource pressure (kubectl top needs the Metrics API, e.g. metrics-server)
kubectl top nodes
kubectl top pods -A --sort-by=cpu | head -20
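When the event stream is noisy, narrowing it to warnings can help. A sketch using the type field selector that kubectl get events supports; recent_warnings is a hypothetical helper and the default of 50 lines simply mirrors the tail above.

```shell
# Show the most recent Warning events across all namespaces.
# Usage: recent_warnings [line-count]
recent_warnings() {
  count=${1:-50}
  kubectl get events -A --field-selector type=Warning \
    --sort-by=.lastTimestamp | tail -n "$count"
}
```

Calling recent_warnings 20 prints the last 20 lines of warning events, oldest first.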
See Troubleshooting for component-specific guidance.