Load Balancer Configuration

This guide covers load balancer configuration for kMetal's under cluster.

When You Need This

kMetal's under cluster runs on bare metal, so a software load balancer is always required. MetalLB is the supported choice.

MetalLB Configuration

The kMetal umbrella chart includes MetalLB as a sub-chart. It's enabled by default; configuration is supplied through the metallb top-level key in your values overlay.

Basic Configuration

# kmetal-values.yaml
metallb:
  enabled: true
  pools:
    - name: default-pool
      addresses:
        - 192.168.1.100-192.168.1.200       # Adjust to your network
  l2Advertisements:
    - name: default-l2-adv
      ipAddressPools:
        - default-pool

Apply during install:

helm install kmetal oci://ghcr.io/clastix/oci/kmetal \
  --namespace kmetal-flux \
  --values kmetal-values.yaml \
  --wait --timeout=15m

Or upgrade an existing release:

helm upgrade kmetal oci://ghcr.io/clastix/oci/kmetal \
  --namespace kmetal-flux \
  --values kmetal-values.yaml \
  --wait

IP Address Pool Variations

metallb:
  pools:
    - name: range-pool
      addresses:
        - 192.168.1.100-192.168.1.200       # Range notation
    - name: cidr-pool
      addresses:
        - 10.0.0.0/24                        # CIDR notation
    - name: single-ip-pool
      addresses:
        - 192.168.1.250/32                   # Single IP (must not overlap another pool)

Network Configuration

  • Ensure the IP address pool doesn't conflict with your existing network infrastructure
  • In L2 mode, IPs must be in the same subnet as your Kubernetes nodes, because ARP/NDP announcements do not cross subnet boundaries
  • IPs should not be in use by DHCP or other services
  • Contact your network administrator if unsure about available IP ranges
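
The subnet requirement in the checklist above can be checked before install. The following is a minimal sketch in POSIX shell, assuming IPv4 and range notation for the pool; the helper names ip_to_int and range_in_cidr are illustrative and not part of kMetal or MetalLB, and the addresses are placeholders.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if the range START-END lies entirely inside the given CIDR.
range_in_cidr() {
  r_start=$(ip_to_int "${1%-*}")
  r_end=$(ip_to_int "${1#*-}")
  r_net=$(ip_to_int "${2%/*}")
  r_bits=${2#*/}
  # 4294967295 is 0xFFFFFFFF; shift and truncate to build the netmask.
  r_mask=$(( (4294967295 << (32 - $r_bits)) & 4294967295 ))
  [ "$r_start" -le "$r_end" ] &&
    [ $(( $r_start & $r_mask )) -eq $(( $r_net & $r_mask )) ] &&
    [ $(( $r_end & $r_mask )) -eq $(( $r_net & $r_mask )) ]
}

# Placeholder values: substitute your pool range and node subnet.
if range_in_cidr "192.168.1.100-192.168.1.200" "192.168.1.0/24"; then
  echo "pool is inside the node subnet"
else
  echo "pool is outside the node subnet"
fi
```

This only confirms that the range fits the subnet; it does not tell you whether the addresses are free.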

Advanced Configuration

L2 Mode (Layer 2)

L2 mode is the default. MetalLB announces IP addresses using ARP/NDP:

metallb:
  pools:
    - name: default-pool
      addresses:
        - 192.168.1.100-192.168.1.200
  l2Advertisements:
    - name: default-l2-adv
      ipAddressPools:
        - default-pool
      # Restrict announcement to specific interfaces or nodes if needed:
      # interfaces: [eth0, eth1]
      # nodeSelectors:
      #   - matchLabels: { kubernetes.io/hostname: node1 }

BGP Mode (Layer 3)

For production deployments with BGP-capable routers:

A worked BGP example (BGPPeer + BGPAdvertisement against the chart shape) has not been added to this section yet (t.b.d.). The chart accepts the upstream MetalLB BGP CR shapes; refer to the upstream MetalLB BGP documentation in the meantime.

Verification

After configuring the load balancer, verify it's working:

# Check MetalLB pods are running
kubectl get pods -n kmetal-metallb

# Check IP address pool configuration
kubectl get ipaddresspool -n kmetal-metallb

# Check L2 advertisement
kubectl get l2advertisement -n kmetal-metallb

# Test with a sample service
kubectl create service loadbalancer test-lb --tcp=80:80
kubectl get svc test-lb

# You should see an EXTERNAL-IP assigned from your pool

# Clean up test service
kubectl delete svc test-lb

Troubleshooting

No External IP Assigned

If services don't get an external IP:

# Check MetalLB controller logs
kubectl logs -n kmetal-metallb -l app.kubernetes.io/component=controller

# Check speaker logs
kubectl logs -n kmetal-metallb -l app.kubernetes.io/component=speaker

# Verify IP pool configuration
kubectl describe ipaddresspool -n kmetal-metallb

Common issues:

  • IP pool exhausted (all IPs in use)
  • IP pool conflicts with network DHCP
  • Network interface not correctly detected
  • Firewall blocking ARP/NDP traffic (L2 mode)
  • BGP session not established (BGP mode)
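
To rule out pool exhaustion, it can help to compare the pool's capacity with the number of services that already hold an external IP. The sketch below computes capacity for the three notations used in this guide, assuming IPv4; the helper names ip_to_int and pool_size are illustrative, not part of any tool.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Count the addresses in one pool entry (range, CIDR, or bare address).
pool_size() {
  case $1 in
    *-*) echo $(( $(ip_to_int "${1#*-}") - $(ip_to_int "${1%-*}") + 1 )) ;;
    */*) echo $(( 1 << (32 - ${1#*/}) )) ;;
    *)   echo 1 ;;
  esac
}

echo "default-pool capacity: $(pool_size 192.168.1.100-192.168.1.200)"

# One way to count LoadBalancer services that already hold an IP
# (columns with -A: NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP ...):
#   kubectl get svc -A --no-headers \
#     | awk '$3 == "LoadBalancer" && $5 != "<pending>"' | wc -l
```

If the service count equals the capacity, new services stay in the pending state until the pool is extended or an IP is released.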

IP Conflict

If you see IP conflicts:

  1. Verify IP pool doesn't overlap with DHCP range
  2. Check for duplicate IPs in network
  3. Adjust IP pool range in configuration
  4. Reapply configuration
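
Step 1 can be checked mechanically once you know the DHCP range. This is a minimal sketch assuming IPv4 and range notation on both sides; ip_to_int and ranges_overlap are illustrative helper names, and the DHCP range shown is a made-up placeholder.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if two START-END ranges share at least one address.
ranges_overlap() {
  a_start=$(ip_to_int "${1%-*}")
  a_end=$(ip_to_int "${1#*-}")
  b_start=$(ip_to_int "${2%-*}")
  b_end=$(ip_to_int "${2#*-}")
  [ "$a_start" -le "$b_end" ] && [ "$b_start" -le "$a_end" ]
}

POOL="192.168.1.100-192.168.1.200"   # MetalLB pool from this guide
DHCP="192.168.1.150-192.168.1.250"   # placeholder DHCP scope

if ranges_overlap "$POOL" "$DHCP"; then
  echo "conflict: pool overlaps the DHCP range"
else
  echo "ok: pool and DHCP range are disjoint"
fi

# For step 2, a tool such as iputils arping in duplicate-detection mode
# can probe a single address from a node (interface name is an assumption):
#   arping -D -I eth0 -c 2 192.168.1.100
```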

Production Recommendations

For production deployments:

  1. Use BGP mode if your network supports it: routers can spread traffic across nodes with ECMP, whereas L2 mode sends all traffic for a service IP through a single node
  2. Reserve IP ranges with your network team before deployment
  3. Use multiple IP pools for different service tiers or environments
  4. Enable speaker tolerations for nodes with special network configurations
  5. Monitor MetalLB metrics with Prometheus for operational visibility
  6. Document IP allocations for your team

Tenant Service Exposure

The MetalLB configuration above governs the under cluster — the IPs that the kMetal platform itself, and tenant control-plane endpoints, surface from. Tenant workload LoadBalancer services follow a different path that depends on which networking mode the under cluster runs.

Overlay mode — OVN Gateway

In overlay mode (the default), tenant LoadBalancer services are fulfilled by Kube-OVN itself. A custom controller, cloud-provider-ovn (CPO), watches LoadBalancer services in every tenant cluster and reconciles them into OVN resources on the under cluster:

  • OvnEip — allocates an External IP from the provider network pool.
  • OVN load balancer — DNATs traffic destined for that EIP to the tenant's backend pods, with multi-backend support.

The under cluster is not in the data path. After the load balancer is programmed, traffic enters the under cluster's provider network, hits OVN, and is DNATed directly to a worker VM's pod IP. No proxy hop, no extra latency, no per-service controller in the tenant.

VLAN mode — CCM Proxy or Tenant MetalLB

In VLAN mode, OVN Gateway isn't applicable because there is no shared provider network for OVN to allocate from. Two patterns are supported:

  1. CCM Proxy (default candidate): a kMetal-side Cloud Controller Manager proxy that allocates IPs from the tenant's VLAN address pool and announces them via ARP/BGP. Tenant LoadBalancer services point at the CCM-allocated IP; the under cluster's CCM handles the announcement.
  2. Tenant MetalLB: install MetalLB inside the tenant cluster (delivered as a tenant add-on) and let the tenant manage its own L2/BGP announcement. This pattern matches what tenants would build in a vanilla Kubernetes cluster on bare metal.

For new deployments, prefer overlay mode unless wire-level VLAN isolation is a hard requirement.

Edge router integration

The provider network IPs allocated by either pattern need to be reachable from outside the data center. kMetal supports two integration patterns with the edge router:

  • Direct Routing — the edge router holds a static route for the provider CIDR pointing at the under cluster's gateway nodes. Allocate an EIP → traffic just works. Zero per-tenant coupling between kMetal and the edge router; once the static route is in place, kMetal can add/remove tenant EIPs without touching the router.
  • Edge NAT — the edge router 1:1 NATs a public IP to each tenant EIP. Used as a fallback when the provider CIDR cannot be routed directly (NAT'd network, ISP constraints). Requires per-EIP router config; operationally heavier.
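
For Direct Routing, the static route might look like the following on a Linux-based edge router. This is only a sketch: it assumes iproute2 is available, the provider CIDR and gateway node addresses are placeholders, and static_route_cmd is an illustrative helper that prints the command rather than running it. Hardware routers use their own vendor syntax.

```shell
#!/bin/sh
# Build an iproute2 multipath route command for a provider CIDR.
# Usage: static_route_cmd CIDR GATEWAY [GATEWAY...]
static_route_cmd() {
  cidr=$1
  shift
  cmd="ip route replace $cidr"
  for gw in "$@"; do
    cmd="$cmd nexthop via $gw"   # one next hop per gateway node
  done
  echo "$cmd"
}

# Placeholder provider CIDR and gateway node addresses.
static_route_cmd 203.0.113.0/24 10.0.0.11 10.0.0.12
```

Printing the command first makes it easy to review before applying it on the router with root privileges.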

IP pool models

Tenant EIPs are allocated from one of two pool shapes:

  • Global pool (default) — a single provider-network pool shared by all tenants. Simple, no per-tenant network setup, but anyone with under-cluster admin can see who has which IPs.
  • Per-tenant pool — each tenant gets a dedicated provider subnet, and their EIPs come from that subnet only. Stronger isolation, but currently blocked by an upstream OVN bug (kube-ovn#3329, ovn-org/ovn#222) when combined with the multi-DGP topology kMetal uses. The global pool is the only validated option for v1.

Status: Active Configuration Guide