Build a 3-Node Self-Managed Kubernetes Cluster on AWS EC2

This guide shows how to build a 3-node Kubernetes cluster on AWS EC2 where:

  • All 3 nodes are both control-plane and worker (stacked etcd).
  • Nodes live only in a private subnet (no public IPs).
  • An existing OPNsense instance in the same VPC acts as:
    • Default gateway / NAT
    • Firewall
    • Load balancer (HAProxy) for:
      • Kubernetes API (internal only)
      • Traefik ingress (public apps)
    • DNS server (BIND) for internal names

We do not use:

  • EKS (no managed control-plane cost)
  • AWS NLB / ALB (no LB hourly + data charges)
  • MetalLB inside the cluster

Instead, we treat OPNsense as the “edge” load balancer in front of a private kubeadm cluster.

This is still a lab: OPNsense is both the single point of ingress and a single point of failure, but the internal Kubernetes design is very close to a production cluster.


Why not MetalLB or kube-vip on AWS?

In my bare-metal labs, I usually use:

  • kube-vip for the Kubernetes API VIP, and
  • MetalLB to implement type: LoadBalancer Services.

On AWS in this design, we don’t use either, for a few reasons:

  • MetalLB + AWS VPC is awkward
    MetalLB’s simple Layer-2 mode relies on answering ARP/NDP for arbitrary IPs on the LAN. In an AWS VPC, you don’t control L2 like that — IP ownership is tied to ENIs and routing tables, not raw ARP. MetalLB can run in BGP mode, but that requires a BGP-capable router setup that we’re not doing here.
  • OPNsense is already our “cloud load balancer”
    OPNsense terminates the public EIP, does NAT, runs HAProxy, and sits at 10.0.128.4 on the LAN. It already:
    • load balances the Kubernetes API across 10.0.128.7/8/9, and
    • load balances Traefik’s NodePorts (30080/30443) for app traffic.
    Adding MetalLB inside the cluster would just create a second, unnecessary LB layer.
  • kube-vip would duplicate what HAProxy already does
    kube-vip is mainly used to provide a virtual IP for the control-plane nodes. In this setup, the control-plane VIP is effectively:
    • k8s-aws.maksonlee.com → 10.0.128.4 (OPNsense) → 6443 on all 3 nodes
    That’s already highly available at the edge. Putting kube-vip inside the cluster would add extra moving parts without improving availability for this AWS + OPNsense design.

The result is: the cluster stays simple (kubeadm + Calico + Traefik), and OPNsense plays the role that MetalLB/kube-vip normally play in a bare-metal lab.


  1. Lab Topology

Assumed AWS environment

You already have:

  • VPC: 10.0.0.0/16
  • Public subnet (example): 10.0.0.0/20
  • OPNsense (firewall / LB)
    • WAN IP: 10.0.0.4
    • Elastic IP (EIP): 3.109.96.219 (public)
  • Private subnet: 10.0.128.0/20
    • OPNsense LAN IP: 10.0.128.4

Routing:

  • The route table for 10.0.128.0/20 sends 0.0.0.0/0 to OPNsense as an instance target, with source/destination check disabled on that EC2 instance.
  • OPNsense does NAT for instances in 10.0.128.0/20.
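
If you still need to wire this part up, a minimal AWS CLI sketch looks like the following (the instance and route-table IDs are placeholders for your environment):

# Disable source/destination check on the OPNsense instance
aws ec2 modify-instance-attribute \
  --instance-id i-0123456789abcdef0 \
  --no-source-dest-check

# Point the private route table's default route at the OPNsense instance
aws ec2 create-route \
  --route-table-id rtb-0123456789abcdef0 \
  --destination-cidr-block 0.0.0.0/0 \
  --instance-id i-0123456789abcdef0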

OPNsense provides:

  • Firewall
  • BIND DNS
  • HAProxy (SSL offload and/or TCP passthrough for services)

Kubernetes Nodes & Hostnames

We’ll run three EC2 instances in the private subnet:

Hostname                   IP           Role
k8s-aws-1.maksonlee.com    10.0.128.7   control-plane + worker
k8s-aws-2.maksonlee.com    10.0.128.8   control-plane + worker
k8s-aws-3.maksonlee.com    10.0.128.9   control-plane + worker

We’ll also use these logical names:

  • Kubernetes API VIP (internal only)
    • k8s-aws.maksonlee.com → 10.0.128.4 (OPNsense LAN IP)
  • Ingress hostnames (apps exposed via Traefik NodePort through HAProxy):
    • app1-aws.maksonlee.com
    • app2-aws.maksonlee.com

From the Internet:

  • app1-aws.maksonlee.com, app2-aws.maksonlee.com → EIP 3.109.96.219 → OPNsense WAN

From inside the VPC / VPN / internal clients:

  • k8s-aws.maksonlee.com → 10.0.128.4
  • app1-aws.maksonlee.com, app2-aws.maksonlee.com → 10.0.128.4
  • Nodes → 10.0.128.7/8/9

Important: Kubernetes API is not exposed on a public DNS name. k8s-aws.maksonlee.com only exists in the internal DNS view.


  2. DNS Plan (BIND on OPNsense + Public DNS)

Internal zone (maksonlee.com, BIND on OPNsense)

In your internal maksonlee.com zone:

; OPNsense LAN – API VIP & ingress VIP inside VPC
k8s-aws      IN A   10.0.128.4

; Kubernetes nodes
k8s-aws-1    IN A   10.0.128.7
k8s-aws-2    IN A   10.0.128.8
k8s-aws-3    IN A   10.0.128.9

; App hostnames – also go to OPNsense LAN for internal clients
app1-aws     IN A   10.0.128.4
app2-aws     IN A   10.0.128.4

So from VPC / VPN:

  • k8s-aws.maksonlee.com → 10.0.128.4
  • app1-aws.maksonlee.com, app2-aws.maksonlee.com → 10.0.128.4

Public DNS (Cloudflare)

On public DNS, you only expose app hostnames:

; No public record for k8s-aws (API stays internal)

app1-aws   IN A   3.109.96.219
app2-aws   IN A   3.109.96.219

From the Internet:

  • Users hit app1-aws.maksonlee.com / app2-aws.maksonlee.com → EIP 3.109.96.219 → OPNsense → HAProxy → Traefik NodePort → apps.
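
To confirm the split-horizon DNS behaves as intended, you can compare answers from the internal BIND and a public resolver (the resolver IPs here are just examples):

# Internal view (BIND on OPNsense) – should return 10.0.128.4
dig +short app1-aws.maksonlee.com @10.0.128.4

# Public view – should return the EIP, and nothing for k8s-aws
dig +short app1-aws.maksonlee.com @1.1.1.1
dig +short k8s-aws.maksonlee.com @1.1.1.1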

  3. Launch the Three Kubernetes Nodes (t4g.small)

Use ARM-based Graviton instances:

  • Instance type: t4g.small
    • 2 vCPUs
    • 2 GiB RAM
    • EBS only
    • Up to 5 Gbps network bandwidth
  • AMI: Ubuntu Server 24.04 LTS (ARM64)
  • Subnet: 10.0.128.0/20 (private)
  • Auto-assign public IP: Disabled

Set private IPs:

  • k8s-aws-1.maksonlee.com → 10.0.128.7
  • k8s-aws-2.maksonlee.com → 10.0.128.8
  • k8s-aws-3.maksonlee.com → 10.0.128.9
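
If you prefer the CLI over the console, a hedged run-instances sketch for the first node looks like this (the AMI, subnet, security group and key pair values are placeholders):

aws ec2 run-instances \
  --image-id ami-xxxxxxxxxxxxxxxxx \
  --instance-type t4g.small \
  --subnet-id subnet-xxxxxxxxxxxxxxxxx \
  --private-ip-address 10.0.128.7 \
  --no-associate-public-ip-address \
  --security-group-ids sg-xxxxxxxxxxxxxxxxx \
  --key-name my-keypair \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=k8s-aws-1}]'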

Set hostnames:

# On k8s-aws-1
sudo hostnamectl set-hostname k8s-aws-1.maksonlee.com

# On k8s-aws-2
sudo hostnamectl set-hostname k8s-aws-2.maksonlee.com

# On k8s-aws-3
sudo hostnamectl set-hostname k8s-aws-3.maksonlee.com
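
Before going further, make sure each node resolves the internal names served by BIND on OPNsense – the kubeadm controlPlaneEndpoint used later depends on it. A quick check, assuming systemd-resolved on Ubuntu 24.04 (dig comes from the dnsutils package):

# Should resolve to 10.0.128.4 via the node's configured resolver
resolvectl query k8s-aws.maksonlee.com

# Or ask BIND on OPNsense directly
dig +short k8s-aws.maksonlee.com @10.0.128.4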

Security group for Kubernetes nodes

For this lab, all the instances (OPNsense and the three Kubernetes nodes) use the same default security group.

The default SG is configured as:

  • Inbound: All protocols / All ports from the same security group.
  • Outbound: All protocols / All ports to 0.0.0.0/0.

This means:

  • Nodes can talk to each other on all ports (required for kubelet / etcd / Calico).
  • OPNsense (same SG) can reach:
    • 6443 on all nodes (Kubernetes API)
    • NodePort range including 30080 and 30443 (Traefik)
  • Nothing outside this SG can directly reach the nodes.

Because the nodes are in a private subnet and have no public IPs, they are still only reachable via OPNsense, so this is acceptable for a lab.

If you want a stricter, more production-like setup, you could:

  • Create a dedicated SG for the nodes.
  • Allow:
    • TCP 6443 from the OPNsense SG (for API)
    • TCP 30080 and 30443 from the OPNsense SG (for Traefik NodePorts)
    • All traffic within the node SG itself.
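
A hedged AWS CLI sketch of that stricter variant (the group IDs sg-nodes and sg-opnsense, and the VPC ID, are placeholders):

# Dedicated SG for the Kubernetes nodes
aws ec2 create-security-group \
  --group-name k8s-nodes \
  --description "Kubernetes nodes" \
  --vpc-id vpc-xxxxxxxxxxxxxxxxx

# API and Traefik NodePorts, only from the OPNsense SG
aws ec2 authorize-security-group-ingress --group-id sg-nodes \
  --protocol tcp --port 6443 --source-group sg-opnsense
aws ec2 authorize-security-group-ingress --group-id sg-nodes \
  --protocol tcp --port 30080 --source-group sg-opnsense
aws ec2 authorize-security-group-ingress --group-id sg-nodes \
  --protocol tcp --port 30443 --source-group sg-opnsense

# All traffic between the nodes themselves
aws ec2 authorize-security-group-ingress --group-id sg-nodes \
  --ip-permissions 'IpProtocol=-1,UserIdGroupPairs=[{GroupId=sg-nodes}]'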

  4. Prepare All Nodes for Kubernetes

Run this on all three nodes (k8s-aws-1, k8s-aws-2, k8s-aws-3).

  • Disable swap & configure kernel
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

sudo sysctl --system
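
A quick sanity check that the modules and sysctls took effect (all three values should come back as 1):

lsmod | grep -E 'overlay|br_netfilter'
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
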
  • Install containerd
sudo apt update && sudo apt install -y ca-certificates curl gnupg lsb-release
 
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
 
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] \
https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
 
sudo apt update && sudo apt install -y containerd.io

  • Configure containerd for SystemdCgroup
sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml >/dev/null

Edit:

sudo vi /etc/containerd/config.toml

Find:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = false

Change to:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true

Restart:

sudo systemctl restart containerd
sudo systemctl enable containerd

  • Install kubeadm, kubelet, kubectl (v1.34)

Add Kubernetes apt repo (v1.34)

sudo apt update
sudo apt install -y apt-transport-https ca-certificates curl gpg
 
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.34/deb/Release.key \
  | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
 
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.34/deb/ /
EOF

Install tools

sudo apt update
sudo apt install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
sudo systemctl enable --now kubelet
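
Quick check that the tools are installed and pinned:

kubeadm version -o short
kubectl version --client
kubelet --version
apt-mark showhold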

  5. HAProxy on OPNsense for API and Traefik

On OPNsense, you already have frontends/backends for other services (maksonlee.com, keycloak, etc.).
This section shows only the Kubernetes-related part of the config.

frontend frontend-https
    bind 0.0.0.0:443 name 0.0.0.0:443 
    mode tcp

    tcp-request inspect-delay 5s
    tcp-request content accept if { req.ssl_hello_type 1 }

    acl is_k8s_traefik_tls req.ssl_sni -i app1-aws.maksonlee.com
    acl is_k8s_traefik_tls req.ssl_sni -i app2-aws.maksonlee.com
    
    use_backend k8s_traefik_https if is_k8s_traefik_tls

frontend frontend-http
    bind 0.0.0.0:80 name 0.0.0.0:80 
    mode http
    option http-keep-alive

    acl is_k8s_traefik_http hdr(host) -i app1-aws.maksonlee.com
    acl is_k8s_traefik_http hdr(host) -i app2-aws.maksonlee.com
    
    use_backend k8s_traefik_http if is_k8s_traefik_http

frontend frontend-k8s-api
    bind 10.0.128.4:6443 name 10.0.128.4:6443 
    mode tcp
    default_backend k8s_api

backend k8s_api
    mode tcp
    balance roundrobin

    stick-table type ip size 50k expire 30m  

    server k8s-aws-1-api 10.0.128.7:6443 
    server k8s-aws-2-api 10.0.128.8:6443 
    server k8s-aws-3-api 10.0.128.9:6443 

backend k8s_traefik_http
    mode http
    balance roundrobin

    stick-table type ip size 50k expire 30m  

    http-reuse safe
    server k8s-aws-1-traefik-http 10.0.128.7:30080 
    server k8s-aws-2-traefik-http 10.0.128.8:30080 
    server k8s-aws-3-traefik-http 10.0.128.9:30080 

backend k8s_traefik_https
    mode tcp
    balance roundrobin

    stick-table type ip size 50k expire 30m  

    server k8s-aws-1-traefik-https 10.0.128.7:30443 
    server k8s-aws-2-traefik-https 10.0.128.8:30443 
    server k8s-aws-3-traefik-https 10.0.128.9:30443

Explanation:

  • frontend-https (TCP, port 443):
    Peeks at TLS SNI and sends app1-aws.maksonlee.com / app2-aws.maksonlee.com to k8s_traefik_https (TCP passthrough to Traefik’s websecure NodePort, 30443).
  • frontend-http (HTTP, port 80):
    Routes by Host header and sends those same hosts to k8s_traefik_http (NodePort 30080).
  • frontend-k8s-api (TCP, 10.0.128.4:6443):
    Internal API VIP → k8s_api backend, round-robin across all 3 control-planes.

Your existing frontends/backends for other apps remain unchanged; you just add these Kubernetes sections.
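
Once the cluster is initialized (next section), you can verify the API VIP path from any internal client. Assuming kubeadm's default RBAC, which leaves /version readable anonymously, this should return version JSON through HAProxy:

curl -k https://k8s-aws.maksonlee.com:6443/version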


  6. kubeadm Init on k8s-aws-1

On k8s-aws-1, create kubeadm-config.yaml:

apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
clusterName: aws-selfmanaged
controlPlaneEndpoint: "k8s-aws.maksonlee.com:6443"
apiServer:
  certSANs:
    - k8s-aws.maksonlee.com
    - k8s-aws-1.maksonlee.com
    - k8s-aws-2.maksonlee.com
    - k8s-aws-3.maksonlee.com
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12

Init the cluster:

sudo kubeadm init --config kubeadm-config.yaml --upload-certs

Configure kubectl on k8s-aws-1:

mkdir -p $HOME/.kube
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown "$(id -u):$(id -g)" $HOME/.kube/config
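
At this point kubectl already talks to the API server through the OPNsense VIP, because the admin kubeconfig points at the controlPlaneEndpoint. A quick check:

kubectl cluster-info
kubectl get nodes -o wide

The node will report NotReady until Calico is installed in the next step.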

  7. Install Calico

On k8s-aws-1:

curl -LO https://raw.githubusercontent.com/projectcalico/calico/v3.31.2/manifests/calico.yaml
kubectl apply -f calico.yaml

You do not need to edit calico.yaml for the pod CIDR:

  • In this manifest, the CALICO_IPV4POOL_CIDR block is commented out.
  • With kubeadm, Calico auto-detects the pod CIDR from the cluster configuration (podSubnet: 10.244.0.0/16).

Verify:

kubectl get pods -n kube-system

Wait until all calico-node pods are Running.
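
If you prefer not to poll manually, a kubectl wait sketch (assuming the calico-node pods carry the usual k8s-app=calico-node label from this manifest):

kubectl -n kube-system wait --for=condition=Ready pod \
  -l k8s-app=calico-node --timeout=300s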


  8. Join k8s-aws-2 and k8s-aws-3 as Control-Planes

After kubeadm init --upload-certs, kubeadm prints two kubeadm join commands:

  • one for worker nodes
  • one for additional control-plane nodes, which already includes --control-plane and --certificate-key <CERT_KEY>

On k8s-aws-2 and k8s-aws-3, run the control-plane join command that kubeadm printed. It looks like:

sudo kubeadm join k8s-aws.maksonlee.com:6443 \
  --token <TOKEN> \
  --discovery-token-ca-cert-hash sha256:<CA_HASH> \
  --control-plane \
  --certificate-key <CERT_KEY>
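
If the token or certificate key has expired (tokens last 24 hours, uploaded certs only 2 hours), regenerate both on k8s-aws-1:

# Print a fresh join command (new token + CA hash)
sudo kubeadm token create --print-join-command

# Re-upload control-plane certs and print a new certificate key
sudo kubeadm init phase upload-certs --upload-certs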

Check all nodes:

kubectl get nodes -o wide

You should see three control-plane nodes with internal IPs 10.0.128.7, .8, .9.

Allow Control-Planes to Run Workloads

For this lab, you want all three nodes to schedule workloads:

kubectl taint nodes k8s-aws-1.maksonlee.com node-role.kubernetes.io/control-plane-
kubectl taint nodes k8s-aws-2.maksonlee.com node-role.kubernetes.io/control-plane-
kubectl taint nodes k8s-aws-3.maksonlee.com node-role.kubernetes.io/control-plane-
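
You can confirm the taints are gone with a custom-columns sketch (the TAINTS column should show <none> for all three nodes):

kubectl get nodes \
  -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints[*].key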

  9. Install Traefik via Helm

On k8s-aws-1:

curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

helm repo add traefik https://traefik.github.io/charts
helm repo update

Create traefik-values.yaml:

deployment:
  replicas: 3

service:
  type: NodePort
  spec:
    externalTrafficPolicy: Local

ports:
  web:
    port: 80
    nodePort: 30080
  websecure:
    port: 443
    nodePort: 30443

Install / upgrade Traefik:

helm upgrade --install traefik traefik/traefik \
  --namespace traefik --create-namespace \
  -f traefik-values.yaml

Check pods & svc:

kubectl get pods -n traefik -o wide
kubectl get svc -n traefik traefik
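
The service should show the pinned NodePorts; a jsonpath sketch to confirm:

kubectl get svc -n traefik traefik \
  -o jsonpath='{range .spec.ports[*]}{.name}={.nodePort}{"\n"}{end}'
# Expected: web=30080 and websecure=30443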

  10. Deploy a Sample App (whoami) and Ingress

Create namespace + deployment + service:

kubectl create namespace demo

kubectl create deployment whoami \
  --namespace demo \
  --image traefik/whoami \
  --replicas 3

kubectl expose deployment whoami \
  --namespace demo \
  --port 80 --target-port 80

Create whoami-ingress.yaml:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: whoami
  namespace: demo
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  ingressClassName: traefik
  rules:
  - host: app1-aws.maksonlee.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: whoami
            port:
              number: 80

Apply:

kubectl apply -f whoami-ingress.yaml

  • Test from inside the cluster

On k8s-aws-1:

kubectl get svc -n traefik traefik
# Suppose ClusterIP is 10.103.26.76

# Test via ClusterIP:
curl -H "Host: app1-aws.maksonlee.com" http://10.103.26.76/

# Test via NodePort on each node:
curl -H "Host: app1-aws.maksonlee.com" http://10.0.128.7:30080/
curl -H "Host: app1-aws.maksonlee.com" http://10.0.128.8:30080/
curl -H "Host: app1-aws.maksonlee.com" http://10.0.128.9:30080/

You should see the whoami response each time.

  • Test from OPNsense (direct to NodePort)

On the OPNsense shell:

curl -v -H "Host: app1-aws.maksonlee.com" http://10.0.128.7:30080/

If you get a whoami response, it confirms:

  • NodePort is reachable
  • Security Group rules are correct
  • K8s / Traefik / Ingress path is healthy

  • End-to-end test from your PC (through HAProxy)

From your PC:

curl http://app1-aws.maksonlee.com/

You should see something like:

Hostname: whoami-b85fc56b4-6mg8j
IP: 127.0.0.1
IP: ::1
IP: 10.244.166.198
...
X-Forwarded-For: 10.0.128.4
X-Forwarded-Host: app1-aws.maksonlee.com
X-Forwarded-Port: 80
X-Forwarded-Proto: http
X-Forwarded-Server: traefik-65cc567666-xxxxx
X-Real-Ip: 10.0.128.4

This shows:

  • External client → EIP (3.109.96.219) → OPNsense HAProxy
  • HAProxy frontend matches Host: app1-aws.maksonlee.com → k8s_traefik_http backend
  • Backend load balances across the three NodePorts (:30080)
  • Traefik forwards to the whoami pods
