Today I am going to show how you can fully automate the advanced process of setting up a highly available k8s cluster in the cloud. We will go through a set of terraform and bash scripts which should be sufficient for you to literally just run terraform plan/apply and get your HA etcd and k8s cluster up and running without any hassle.
-
Part 0 – Intro.
Part 1 – Setting up HA ETCD cluster.
Part 2 – The PKI infra
Part 3 – Setting up k8s cluster.
Part 0 – Intro.
If you do a bit of research on how to set up a k8s cluster, you will find quite a lot of ways this can be achieved.
But in general, all these ways can be grouped into four types:
1) No setup
2) Easy Set up
3) Advanced Set up
4) Hard way
By No setup I simply mean something like EKS: it is a managed service, so you don’t need to maintain it or care about the details, AWS does it all for you. I have never used it, so I can’t say much about that one.
Easy setup: tools like kops and the like make it quite easy, a couple-of-commands kind of setup:
kops ~]$ kops create cluster \
  --name=k8s.ifritltd.net \
  --state=s3://kayan-kops-state \
  --zones="eu-west-2a" \
  --node-count=2 \
  --node-size=t2.micro \
  --master-size=t2.micro \
  --dns-zone=k8s.ifritltd.net \
  --cloud aws
All you need is to set up an s3 bucket and dns records and run the command above, which I described two years ago in this article.
The downside is that, first of all, it is mainly AWS-only, and it generates all AWS resources as it sees fit: it will create security groups, asg, etc in its own way, which means that
if you already have terraform-managed infra with your own rules, strategies and framework, it won’t fit into that model but will just be added as some kind of alien infra. Long story short, if you want fine-grained control over how your infra is managed from a single centralised terraform codebase, it isn’t the best solution, yet it is still an easy and balanced tool.
Before I start explaining the Advanced Set up, I will just mention that the 4th, The Hard way, is probably only good if you want to learn how k8s works and how all the components interact with each other: as it doesn’t use any external tool to set up the components, you do everything manually and literally get to know all the guts of the system. Obviously it could become a nightmare to support such a system in production, unless all members of the ops team are k8s experts or there are requirements not supported by other bootstrapping tools.
Finally the Advanced Set up.
While we are going to use kubeadm to set up the HA k8s cluster, there are still two ways of doing that:
1) The stacked etcd topology, where the etcd cluster is just part of the k8s cluster, and
2) External etcd topology:
So what is etcd? Think of it as the database, the data layer of the k8s cluster. Obviously external etcd is safer due to the separation of the data and service layers, and it can even be managed by another team if needed.
If one day your cluster breaks, you can literally bring up a new one from scratch and all the data of the api server will simply be restored from etcd, hence this is the preferred option. In the stacked version, by contrast, etcd is just a docker container running alongside the other components as part of the control plane.
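To make that concrete, here is a minimal sketch of how such a backup and restore could look with etcdctl (v3 API). The endpoints, CA path and data directory match the ones used later in this article; the snapshot file name and the restored data dir are just examples:

# take a snapshot of the external etcd cluster
ETCDCTL_API=3 etcdctl \
  --endpoints=https://etcd-a.k8s.ifritltd.co.uk:2379 \
  --cacert=/etc/ssl/certs/ca.pem \
  snapshot save /tmp/etcd-backup.db

# restore it into a fresh data dir
# (the cluster flags must match the etcd.conf shown in Part 1)
ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-backup.db \
  --name etcd-a.k8s.ifritltd.co.uk \
  --initial-cluster 'etcd-a.k8s.ifritltd.co.uk=https://etcd-a.k8s.ifritltd.co.uk:2380,etcd-b.k8s.ifritltd.co.uk=https://etcd-b.k8s.ifritltd.co.uk:2380,etcd-c.k8s.ifritltd.co.uk=https://etcd-c.k8s.ifritltd.co.uk:2380' \
  --initial-advertise-peer-urls https://etcd-a.k8s.ifritltd.co.uk:2380 \
  --data-dir /opt/etcd/data-restored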
Before we start setting up our k8s cluster, we need to set up the etcd cluster in an HA way, so that our k8s cluster can use it as an external database.
Part 1 – Setting up HA ETCD cluster.
As I have already mentioned, I am going to use terraform, and our cluster will need the following resources:
DNS
An Ubuntu server running the etcd server, with the PKI material installed.
Disk volume to store the data permanently
So straight to the etcd module code:
tail -n 1000 *.tf

==> data.tf <==
data "aws_ami" "ubuntu_1604" {
  most_recent = true
  name_regex  = "ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-[0-9]*"
}

data "aws_route53_zone" "zone" {
  name = "k8s.ifritltd.co.uk."
}

==> ebs.tf <==
resource "aws_ebs_volume" "ebs-volume" {
  availability_zone = "${var.availability_zone}"
  size              = "2"

  tags {
    Name = "ebs_etcd_${var.zone_suffix}"
  }

  lifecycle {
    prevent_destroy = false //todo enable back when finished
  }
}

==> main.tf <==
resource "aws_instance" "etcd" {
  ami               = "${data.aws_ami.ubuntu_1604.id}"
  instance_type     = "t2.micro"
  key_name          = "terra"
  user_data         = "${file("etcd.sh")}"
  availability_zone = "${var.availability_zone}"

  vpc_security_group_ids = [
    "${var.sg_id}",
  ]

  iam_instance_profile = "${var.iam_instance_profile}"

  tags {
    Name = "example etcd ${var.zone_suffix}"
  }
}

resource "aws_route53_record" "etcd" {
  zone_id = "${var.zone_id}"
  name    = "etcd-${var.zone_suffix}"
  type    = "A"
  ttl     = "300"
  records = ["${aws_instance.etcd.private_ip}"]
}

==> output.tf <==
output "public_ip" {
  value = "${aws_instance.etcd.public_ip}"
}
As you can see from the code above, we define an instance running Ubuntu 16.04, backed by an EBS volume, and register its private IP with a DNS record.
Now we can use the module to create resources in all 3 AZs of the London region:
➜ etcd git:(master) ✗ tail -n 1000 *.tf

==> main.tf <==
module "etcd-a" {
  source               = "module/etcd"
  availability_zone    = "eu-west-2a"
  zone_suffix          = "a"
  iam_instance_profile = "${aws_iam_instance_profile.aws_iam_instance_profile.name}"
  sg_id                = "${aws_security_group.etcd.id}"
  zone_id              = "${aws_route53_zone.k8s_private_zone.zone_id}"
}

module "etcd-b" {
  source               = "module/etcd"
  availability_zone    = "eu-west-2b"
  zone_suffix          = "b"
  iam_instance_profile = "${aws_iam_instance_profile.aws_iam_instance_profile.name}"
  sg_id                = "${aws_security_group.etcd.id}"
  zone_id              = "${aws_route53_zone.k8s_private_zone.zone_id}"
}

module "etcd-c" {
  source               = "module/etcd"
  availability_zone    = "eu-west-2c"
  zone_suffix          = "c"
  iam_instance_profile = "${aws_iam_instance_profile.aws_iam_instance_profile.name}"
  sg_id                = "${aws_security_group.etcd.id}"
  zone_id              = "${aws_route53_zone.k8s_private_zone.zone_id}"
}

resource "aws_route53_zone" "k8s_private_zone" {
  name = "k8s.ifritltd.co.uk"

  vpc {
    vpc_id = "${data.aws_vpc.default.id}"
  }
}

==> security-groups.tf <==
resource "aws_security_group_rule" "all-internal" {
  type              = "ingress"
  security_group_id = "${aws_security_group.etcd.id}"
  self              = true
  from_port         = 2379
  to_port           = 2380
  protocol          = "tcp"
}
As you can see, we created 3 instances in the private dns zone k8s.ifritltd.co.uk, and the nodes are exposed as https://etcd-a.k8s.ifritltd.co.uk:2379, https://etcd-b.k8s.ifritltd.co.uk:2379 and https://etcd-c.k8s.ifritltd.co.uk:2379, one per zone.
You also need to grant some permissions to your instances via IAM:
"ec2:DescribeVolumes", "ec2:AttachVolume", "ssm:PutParameter", "ssm:GetParametersByPath", "ssm:GetParameters", "ssm:GetParameter"
With these, the ec2 instances can attach their EBS volumes and use ssm for storing and reading the PKI. The cloud-init script is probably the most important part once the infra is set up:
#!/bin/bash
set -euxo pipefail

sleep 30

AVAILABILITY_ZONE=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone | cut -b 10)

apt-get update
apt-get -y install wget python-pip
locale-gen en_GB.UTF-8
pip install --no-cache-dir awscli

VOLUME_ID=$(aws ec2 describe-volumes --filters "Name=status,Values=available" Name=tag:Name,Values=ebs_etcd_$AVAILABILITY_ZONE --query "Volumes[].VolumeId" --output text --region eu-west-2)
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)

aws ec2 attach-volume --region eu-west-2 \
  --volume-id "${VOLUME_ID}" \
  --instance-id "${INSTANCE_ID}" \
  --device "/dev/xvdf"

while [ -z $(aws ec2 describe-volumes --filters "Name=status,Values=in-use" Name=tag:Name,Values=ebs_etcd_$AVAILABILITY_ZONE --query "Volumes[].VolumeId" --output text --region eu-west-2) ] ; do sleep 10; echo "ebs not ready"; done

sleep 5

if [[ -z $(blkid /dev/xvdf) ]]; then
  mkfs -t ext4 /dev/xvdf
fi

mkdir -p /opt/etcd
mount /dev/xvdf /opt/etcd

ETCD_VERSION="v3.3.8"
ETCD_URL="https://github.com/coreos/etcd/releases/download/${ETCD_VERSION}/etcd-${ETCD_VERSION}-linux-amd64.tar.gz"
ETCD_CONFIG=/etc/etcd

apt-get update
apt-get -y install wget python-pip
pip install --no-cache-dir awscli

useradd etcd

wget ${ETCD_URL} -O /tmp/etcd-${ETCD_VERSION}-linux-amd64.tar.gz
tar -xzf /tmp/etcd-${ETCD_VERSION}-linux-amd64.tar.gz -C /tmp
install --owner root --group root --mode 0755 /tmp/etcd-${ETCD_VERSION}-linux-amd64/etcd /usr/bin/etcd
install --owner root --group root --mode 0755 /tmp/etcd-${ETCD_VERSION}-linux-amd64/etcdctl /usr/bin/etcdctl
install -d --owner root --group root --mode 0755 ${ETCD_CONFIG}

cat > /etc/systemd/system/etcd.service <<EOF
[Unit]
Description=etcd key-value store

[Service]
User=etcd
Type=notify
ExecStart=/usr/bin/etcd --config-file=/etc/etcd/etcd.conf
Restart=always
RestartSec=10s
LimitNOFILE=40000

[Install]
WantedBy=ready.target
EOF

chmod 0644 /etc/systemd/system/etcd.service

mkdir -p /opt/etcd/data
chown -R etcd:etcd /opt/etcd

cat > /etc/etcd/etcd.conf <<EOF
name: 'etcd-AZONE.k8s.ifritltd.co.uk'
data-dir: /opt/etcd/data
listen-peer-urls: https://0.0.0.0:2380
listen-client-urls: https://0.0.0.0:2379
..
....
.....
initial-advertise-peer-urls: https://etcd-AZONE.k8s.ifritltd.co.uk:2380
advertise-client-urls: https://etcd-AZONE.k8s.ifritltd.co.uk:2379
discovery-fallback: 'proxy'
initial-cluster: 'etcd-a.k8s.ifritltd.co.uk=https://etcd-a.k8s.ifritltd.co.uk:2380,etcd-b.k8s.ifritltd.co.uk=https://etcd-b.k8s.ifritltd.co.uk:2380,etcd-c.k8s.ifritltd.co.uk=https://etcd-c.k8s.ifritltd.co.uk:2380'
...
....
...
client-transport-security:
  cert-file: /etc/ssl/server.pem
  key-file: /etc/ssl/server-key.pem
  client-cert-auth: false
  trusted-ca-file: /etc/ssl/certs/ca.pem
  auto-tls: false
peer-transport-security:
  cert-file: /etc/ssl/server.pem
  key-file: /etc/ssl/server-key.pem
  peer-client-cert-auth: false
  trusted-ca-file: /etc/ssl/certs/ca.pem
  auto-tls: false
....
......
EOF

sed -i s~AZONE~$AVAILABILITY_ZONE~g /etc/etcd/etcd.conf

aws ssm get-parameters --names "etcd-ca" --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2 > /etc/ssl/certs/ca.pem
aws ssm get-parameters --names "etcd-server" --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2 > /etc/ssl/server.pem
aws ssm get-parameters --names "etcd-server-key" --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2 > /etc/ssl/server-key.pem

chmod 0600 /etc/ssl/server-key.pem
chmod 0644 /etc/ssl/server.pem
chown etcd:etcd /etc/ssl/server-key.pem
chown etcd:etcd /etc/ssl/server.pem

systemctl enable etcd
systemctl start etcd
So what is happening here? First of all, we install some tools like python, pip and awscli. Then we attach the EBS volume, create a filesystem on it and mount it to /opt/etcd, the same location we later specify in the etcd config as the data directory. We also advertise the peers in the config by replacing AZONE with the AZ the instance is running in, and finally we pull the PKI material, which we need to pre-install prior to spinning up our instances. The less important parts of the etcd config are skipped here but are available in the source code.
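A few quick checks on the instance, once cloud-init has finished, confirm that everything landed where expected (device, mount point and unit name as in the script above):

lsblk /dev/xvdf                       # the attached EBS volume
df -h /opt/etcd                       # the mounted etcd data location
systemctl status etcd --no-pager      # the unit installed by the script
journalctl -u etcd --no-pager -n 20   # recent etcd logs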
Part 2 – The PKI infra
The whole process is described here:
https://coreos.com/os/docs/latest/generate-self-signed-certificates.html
https://coreos.com/etcd/docs/latest/getting-started-with-etcd.html
Here are the steps to reproduce it:
brew install cfssl

cfssl print-defaults config > ca-config.json
cfssl print-defaults csr > ca-csr.json
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
cfssl print-defaults csr > server.json
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server.json | cfssljson -bare server
And the config files used:
tail -n 100 *.json

==> ca-config.json <==
{
    "signing": {
        "default": {
            "expiry": "43800h"
        },
        "profiles": {
            "server": {
                "expiry": "43800h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth"
                ]
            },
            "client": {
                "expiry": "43800h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "client auth"
                ]
            },
            "peer": {
                "expiry": "43800h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}

==> ca-csr.json <==
{
    "CN": "IfritLTD CA",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "US",
            "L": "CA",
            "O": "IfritLTD",
            "ST": "San Francisco",
            "OU": "Org Unit 1",
            "OU": "Org Unit 2"
        }
    ]
}

==> server.json <==
{
    "CN": "etcd.k8s.ifritltd.co.uk",
    "hosts": [
        "etcd1.k8s.ifritltd.co.uk",
        "etcd2.k8s.ifritltd.co.uk",
        "etcd3.k8s.ifritltd.co.uk"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "US",
            "ST": "CA",
            "L": "San Francisco"
        }
    ]
}
If you fancy, you can also use openssl for the above; the important thing is that you generate a CA, then a server certificate and key signed by that CA, with a SAN for each of the 3 nodes (a minimal openssl sketch follows the file listing below). Once you are done you will have a bunch of files:
ll
total 80
-rw-r--r--   1 kayanazimov  staff   832B 25 Nov 17:24 ca-config.json
-rw-r--r--@  1 kayanazimov  staff   307B 25 Nov 17:31 ca-csr.json
-rw-------   1 kayanazimov  staff   1.6K 25 Nov 17:32 ca-key.pem
-rw-r--r--   1 kayanazimov  staff   1.0K 25 Nov 17:32 ca.csr
-rw-r--r--   1 kayanazimov  staff   1.3K 25 Nov 17:32 ca.pem
-rw-r--r--   1 kayanazimov  staff     0B 25 Nov 18:48 cacert.pem
-rw-r--r--   1 kayanazimov  staff   2.1K 25 Nov 17:52 cert.crt
-rw-------   1 kayanazimov  staff   227B 25 Nov 18:16 server-key.pem
-rw-r--r--   1 kayanazimov  staff   586B 25 Nov 18:16 server.csr
-rw-r--r--@  1 kayanazimov  staff   357B 25 Nov 18:13 server.json
-rw-r--r--   1 kayanazimov  staff   1.2K 25 Nov 18:16 server.pem
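For reference, a rough openssl equivalent might look like this. It is only a sketch, not the commands I used: key algorithms, validity periods and the exact SAN list are assumptions you would adjust to your own needs.

# CA key and self-signed CA certificate
openssl genrsa -out ca-key.pem 2048
openssl req -x509 -new -key ca-key.pem -sha256 -days 1825 \
  -subj "/CN=IfritLTD CA" -out ca.pem

# server key and CSR
openssl ecparam -genkey -name prime256v1 -noout -out server-key.pem
openssl req -new -key server-key.pem \
  -subj "/CN=etcd.k8s.ifritltd.co.uk" -out server.csr

# sign the CSR with the CA, adding one SAN per etcd node
cat > san.cnf <<'EOF'
subjectAltName = DNS:etcd-a.k8s.ifritltd.co.uk,DNS:etcd-b.k8s.ifritltd.co.uk,DNS:etcd-c.k8s.ifritltd.co.uk
EOF
openssl x509 -req -in server.csr -CA ca.pem -CAkey ca-key.pem -CAcreateserial \
  -sha256 -days 1825 -extfile san.cnf -out server.pem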
You only need some of these files to be stored in AWS ssm, so that etcd and later the k8s cluster can read and use them.
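For example, storing the CA, server certificate and key under the parameter names the cloud-init script reads could look like this:

aws ssm put-parameter --name "etcd-ca"         --value "$(cat ca.pem)"         --type "SecureString" --region eu-west-2 --overwrite
aws ssm put-parameter --name "etcd-server"     --value "$(cat server.pem)"     --type "SecureString" --region eu-west-2 --overwrite
aws ssm put-parameter --name "etcd-server-key" --value "$(cat server-key.pem)" --type "SecureString" --region eu-west-2 --overwrite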
Right, time to run some terraform and ssh to an instance; once on the instance you can check the server logs with journalctl and test the API as well:
curl -k https://127.0.0.1:2379/v2/keys/message -XPUT -d value="Hello"
curl -k https://127.0.0.1:2379/v2/keys/message

ETCDCTL_API=3 etcdctl --endpoints=https://etcd-a.k8s.ifritltd.co.uk:2379 get / --prefix --keys-only
Try to write something on one node and read it on another to check that replication works.
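For example, with the v3 API and the CA passed explicitly (endpoints and cert paths as set up above):

# on etcd-a: write a key
ETCDCTL_API=3 etcdctl --endpoints=https://etcd-a.k8s.ifritltd.co.uk:2379 \
  --cacert=/etc/ssl/certs/ca.pem put /demo/replication "hello from a"

# on etcd-b: the same key should be readable straight away
ETCDCTL_API=3 etcdctl --endpoints=https://etcd-b.k8s.ifritltd.co.uk:2379 \
  --cacert=/etc/ssl/certs/ca.pem get /demo/replication

# and a cluster-wide health check
ETCDCTL_API=3 etcdctl \
  --endpoints=https://etcd-a.k8s.ifritltd.co.uk:2379,https://etcd-b.k8s.ifritltd.co.uk:2379,https://etcd-c.k8s.ifritltd.co.uk:2379 \
  --cacert=/etc/ssl/certs/ca.pem endpoint health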
Part 3 – Setting up k8s cluster
As with etcd, we first need to set up some resources with terraform:
resource "aws_instance" "master_node1" { ami = "${data.aws_ami.ubuntu_1604.id}" instance_type = "t2.micro" key_name = "terra" user_data = "${file("userdata_master_init.sh")}" vpc_security_group_ids = [ "${aws_security_group.kubernetes_sg.id}", ] availability_zone = "eu-west-2a" iam_instance_profile = "${aws_iam_instance_profile.aws_iam_instance_profile.name}" tags { Name = "example k8s master1" } } ... resource "aws_instance" "master_node2" { .. resource "aws_instance" "master_node3" { .. ... resource "aws_instance" "slave_node1" { ami = "${data.aws_ami.ubuntu_1604.id}" instance_type = "t2.micro" key_name = "terra" user_data = "${file("userdata_node.sh")}" vpc_security_group_ids = [ "${aws_security_group.kubernetes_sg.id}", ] availability_zone = "eu-west-2a" iam_instance_profile = "${aws_iam_instance_profile.aws_iam_instance_profile.name}" tags { Name = "example k8s slave_node1" } } resource "aws_instance" "slave_node2" { ... ..... resource "aws_instance" "slave_node3" { ....
So we spin up 3 masters and 3 worker nodes. In theory we can spin up as many masters and nodes as we want; obviously in a real prod environment we would use an asg to ensure the availability of each master and some CloudWatch rules to scale the nodes when the load increases.
Then we hide the master nodes' api server behind a load balancer:
resource "aws_elb" "api_load_balancer" { name = "k8s-api" internal = true subnets = ["subnet-4e89bb03", "subnet-52ce2228", "subnet-892a81e0" ] instances = ["${aws_instance.master_node1.id}", "${aws_instance.master_node2.id}", "${aws_instance.master_node3.id}"] listener { instance_port = "${var.listener_port}" instance_protocol = "${var.lb_protocol}" lb_port = "${var.lb_port}" lb_protocol = "${var.lb_protocol}" } health_check { interval = "${var.hc_interval}" healthy_threshold = "${var.hc_healthy_threshold}" unhealthy_threshold = "${var.hc_unhealthy_threshold}" target = "${var.hc_target}" timeout = "${var.hc_timeout}" } security_groups = ["${aws_security_group.kubernetes_sg.id}"] tags = [ { key = "KubernetesCluster" value = "kubernetes" }, ] } data "aws_route53_zone" "k8s_zone" { name = "k8s.ifritltd.co.uk." private_zone = true } resource "aws_route53_record" "kubernetes" { name = "kubernetes" type = "A" zone_id = "${data.aws_route53_zone.k8s_zone.zone_id}" alias { name = "${aws_elb.api_load_balancer.dns_name}" zone_id = "${aws_elb.api_load_balancer.zone_id}" evaluate_target_health = false } }
As we already created our dns zone in the etcd terraform, we just refer to it here to create a record for the api server's load balancer, which we will use to reach our cluster from the outside world.
Don’t pay attention to the VPC and subnets: for simplicity's sake I just stick to the defaults. For the security groups please refer to the next table.
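I won't reproduce the full table here, but as a rough idea, the standard kubeadm port requirements boil down to something like this. The group ID is a placeholder and the CLI calls are only for illustration; in the repo the kubernetes_sg and its rules are terraform resources.

SG_ID=sg-xxxxxxxx   # the kubernetes_sg created by terraform

# API server, reachable from the nodes and the ELB (same group)
aws ec2 authorize-security-group-ingress --region eu-west-2 \
  --group-id "$SG_ID" --protocol tcp --port 6443 --source-group "$SG_ID"

# kubelet API, used by the control plane
aws ec2 authorize-security-group-ingress --region eu-west-2 \
  --group-id "$SG_ID" --protocol tcp --port 10250 --source-group "$SG_ID"

# NodePort services range
aws ec2 authorize-security-group-ingress --region eu-west-2 \
  --group-id "$SG_ID" --protocol tcp --port 30000-32767 --source-group "$SG_ID"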
The most important magic happens in the cloud-init files. Let's start with the masters' ones, that is userdata_master_init.sh and userdata_master_join.sh. Why two? Because, starting from a certain kubeadm version, the init master's config is slightly different from that of the other, joining masters.
Right, all in order. First of all we install docker! You didn't know? Oh yes, k8s uses Docker underneath (not always, other container runtimes are also available):
#!/bin/bash
set -euxo pipefail

# install docker
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable"
apt-get update && apt-get install -y docker-ce=18.06.0~ce~3-0~ubuntu

# configure docker
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF

# setup systemd docker service drop-in directory
mkdir -p /etc/systemd/system/docker.service.d
systemctl daemon-reload
systemctl restart docker
..
....
Once docker is ready, we go on and install kubectl, a tool to control our cluster; kubeadm, a tool to bootstrap the cluster; and finally kubelet, the part of the cluster itself, the node daemon that talks to the API server to find out which containers should be running on its node:
...
......
systemctl restart docker

# K8S SETUP
# install required k8s tools
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat >/etc/apt/sources.list.d/kubernetes.list <<EOF
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF

# all list https://packages.cloud.google.com/apt/dists/kubernetes-xenial/main/binary-amd64/Packages
K8S_VERSION=1.12.3-00

apt-get update
apt-get install -y kubelet=${K8S_VERSION} kubeadm:amd64=${K8S_VERSION} kubectl:amd64=${K8S_VERSION} python-pip
apt-mark hold kubelet kubeadm kubectl

locale-gen en_GB.UTF-8
pip install --no-cache-dir awscli

systemctl daemon-reload
systemctl restart kubelet
As soon as kubelet is up and running, check its logs:
journalctl -u kubelet -f
You will see that kubelet restarts every few seconds, as it waits in a crash loop for kubeadm to tell it what to do. Before we run kubeadm, we need to configure it:
...
....
systemctl restart kubelet

mkdir -p /etc/kubernetes/pki/etcd

aws ssm get-parameters --names "etcd-ca" --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2 > /etc/kubernetes/pki/etcd/ca.crt
aws ssm get-parameters --names "etcd-server" --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2 > /etc/kubernetes/pki/apiserver-etcd-client.crt
aws ssm get-parameters --names "etcd-server-key" --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2 > /etc/kubernetes/pki/apiserver-etcd-client.key

# for initial master
cat > kubeadm-config.yaml <<EOF
apiVersion: kubeadm.k8s.io/v1alpha3
kind: ClusterConfiguration
kubernetesVersion: stable
apiServerCertSANs:
- "kubernetes.k8s.ifritltd.co.uk"
controlPlaneEndpoint: "kubernetes.k8s.ifritltd.co.uk"
etcd:
  external:
    endpoints:
    - https://etcd-a.k8s.ifritltd.co.uk:2379
    - https://etcd-b.k8s.ifritltd.co.uk:2379
    - https://etcd-c.k8s.ifritltd.co.uk:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
networking:
  podSubnet: "192.168.0.0/16"
EOF
..
.....
First we pull the PKI material we created for etcd, then create the kubeadm config file with references to the etcd nodes and the network range for our pod subnet.
Once we are ready, we can initialise our first master node!
...
....
kubeadm init --config kubeadm-config.yaml --ignore-preflight-errors=all
..
After this step kubeadm will start configuring our cluster, that is, pulling all the docker images and telling kubelet what to do.
At this point we can see the docker images pulled by kubeadm:
docker images
REPOSITORY                           TAG       IMAGE ID       CREATED         SIZE
k8s.gcr.io/kube-proxy                v1.12.2   15e9da1ca195   13 days ago     96.5MB
k8s.gcr.io/kube-apiserver            v1.12.2   51a9c329b7c5   13 days ago     194MB
k8s.gcr.io/kube-controller-manager   v1.12.2   15548c720a70   13 days ago     164MB
k8s.gcr.io/kube-scheduler            v1.12.2   d6d57c76136c   13 days ago     58.3MB
k8s.gcr.io/coredns                   1.2.2     367cdc8433a4   2 months ago    39.2MB
k8s.gcr.io/pause                     3.1       da86e6ba6ca1   10 months ago   742kB
root@ip-172-31-19-60:~#
If you were running stacked etcd, it would also appear in this list. Let's check further:
kubectl get pod -n kube-system
NAME                                      READY   STATUS    RESTARTS   AGE
coredns-576cbf47c7-mnd89                  1/1     Running   0          19m
coredns-576cbf47c7-sh4z8                  1/1     Running   0          19m
kube-apiserver-ip-172-31-19-60            1/1     Running   0          18m
kube-controller-manager-ip-172-31-19-60   1/1     Running   0          18m
kube-proxy-d7r49                          1/1     Running   0          19m
kube-scheduler-ip-172-31-19-60            1/1     Running   0          18m
As you can see, our cluster is already running and, to some extent, ready.
Next is configuring calico for pod networking and network policy:
# install calico
kubectl --kubeconfig=/etc/kubernetes/admin.conf apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
kubectl --kubeconfig=/etc/kubernetes/admin.conf apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
Once these steps are done, our cluster should be up and running:
kubectl get node
NAME              STATUS     ROLES    AGE     VERSION
ip-172-31-19-60   NotReady   master   8m43s   v1.12.2
root@ip-172-31-19-60:~#
Just wait until master is ready…
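In a script, that wait can be a simple loop (a minimal sketch using the admin kubeconfig generated by kubeadm):

export KUBECONFIG=/etc/kubernetes/admin.conf
until kubectl get nodes --no-headers 2>/dev/null | grep -qw Ready; do
  echo "waiting for the master to become Ready"
  sleep 5
done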
The next step is to prepare some material for the other masters:
aws ssm put-parameter --name "k8s-ca" --value "$(cat /etc/kubernetes/pki/ca.crt)" --type "SecureString" --region eu-west-2 --overwrite
aws ssm put-parameter --name "k8s-ca-key" --value "$(cat /etc/kubernetes/pki/ca.key)" --type "SecureString" --region eu-west-2 --overwrite
aws ssm put-parameter --name "k8s-sa" --value "$(cat /etc/kubernetes/pki/sa.pub)" --type "SecureString" --region eu-west-2 --overwrite
aws ssm put-parameter --name "k8s-sa-key" --value "$(cat /etc/kubernetes/pki/sa.key)" --type "SecureString" --region eu-west-2 --overwrite
aws ssm put-parameter --name "k8s-front-proxy-ca" --value "$(cat /etc/kubernetes/pki/front-proxy-ca.crt)" --type "SecureString" --region eu-west-2 --overwrite
aws ssm put-parameter --name "k8s-front-proxy-ca-key" --value "$(cat /etc/kubernetes/pki/front-proxy-ca.key)" --type "SecureString" --region eu-west-2 --overwrite
aws ssm put-parameter --name "k8s-init-token" --value "$(kubeadm token create)" --type "SecureString" --region eu-west-2 --overwrite
aws ssm put-parameter --name "k8s-init-token-hash" --value "$(openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | \
  openssl dgst -sha256 -hex | sed 's/^.* //')" --type "SecureString" --region eu-west-2 --overwrite
Here we store all the PKI created by the first master in AWS ssm so that the next masters can reuse it; we also store the token and its hash for the joining masters.
Now we can spin up the remaining masters:
..
.....
# wait for master node
while [ "None" = "$(aws ssm get-parameters --names 'k8s-init-token' --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2)" ]; do echo "waiting for init master"; sleep 5; done

aws ssm get-parameters --names "k8s-ca" --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2 > /etc/kubernetes/pki/ca.crt
aws ssm get-parameters --names "k8s-ca-key" --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2 > /etc/kubernetes/pki/ca.key
aws ssm get-parameters --names "k8s-sa" --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2 > /etc/kubernetes/pki/sa.pub
aws ssm get-parameters --names "k8s-sa-key" --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2 > /etc/kubernetes/pki/sa.key
aws ssm get-parameters --names "k8s-front-proxy-ca" --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2 > /etc/kubernetes/pki/front-proxy-ca.crt
aws ssm get-parameters --names "k8s-front-proxy-ca-key" --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2 > /etc/kubernetes/pki/front-proxy-ca.key

TOKEN=$(aws ssm get-parameters --names "k8s-init-token" --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2)
TOKEN_HASH=$(aws ssm get-parameters --names "k8s-init-token-hash" --query '[Parameters[0].Value]' --output text --with-decryption --region eu-west-2)

kubeadm join kubernetes.k8s.ifritltd.co.uk:6443 --token $TOKEN --discovery-token-ca-cert-hash sha256:$TOKEN_HASH --experimental-control-plane
I skipped the parts that are the same as for the master-initiator (docker install, etc); the significant part is that the joining master waits until the token has been created by the initial master, so it can join the existing cluster with the kubeadm join command.
Finally, when your masters are running, you can join any number of worker nodes to the cluster:
....
kubeadm join kubernetes.k8s.ifritltd.co.uk:6443 --token $TOKEN --discovery-token-ca-cert-hash sha256:$TOKEN_HASH
The cloud-init is just like the master-joiner's, but kubeadm join is run without the --experimental-control-plane parameter.
Let's get the nodes:
root@ip-172-31-23-145:~# kubectl get nodes
NAME               STATUS   ROLES    AGE     VERSION
ip-172-31-23-145   Ready    master   10m     v1.12.2
ip-172-31-23-146   Ready    master   17m     v1.12.2
ip-172-31-29-77    Ready    <none>   7m20s   v1.12.2
root@ip-172-31-23-145:~#
You should see something similar, and then more nodes will simply join the cluster as you spin them up. Once we finish the bootstrapping process, we should have 9 nodes in total: 3 for etcd (one per zone), 3 for the masters and 3 for the workers.
While the actual code that bootstraps the cluster may not be quite production-ready, the approach itself is, in my opinion, perfectly adequate for automating the bootstrapping of an HA Kubernetes cluster with external etcd.
Now that you are familiar with the bootstrapping process, you can clone my git repos and spin up the cluster yourself:
HA etcd: https://github.com/kenych/terraform_exs/tree/master/etcd
HA k8s: https://github.com/kenych/terraform_exs/tree/master/k8s_ha
Please note I am using the following versions:
K8S_VERSION=1.12.3-00
ETCD_VERSION=v3.3.8
Normally a pinned version means stable code, so I really hope it will just work the next time you run it. The only manual step is to create and store the initial PKI, which is very straightforward anyway.