Since EKS is pretty new, there aren’t a lot of howtos on it yet.
I wanted to follow along with Amazon’s Getting started with EKS & Kubernetes Guide.
However, I didn’t want to use CloudFormation. We all know Terraform is far superior!
Join 38,000 others and follow Sean Hull on twitter @hullsean.
With that I went to work getting it going. And I learned a few lessons along the way.
My steps follow the Amazon guide above pretty closely, including setting up the guestbook app. The only big difference is I’m using Terraform.
1. Create the EKS service role
Create a file called eks-iam-role.tf and add the following:
resource "aws_iam_role" "demo-cluster" { name = "terraform-eks-demo-cluster" assume_role_policy = --POLICY { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Service": "eks.amazonaws.com" }, "Action": "sts:AssumeRole" } ] } POLICY } resource "aws_iam_role_policy_attachment" "demo-cluster-AmazonEKSClusterPolicy" { policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy" role = "${aws_iam_role.demo-cluster.name}" } resource "aws_iam_role_policy_attachment" "demo-cluster-AmazonEKSServicePolicy" { policy_arn = "arn:aws:iam::aws:policy/AmazonEKSServicePolicy" role = "${aws_iam_role.demo-cluster.name}" }
Note the demo-cluster role we create here gets referenced by the EKS cluster resource we define in step #3 below.
Related: How to setup Amazon ECS with Terraform
2. Create the EKS VPC
Here’s the code to create the VPC. I’m using the Terraform community module to do this.
There are two things to notice here. One is that I reference the eks-region variable. Add this to your vars.tf, set to “us-east-1” or whatever region you like. Also add cluster-name to your vars.tf.

Also notice the special tags. Those are super important. If you don’t tag your resources properly, kubernetes won’t be able to do its thing. Or rather EKS won’t. I had this problem early on and it is very hard to diagnose. The tags in this VPC module will propagate to subnets and security groups, which is also crucial.
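If it helps, here’s a rough sketch of what those variables might look like in vars.tf. The names match what the code below references, but the region, AZs and CIDRs are just placeholders for my setup, so adjust them to your own:

variable "eks-region" {
  default = "us-east-1"
}

variable "cluster-name" {
  default = "sean-eks"
}

variable "environment_name" {
  default = "dev"
}

variable "eks-azs" {
  type    = "list"
  default = ["us-east-1a", "us-east-1b"]
}

variable "eks-public-cidrs" {
  type    = "list"
  default = ["10.0.101.0/24", "10.0.102.0/24"]
}

variable "eks-private-cidrs" {
  type    = "list"
  default = ["10.0.1.0/24", "10.0.2.0/24"]
}

The eks-reuse-eip and eks-nat-fixed-eip variables only matter if you uncomment the NAT lines in the module below, so I’ve left them out here.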
# provider "aws" { region = "${var.eks-region}" } # module "eks-vpc" { source = "terraform-aws-modules/vpc/aws" name = "eks-vpc" cidr = "10.0.0.0/16" azs = "${var.eks-azs}" private_subnets = "${var.eks-private-cidrs}" public_subnets = "${var.eks-public-cidrs}" enable_nat_gateway = false single_nat_gateway = true # reuse_nat_ips = "${var.eks-reuse-eip}" enable_vpn_gateway = false # external_nat_ip_ids = ["${var.eks-nat-fixed-eip}"] enable_dns_hostnames = true tags = { Terraform = "true" Environment = "${var.environment_name}" "kubernetes.io/cluster/${var.cluster-name}" = "shared" } } resource "aws_security_group_rule" "allow_http" { type = "ingress" from_port = 80 to_port = 80 protocol = "TCP" security_group_id = "${module.eks-vpc.default_security_group_id}" cidr_blocks = ["0.0.0.0/0"] } resource "aws_security_group_rule" "allow_guestbook" { type = "ingress" from_port = 3000 to_port = 3000 protocol = "TCP" security_group_id = "${module.eks-vpc.default_security_group_id}" cidr_blocks = ["0.0.0.0/0"] }
Related: How I resolved some tough Docker problems when i was troubleshooting amazon ECS
3. Create the EKS Cluster
Creating the cluster itself is a short bit of Terraform code, shown below. It’s just the aws_eks_cluster resource.
#
# main EKS terraform resource definition
#
resource "aws_eks_cluster" "eks-cluster" {
  name     = "${var.cluster-name}"
  role_arn = "${aws_iam_role.demo-cluster.arn}"

  vpc_config {
    subnet_ids = ["${module.eks-vpc.public_subnets}"]
  }
}

output "endpoint" {
  value = "${aws_eks_cluster.eks-cluster.endpoint}"
}

output "kubeconfig-certificate-authority-data" {
  value = "${aws_eks_cluster.eks-cluster.certificate_authority.0.data}"
}
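Those two outputs hand you the cluster API endpoint and certificate authority data, which you’ll need when configuring kubectl in step #4. Once the apply finishes you can read them back any time:

$ terraform output endpoint
$ terraform output kubeconfig-certificate-authority-data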
Related: Is Amazon too big to fail?
4. Install & configure kubectl
The AWS docs are pretty good on this point.
First you need to install the client on your local desktop. For me, I used brew, the Mac OS X package manager. You’ll also need the heptio-authenticator-aws binary. Again, refer to the AWS docs for help on this.
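For reference, my install went roughly like this. The brew formula is kubernetes-cli, and the heptio-authenticator-aws URL is the Mac one from the AWS getting started guide at the time of writing, so double check the docs for the current path:

$ brew install kubernetes-cli
$ kubectl version --client

$ curl -o heptio-authenticator-aws https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-06-05/bin/darwin/amd64/heptio-authenticator-aws
$ chmod +x ./heptio-authenticator-aws
$ sudo mv ./heptio-authenticator-aws /usr/local/bin/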
The main piece you will add is a directory (~/.kube) containing a config file, ~/.kube/config, that looks like this:
apiVersion: v1
clusters:
- cluster:
    server: https://3A3C22EEF7477792E917CB0118DD3X22.yl4.us-east-1.eks.amazonaws.com
    certificate-authority-data: "a-really-really-long-string-of-characters"
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: aws
  name: aws
current-context: aws
kind: Config
preferences: {}
users:
- name: aws
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: heptio-authenticator-aws
      args:
        - "token"
        - "-i"
        - "sean-eks"
      # - "-r"
      # - "arn:aws:iam::12345678901:role/sean-eks-role"
      #env:
        # - name: AWS_PROFILE
        #   value: "seancli"
Related: Is AWS too complex for small dev teams?
5. Spin up the worker nodes
This is definitely the largest file in your terraform EKS code. Let me walk you through it a bit.
First we attach some policies to our role. These are all essential to EKS. They’re predefined but you need to group them together.
Then you need to create a security group for your worker nodes. Notice this also has the special kubernetes tag added. Be sure that it’s there or you’ll have problems.

Then we add some additional ingress rules, which allow the workers & the kubernetes control plane to communicate with each other.

Next you’ll see some serious user-data code. This handles all the startup action on the worker node instances. Notice we reference some variables here, so be sure those are defined.

Lastly we create a launch configuration and autoscaling group. Notice we give it the AMI as defined in the AWS docs. These are EKS-optimized images with all the supporting software. Note also that they are currently only available in us-east-1 and us-west-2.

Notice that the autoscaling group also has the special kubernetes tag. As I’ve been saying over and over, that’s super important.
#
# EKS Worker Nodes Resources
#  * IAM role allowing Kubernetes actions to access other AWS services
#  * EC2 Security Group to allow networking traffic
#  * Data source to fetch latest EKS worker AMI
#  * AutoScaling Launch Configuration to configure worker instances
#  * AutoScaling Group to launch worker instances
#
resource "aws_iam_role" "demo-node" {
  name = "terraform-eks-demo-node"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "demo-node-AmazonEKSWorkerNodePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = "${aws_iam_role.demo-node.name}"
}

resource "aws_iam_role_policy_attachment" "demo-node-AmazonEKS_CNI_Policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = "${aws_iam_role.demo-node.name}"
}

resource "aws_iam_role_policy_attachment" "demo-node-AmazonEC2ContainerRegistryReadOnly" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  role       = "${aws_iam_role.demo-node.name}"
}

resource "aws_iam_role_policy_attachment" "demo-node-lb" {
  policy_arn = "arn:aws:iam::12345678901:policy/eks-lb-policy"
  role       = "${aws_iam_role.demo-node.name}"
}

resource "aws_iam_instance_profile" "demo-node" {
  name = "terraform-eks-demo"
  role = "${aws_iam_role.demo-node.name}"
}

resource "aws_security_group" "demo-node" {
  name        = "terraform-eks-demo-node"
  description = "Security group for all nodes in the cluster"
  # vpc_id      = "${aws_vpc.demo.id}"
  vpc_id      = "${module.eks-vpc.vpc_id}"

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = "${
    map(
      "Name", "terraform-eks-demo-node",
      "kubernetes.io/cluster/${var.cluster-name}", "owned",
    )
  }"
}

resource "aws_security_group_rule" "demo-node-ingress-self" {
  description              = "Allow node to communicate with each other"
  from_port                = 0
  protocol                 = "-1"
  security_group_id        = "${aws_security_group.demo-node.id}"
  source_security_group_id = "${aws_security_group.demo-node.id}"
  to_port                  = 65535
  type                     = "ingress"
}

resource "aws_security_group_rule" "demo-node-ingress-cluster" {
  description              = "Allow worker Kubelets and pods to receive communication from the cluster control plane"
  from_port                = 1025
  protocol                 = "tcp"
  security_group_id        = "${aws_security_group.demo-node.id}"
  source_security_group_id = "${module.eks-vpc.default_security_group_id}"
  to_port                  = 65535
  type                     = "ingress"
}

data "aws_ami" "eks-worker" {
  filter {
    name   = "name"
    values = ["eks-worker-*"]
  }

  most_recent = true
  owners      = ["602401143452"] # Amazon
}

# EKS currently documents this required userdata for EKS worker nodes to
# properly configure Kubernetes applications on the EC2 instance.
# We utilize a Terraform local here to simplify Base64 encoding this
# information into the AutoScaling Launch Configuration.
# More information: https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-06-05/amazon-eks-nodegroup.yaml
locals {
  demo-node-userdata = <<USERDATA
#!/bin/bash -xe

CA_CERTIFICATE_DIRECTORY=/etc/kubernetes/pki
CA_CERTIFICATE_FILE_PATH=$CA_CERTIFICATE_DIRECTORY/ca.crt
mkdir -p $CA_CERTIFICATE_DIRECTORY
echo "${aws_eks_cluster.eks-cluster.certificate_authority.0.data}" | base64 -d > $CA_CERTIFICATE_FILE_PATH
INTERNAL_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
sed -i s,MASTER_ENDPOINT,${aws_eks_cluster.eks-cluster.endpoint},g /var/lib/kubelet/kubeconfig
sed -i s,CLUSTER_NAME,${var.cluster-name},g /var/lib/kubelet/kubeconfig
sed -i s,REGION,${var.eks-region},g /etc/systemd/system/kubelet.service
sed -i s,MAX_PODS,20,g /etc/systemd/system/kubelet.service
sed -i s,MASTER_ENDPOINT,${aws_eks_cluster.eks-cluster.endpoint},g /etc/systemd/system/kubelet.service
sed -i s,INTERNAL_IP,$INTERNAL_IP,g /etc/systemd/system/kubelet.service
DNS_CLUSTER_IP=10.100.0.10
if [[ $INTERNAL_IP == 10.* ]] ; then DNS_CLUSTER_IP=172.20.0.10; fi
sed -i s,DNS_CLUSTER_IP,$DNS_CLUSTER_IP,g /etc/systemd/system/kubelet.service
sed -i s,CERTIFICATE_AUTHORITY_FILE,$CA_CERTIFICATE_FILE_PATH,g /var/lib/kubelet/kubeconfig
sed -i s,CLIENT_CA_FILE,$CA_CERTIFICATE_FILE_PATH,g /etc/systemd/system/kubelet.service
systemctl daemon-reload
systemctl restart kubelet
USERDATA
}

resource "aws_launch_configuration" "demo" {
  associate_public_ip_address = true
  iam_instance_profile        = "${aws_iam_instance_profile.demo-node.name}"
  image_id                    = "${data.aws_ami.eks-worker.id}"
  instance_type               = "m4.large"
  name_prefix                 = "terraform-eks-demo"
  security_groups             = ["${aws_security_group.demo-node.id}"]
  user_data_base64            = "${base64encode(local.demo-node-userdata)}"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "demo" {
  desired_capacity     = 2
  launch_configuration = "${aws_launch_configuration.demo.id}"
  max_size             = 2
  min_size             = 1
  name                 = "terraform-eks-demo"
  # vpc_zone_identifier  = ["${aws_subnet.demo.*.id}"]
  vpc_zone_identifier  = ["${module.eks-vpc.public_subnets}"]

  tag {
    key                 = "Name"
    value               = "eks-worker-node"
    propagate_at_launch = true
  }

  tag {
    key                 = "kubernetes.io/cluster/${var.cluster-name}"
    value               = "owned"
    propagate_at_launch = true
  }
}
Related: How to hire a developer that doesn’t suck
6. Enable & Test worker nodes
If you haven’t already done so, apply all of the above Terraform code:
$ terraform init
$ terraform plan
$ terraform apply
After that all runs, all of your resources will be created. Now create a file called “aws-auth-cm.yaml” with the following contents:
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::12345678901:role/terraform-eks-demo-node
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
Then apply it to your cluster:
$ kubectl apply -f aws-auth-cm.yaml
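If you want to verify that the config map landed before checking on your nodes, you can describe it:

$ kubectl describe configmap -n kube-system aws-auth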
You should be able to use kubectl to view node status:
$ kubectl get nodes
NAME                           STATUS    ROLES     AGE       VERSION
ip-10-0-101-189.ec2.internal   Ready     <none>    10d       v1.10.3
ip-10-0-102-182.ec2.internal   Ready     <none>    10d       v1.10.3
$
Related: Why would I help a customer that’s not paying?
7. Set up the guestbook app
Finally you can follow the exact steps in the AWS docs to create the app. Here they are again:
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.3/examples/guestbook-go/redis-master-controller.json
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.3/examples/guestbook-go/redis-master-service.json
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.3/examples/guestbook-go/redis-slave-controller.json
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.3/examples/guestbook-go/redis-slave-service.json
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.3/examples/guestbook-go/guestbook-controller.json
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.3/examples/guestbook-go/guestbook-service.json
Then you can get the endpoint with kubectl:
$ kubectl get services
NAME           TYPE           CLUSTER-IP       EXTERNAL-IP        PORT(S)          AGE
guestbook      LoadBalancer   172.20.177.126   aaaaa555ee87c...   3000:31710/TCP   4d
kubernetes     ClusterIP      172.20.0.1       <none>             443/TCP          10d
redis-master   ClusterIP      172.20.242.65    <none>             6379/TCP         4d
redis-slave    ClusterIP      172.20.163.1     <none>             6379/TCP         4d
$
Use “kubectl get services -o wide” to see the entire EXTERNAL-IP. If that is stuck saying <pending>, you likely have an issue with your node IAM role, or you’re missing the special kubernetes tags. So check on those. It shouldn’t show <pending> for more than a minute really.
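Once EXTERNAL-IP shows a real ELB hostname, you can hit the guestbook on port 3000, either in a browser or with curl. The hostname below is just a placeholder for whatever your LoadBalancer reports:

$ kubectl get services guestbook -o wide
$ curl -I http://<your-guestbook-elb-hostname>:3000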
Hope you got everything working.
Good luck and if you have questions, post them in the comments & I’ll try to help out!
Related: How to migrate my skills to the cloud?