Terraform scripts to create a K8s Cluster using “kubeadm” in AWS from scratch

Rangaswamy P V
Mar 28, 2023

In this article we will create a custom Kubernetes cluster on the AWS Cloud. The prerequisites are:

  1. The secret key of your AWS account
  2. The access key of your AWS account
  3. Your private/public SSH key pair in the .pem format

Once you have all of the above, you can download the Terraform scripts from my GitHub repo. If you do not have Terraform installed on the box, you can install it by running the bash script (installtf.sh) in the same repo. Let's go ahead and download the repo content.

$ git clone https://github.com/rangapv/tf.git
$ cd tf
$ ls
README.md ec2-simplek8s ec2-vanilla eks-aws installtf.sh
$ ./installtf.sh

When you run installtf.sh as shown above, you will be prompted for the Terraform version to install. The script then does the necessary configuration and Terraform is ready to use. The install also sets the Terraform log path to the current working directory from which “terraform init” is triggered, and the log file is named “tf.log”.
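
Terraform reads its logging settings from the TF_LOG and TF_LOG_PATH environment variables, so the logging setup described above can presumably be reproduced by hand. The sketch below is not taken from installtf.sh, and the INFO level is an assumption.

```shell
# Sketch: send Terraform logs to tf.log in the current working directory.
# TF_LOG and TF_LOG_PATH are standard Terraform environment variables;
# the INFO level is an assumption, not copied from installtf.sh.
export TF_LOG=INFO
export TF_LOG_PATH="$PWD/tf.log"
echo "Terraform will log to $TF_LOG_PATH"
```

Unsetting TF_LOG turns the logging off again.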

The k8s cluster that we are creating will run on an “Ubuntu 22.04” AMI. The script and modules will create all the necessary AWS infrastructure, such as the VPC, subnets, security groups, internet gateway, inbound and outbound rules, etc.

The modules can be found in my GitHub repo. Change to the “ec2-simplek8s” sub-directory of the repo downloaded above. Its contents are shown below.

$ git pull https://github.com/rangapv/tf.git
$ cd ec2-simplek8s
$ ls -l
-rw-rw-r-- 1 ubuntu ubuntu 343 Mar 27 15:01 README.md
-rw-rw-r-- 1 ubuntu ubuntu 1769 Mar 27 15:01 ec2.tf
-rw-rw-r-- 1 ubuntu ubuntu 183 Mar 27 15:01 modify.secret.tfvars
-rw-rw-r-- 1 ubuntu ubuntu 3191 Mar 27 15:01 policy1.json
-rw-rw-r-- 1 ubuntu ubuntu 604 Mar 27 15:01 policy2.json
-rw-rw-r-- 1 ubuntu ubuntu 237 Mar 27 15:01 policy6.json
-rwxrwxr-x 1 ubuntu ubuntu 5211 Mar 27 15:01 remote.sh
-rw-rw-r-- 1 ubuntu ubuntu 711 Mar 27 15:01 role.tf
-rw-rw-r-- 1 ubuntu ubuntu 500 Mar 27 15:01 terraform.tf
-rw-rw-r-- 1 ubuntu ubuntu 706 Mar 27 15:01 user1.sh
-rw-rw-r-- 1 ubuntu ubuntu 645 Mar 27 15:01 variables.tf
-rw-rw-r-- 1 ubuntu ubuntu 2706 Mar 27 15:01 vpc.tf

The Terraform modules do not create SSH key pairs as of now, so you need to create a key pair in the AWS console and download the key.pem file. I have created mine and saved the key file as “Aldo3.pem”. More info is in Addendum1: To create the key.pem file for Terraform, the link to which is at the bottom of this article.
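
The ssh client refuses a private key that is readable by group or others, so it helps to tighten the permissions on the downloaded file before using it to log in to the nodes. A minimal sketch follows; the helper name secure_key is hypothetical.

```shell
# Hypothetical helper: restrict a downloaded private key to owner read-only.
# ssh rejects private keys that are readable by group or others.
secure_key() {
  key_file="$1"
  if [ ! -f "$key_file" ]; then
    echo "key file not found: $key_file" >&2
    return 1
  fi
  chmod 400 "$key_file"
}

# Usage (file name taken from this article): secure_key Aldo3.pem
```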

Let us look at the files in the “ec2-simplek8s” directory one by one.

The “ec2.tf” file is the root module. It creates your AWS EC2 instances, taking inputs such as the AMI ID, CPU configuration (instance type), security group, storage, etc.

#create ec2 instance

provider "aws" {
region = var.region
access_key = var.accesskey
secret_key = var.secretkey
}

resource "aws_instance" "app_server" {
depends_on = [
aws_subnet.publicsubnetstf,
aws_security_group.vpc_security_tf
]
count = length(var.server_names)
#name = var.server_names[count.index]

ami = var.ami
instance_type = var.ins_type
associate_public_ip_address = "true"
iam_instance_profile = aws_iam_instance_profile.k8s_profile.id
#cpu_core_count = var.cpu_core
key_name = var.key_name
# security_groups = [ var.sec_name ]
user_data = "${file("./user1.sh")}"
#volume_size = var.vol_size
subnet_id = aws_subnet.publicsubnetstf.id
root_block_device {
volume_size = var.vol_size
}
vpc_security_group_ids = [ aws_security_group.vpc_security_tf.id, ]
tags = {
Name = var.server_names[count.index]
}

connection {
type = "ssh"
user = "ubuntu"
host = self.public_ip
private_key = file("${path.module}/${var.keypath}")
}

provisioner "file" {
source = "./remote.sh"
destination = "/home/ubuntu/remote.sh"
}

provisioner "file" {
source = "./${var.keypath}"
destination = "./${var.keypath}"
}

provisioner "remote-exec" {
inline = [
"chmod +x /home/ubuntu/remote.sh",
"/home/ubuntu/remote.sh ${var.accesskey} ${var.secretkey} ${var.server_names[0]} ${var.server_names[count.index]} ${var.keypath}",
]
# on_failure = continue
}

}

output "instances" {
value = aws_instance.app_server[*].public_ip
description = "Public IP address details"
}

output "file" {
value = fileexists("${path.module}/${var.keypath}")
description = "To check if file is there"
}

As you can see from the above file, everything in it is driven by variables. You just need to edit the tfvars file in the directory to match your AWS account details and you are good to go. The sample “modify.secret.tfvars” file is shown below. Fill in your AWS-specific values and rename it to “secret.tfvars”. This file should not be uploaded to the GitHub repo, as it holds sensitive information about your AWS account.

The file has the following values for reference and awaits your AWS-specific values for accesskey, secretkey and keypath (the name of the private key file in .pem format we talked about earlier; refer to Addendum1: To create the key.pem file for Terraform). You may also need your own Ubuntu 22.04 AMI ID. The rest of the file remains as is and you are good to go.

accesskey = "AKIxxxxxxxxxYJQ"
secretkey = "UpvXxxxxxxxxxxxxxxxx1bmxo"
ami = "ami-0735c191cf914754d"
key_name = "AldoCloudKEY"
public_subnets = "172.1.0.0/16"
public_snets = "0.0.0.0/0"
keypath = "Aldo3.pem"
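
One way to do the rename and keep the secrets out of Git is sketched below. The helper name prepare_secrets is hypothetical, and the target name secret.tfvars is the one passed to -var-file later in this article.

```shell
# Hypothetical helper: create secret.tfvars from the sample and keep it
# (and any .pem key) out of version control. Run from ec2-simplek8s.
prepare_secrets() {
  dir="${1:-.}"
  # Copy the sample only if secret.tfvars does not exist yet.
  [ -f "$dir/secret.tfvars" ] || cp "$dir/modify.secret.tfvars" "$dir/secret.tfvars"
  # Only the owner may read or write the file.
  chmod 600 "$dir/secret.tfvars"
  # Add each ignore rule once, even if the helper is run repeatedly.
  for pattern in 'secret.tfvars' '*.pem'; do
    grep -qxF -- "$pattern" "$dir/.gitignore" 2>/dev/null || echo "$pattern" >> "$dir/.gitignore"
  done
}

# Usage: prepare_secrets .   (then edit secret.tfvars with your own values)
```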

The “variables.tf” file declares the variables that “ec2.tf” and the other .tf files read, along with default values for some of them.

variable "region" {
default = "us-west-2"
}

variable "secretkey" {
type = string
sensitive = true
}
..
..
variable "ami" {
type = string
}
variable "ins_type" {
default = "t2.large"
}
..
..
..
variable "public_snets" {
type = string
}
variable "public_subnets" {
type = string
}
variable "vol_size" {
default = 10
}

You can set values directly in the “variables.tf” file as defaults. Alternatively, you can just declare the variables there and put the values in the “secret.tfvars” file. In my case, the values set in the variables file do not change for my AWS account, while the values in “secret.tfvars” can differ for my other AWS accounts. Hence I put some values in one file (variables.tf) and the rest in the other (secret.tfvars).
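
Terraform also accepts variable values from environment variables named TF_VAR_ followed by the variable name, which keeps the keys out of any file on disk. This is an alternative to what the article does, sketched here with the redacted placeholder values from the sample tfvars file.

```shell
# Sketch: supply the sensitive variables through the environment instead of
# a tfvars file. Terraform maps TF_VAR_<name> to variable "<name>".
# The values below are the redacted placeholders from this article.
export TF_VAR_accesskey="AKIxxxxxxxxxYJQ"
export TF_VAR_secretkey="UpvXxxxxxxxxxxxxxxxx1bmxo"
```

Values given with -var-file or -var take precedence over TF_VAR_ values, so the same two entries would have to be removed from the tfvars file for the environment values to take effect.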

The “terraform.tf” file is shown below. It pins the versions of the providers that need to be downloaded for the script to work.

terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 4.57.1"
}

random = {
source = "hashicorp/random"
version = "~> 3.4.3"
}

tls = {
source = "hashicorp/tls"
version = "~> 4.0.4"
}

cloudinit = {
source = "hashicorp/cloudinit"
version = "~> 2.2.0"
}

kubernetes = {
source = "hashicorp/kubernetes"
version = "~> 2.16.1"
}

}

required_version = "~> 1.3"
}

The “vpc.tf” file creates the AWS networking infrastructure: the VPC itself, the subnet, the route table, the internet gateway and the security group with its rules.

# Create VPC, Subnets, Internet Gateways, Security Group

resource "aws_vpc" "awstfvpc" {
cidr_block = "${var.public_subnets}"

tags = {
Name = "aws-tf-vpc"
}
}

resource "aws_internet_gateway" "IGW1" { # Creating Internet Gateway
vpc_id = aws_vpc.awstfvpc.id # vpc_id will be generated after we create VPC
}


resource "aws_subnet" "publicsubnetstf" { # Creating Public Subnets
vpc_id = aws_vpc.awstfvpc.id
cidr_block = "${var.public_subnets}" # CIDR block of public subnets
}


resource "aws_route_table" "PublicRT" { # Creating RT for Public Subnet
vpc_id = aws_vpc.awstfvpc.id
route {
cidr_block = "0.0.0.0/0" # Traffic from Public Subnet reaches Internet via Internet Gateway
gateway_id = aws_internet_gateway.IGW1.id
}
}

resource "aws_route_table_association" "PublicRTassociation" {
subnet_id = aws_subnet.publicsubnetstf.id
route_table_id = aws_route_table.PublicRT.id
}

resource "aws_security_group" "vpc_security_tf" {
name = "aws_tf_simplek8s"
description = "Secuirty Group for tf Instances inbound traffic"
vpc_id = aws_vpc.awstfvpc.id

tags = {
Name = "k8s_secuirty_tf"
}
}

resource "aws_vpc_security_group_ingress_rule" "http" {
security_group_id = aws_security_group.vpc_security_tf.id
cidr_ipv4 = var.public_snets
from_port = 80
ip_protocol = "tcp"
to_port = 80
}
resource "aws_vpc_security_group_ingress_rule" "https" {
security_group_id = aws_security_group.vpc_security_tf.id

cidr_ipv4 = var.public_snets
from_port = 443
ip_protocol = "tcp"
to_port = 443
}
resource "aws_vpc_security_group_ingress_rule" "ssh" {
security_group_id = aws_security_group.vpc_security_tf.id

cidr_ipv4 = var.public_snets
from_port = 22
ip_protocol = "tcp"
to_port = 22
}
resource "aws_vpc_security_group_ingress_rule" "dash" {
security_group_id = aws_security_group.vpc_security_tf.id

cidr_ipv4 = var.public_snets
from_port = 30002
ip_protocol = "tcp"
to_port = 30002
}
resource "aws_vpc_security_group_ingress_rule" "k8srule" {
security_group_id = aws_security_group.vpc_security_tf.id

cidr_ipv4 = var.public_snets
from_port = 6443
ip_protocol = "tcp"
to_port = 6443
}

resource "aws_vpc_security_group_ingress_rule" "k8sdashrule" {
security_group_id = aws_security_group.vpc_security_tf.id

cidr_ipv4 = var.public_snets
from_port = 8443
ip_protocol = "tcp"
to_port = 8443
}
resource "aws_vpc_security_group_egress_rule" "egre_rule" {
security_group_id = aws_security_group.vpc_security_tf.id

cidr_ipv4 = var.public_snets
ip_protocol = "-1"
}

Then you have the “role.tf” file. It creates the necessary IAM policies in AWS and an IAM role with the right trust relationship, and attaches the role to the instances through an instance profile. With this in place, a Cloud Controller Manager or a StatefulSet that uses cloud-based storage has the permissions it needs to call the AWS APIs.

#define Roles and Policies to be associated with the ec2-instance

resource "aws_iam_policy" "policy1" {
# path = "/users/rangapv/"
policy = "${file("./policy1.json")}"

}

resource "aws_iam_policy" "policy2" {
# path = "/users/rangapv/"
policy = "${file("./policy2.json")}"

}
resource "aws_iam_role" "k8s_role" {
name = "k8s_role"
# path = "/users/rangapv/"
# assume_role_policy = aws_iam_policy.policy1,
assume_role_policy = "${file("./policy6.json")}"
managed_policy_arns = [ aws_iam_policy.policy1.arn, aws_iam_policy.policy2.arn ]
tags = {
tag-key = "k8s-tf-role"
}
}

resource "aws_iam_instance_profile" "k8s_profile" {
name = "k8s_profile"
role = aws_iam_role.k8s_role.name
}

The files “policy1.json” and “policy2.json” are the IAM policies; both are in the repo (ec2-simplek8s). The file “policy6.json” is the assume-role policy. These are generic to AWS and can be used as-is on your AWS account, as long as the user who owns the access and secret keys has the right permissions on the account.

The “remote.sh” file is the centre of the action: it sets up our control plane and data plane nodes. The node names come from the variable “server_names” defined in the “variables.tf” file. The first name in the list is the master node and the rest are treated as worker nodes. In my case the first value, “k8smaster11”, is the master node and gets all the control plane components, while “k8sworker11” is treated as a worker node and gets the corresponding worker components.

variable "server_names" {

default = [ "k8smaster11", "k8sworker11" ]

}

If you swap the names in the variables file and run the apply, “k8sworker11” becomes the master node and the other one the worker: the position in the list decides the role, not the name. Just saying.
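
In “ec2.tf”, the remote-exec provisioner passes var.server_names[0] and the current instance's name to remote.sh as its third and fourth arguments. A script can tell the master from a worker by comparing the two. The sketch below only illustrates that idea; the function node_role is hypothetical and is not copied from remote.sh.

```shell
# Sketch of role selection, assuming the script compares the two names it
# receives. This is an illustration, not the logic of the real remote.sh.
node_role() {
  master_name="$1"   # var.server_names[0]
  this_name="$2"     # var.server_names[count.index]
  if [ "$this_name" = "$master_name" ]; then
    echo "master"
  else
    echo "worker"
  fi
}

node_role k8smaster11 k8smaster11   # prints: master
node_role k8smaster11 k8sworker11   # prints: worker
```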

Note: You can create as many nodes as you like. The first instance will always be treated as the master and the rest as worker nodes. You just need to update “variables.tf” with the new worker node names, like the sample shown below.

variable "server_names" {
default = [ "k8smaster11", "k8sworker11", "k8sworker12" ]
}

The “remote.sh” file is the core script, which uses “kubeadm” to install all the k8s components on the nodes. It should not be modified unless you know what you are doing. For a vanilla k8s install with the cloud controller manager it works out of the box.

The “user1.sh” file is the custom user data that Ubuntu 22.04 needs to enable the SSH daemon with default settings, since SSH does not come enabled by default in the AWS AMI.

Let us run the terraform script…

$ terraform init
Initializing the backend...

Initializing provider plugins...
- Reusing previous version of hashicorp/aws from the dependency lock file
- Reusing previous version of hashicorp/random from the dependency lock file
- Reusing previous version of hashicorp/tls from the dependency lock file
- Reusing previous version of hashicorp/cloudinit from the dependency lock file
- Reusing previous version of hashicorp/kubernetes from the dependency lock file
- Using previously-installed hashicorp/aws v4.57.1
- Using previously-installed hashicorp/random v3.4.3
- Using previously-installed hashicorp/tls v4.0.4
- Using previously-installed hashicorp/cloudinit v2.2.0
- Using previously-installed hashicorp/kubernetes v2.16.1

Terraform has been successfully initialized!


$ terraform plan -var-file="secret.tfvars"

aws_vpc.awstfvpc: Creating...
aws_vpc.awstfvpc: Creation complete after 1s [id=vpc-03fc8b85dad50491f]
aws_security_group.vpc_security_tf: Creating...
aws_internet_gateway.IGW1: Creating...
aws_subnet.publicsubnetstf: Creating...
aws_internet_gateway.IGW1: Creation complete after 0s [id=igw-08c1e8912486d2e0d]
aws_route_table.PublicRT: Creating...
aws_subnet.publicsubnetstf: Creation complete after 0s [id=subnet-02d3a45c0ee421a6b]
aws_route_table.PublicRT: Creation complete after 1s [id=rtb-0c1b88f770000a84e]
aws_route_table_association.PublicRTassociation: Creating...
aws_route_table_association.PublicRTassociation: Creation complete after 0s [id=rtbassoc-03457f48e25a83a93]
aws_security_group.vpc_security_tf: Creation complete after 1s [id=sg-0c78a158f2593976b]
aws_instance.app_server[1]: Creating...
aws_vpc_security_group_ingress_rule.https: Creating...
aws_instance.app_server[2]: Creating...
aws_vpc_security_group_ingress_rule.ssh: Creating...
aws_vpc_security_group_ingress_rule.k8srule: Creating...
aws_vpc_security_group_ingress_rule.all: Creating...
aws_vpc_security_group_ingress_rule.http: Creating...
aws_vpc_security_group_egress_rule.egre_rule: Creating...
aws_instance.app_server[0]: Creating...
aws_vpc_security_group_ingress_rule.all: Creation complete after 0s [id=sgr-0f19d242a4c2c0aa6]
aws_vpc_security_group_ingress_rule.ssh: Creation complete after 0s [id=sgr-0d418d6680d004559]
aws_vpc_security_group_ingress_rule.http: Creation complete after 0s [id=sgr-087f9d53532e18fc1]
aws_vpc_security_group_ingress_rule.https: Creation complete after 0s [id=sgr-0b7562e8e61547e52]
aws_vpc_security_group_egress_rule.egre_rule: Creation complete after 0s [id=sgr-004a328580e2df401]
aws_vpc_security_group_ingress_rule.k8srule: Creation complete after 0s [id=sgr-022e7f6a40436a649]
....
....
....
Plan: 2 to add, 0 to change, 0 to destroy.

Changes to Outputs:
~ instances = [
+ (known after apply),
+ (known after apply),
]

Note how I am passing “secret.tfvars” with the -var-file command-line argument when running the “plan” and “apply” commands in Terraform.
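
If you want the apply step to execute exactly the plan you reviewed, Terraform can save the plan to a file and apply that file later. The wrapper names save_plan and apply_saved_plan and the file name tfplan are assumptions made for this sketch.

```shell
# Hypothetical wrappers around the standard plan/apply workflow.
# -var-file and -out are standard Terraform CLI options; "tfplan" is an
# arbitrary file name chosen for this sketch.
save_plan() {
  terraform plan -var-file="${1:-secret.tfvars}" -out=tfplan
}

# Applying a saved plan file does not ask for confirmation again.
apply_saved_plan() {
  terraform apply tfplan
}

# Usage: save_plan secret.tfvars, review the output, then apply_saved_plan
```

A saved plan file can contain the variable values in clear text, so treat it like the tfvars file and keep it out of the repo.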

Let us do an “apply” …

$ terraform apply -var-file="secret.tfvars"

OUTPUT:

aws_instance.app_server[1]: Creating...
aws_instance.app_server[0]: Creating...
aws_instance.app_server[1]: Still creating... [10s elapsed]
aws_instance.app_server[0]: Still creating... [10s elapsed]
aws_instance.app_server[0]: Still creating... [20s elapsed]
aws_instance.app_server[1]: Provisioning with 'file'...
aws_instance.app_server[0]: Provisioning with 'file'...
aws_instance.app_server[1]: Still creating... [30s elapsed]
aws_instance.app_server[0]: Still creating... [30s elapsed]
aws_instance.app_server[1]: Provisioning with 'file'...
aws_instance.app_server[1]: Provisioning with 'remote-exec'...
aws_instance.app_server[1] (remote-exec): (output suppressed due to sensitive value in config)
aws_instance.app_server[1] (remote-exec): (output suppressed due to sensitive value in config)
aws_instance.app_server[0]: Provisioning with 'file'...
aws_instance.app_server[0]: Provisioning with 'remote-exec'...
aws_instance.app_server[0] (remote-exec): (output suppressed due to sensitive value in config)
aws_instance.app_server[0] (remote-exec): (output suppressed due to sensitive value in config)
aws_instance.app_server[1]: Still creating... [40s elapsed]
aws_instance.app_server[0]: Still creating... [40s elapsed]
..
..
..
aws_instance.app_server[1] (remote-exec): (output suppressed due to sensitive value in config)
aws_instance.app_server[1]: Still creating... [1m50s elapsed]
aws_instance.app_server[0]: Still creating... [1m50s elapsed]
aws_instance.app_server[1] (remote-exec): (output suppressed due to sensitive value in config)
aws_instance.app_server[1] (remote-exec): (output suppressed due to sensitive value in config)
..
..
..
aws_instance.app_server[0]: Still creating... [5m20s elapsed]
aws_instance.app_server[1]: Still creating... [5m30s elapsed]
aws_instance.app_server[0]: Still creating... [5m30s elapsed]
..
..
..
aws_instance.app_server[0]: Still creating... [6m20s elapsed]
aws_instance.app_server[1]: Creation complete after 6m25s [id=i-084b839bc6f0feded]
aws_instance.app_server[0]: Still creating... [6m30s elapsed]
aws_instance.app_server[0] (remote-exec): (output suppressed due to sensitive value in config)
aws_instance.app_server[0] (remote-exec): (output suppressed due to sensitive value in config)
aws_instance.app_server[0]: Creation complete after 6m34s [id=i-0cd96f59a724ef9a2]
..
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

Outputs:

file = true
instances = [
"34.220.56.183",
"35.92.117.43",
]

We now have the public IPs of the AWS instances we just created, and each instance carries a Name tag with the server name we defined. Log into your AWS console to check.
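
The addresses can be printed again at any time with “terraform output instances”. The list follows the order of “server_names”, so the first address belongs to the master. To log in, a small helper like the one below builds the ssh command; the name ssh_command is hypothetical, while the “ubuntu” user and the key file name come from ec2.tf and the sample tfvars file.

```shell
# Hypothetical helper: build the ssh command for one node.
# $1 is the node's public IP, $2 an optional key file (default Aldo3.pem).
ssh_command() {
  echo "ssh -i ${2:-Aldo3.pem} ubuntu@$1"
}

ssh_command 34.220.56.183   # prints: ssh -i Aldo3.pem ubuntu@34.220.56.183
```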

The dashboard pod is also installed, and you can access the Dashboard from a URL. To access the web Dashboard, first generate the bearer token as shown below. The dashboard pod can run on any of the nodes, so find the right node by running “kubectl describe pod kubernetes-dashboard -n kubernetes-dashboard” and note the public IP of that node.

Generation of the bearer token: on the box where the dashboard is running, go to the shell prompt (use any SSH tool with the key.pem file we generated earlier in this article) and enter the following command to generate the bearer token.

$ kubectl -n kubernetes-dashboard create token admin-user
eyJhbGciOiJSUzI1NiIsImtpZCI6ImlSN1R0REN3WnYzMj........................
......................................................................
.......................................................ixAw5La96Z3DVhoeQg

Then in a browser go to https://public-IP:30002. You will be prompted to enter the above token, and BINGO! You get the Dashboard.
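
If you are unsure which node's public IP to put in the URL, a compact alternative to “kubectl describe” is to list the pods of the dashboard namespace with the wide output format, which adds a NODE column. The wrapper name dashboard_pod_nodes is hypothetical.

```shell
# Hypothetical wrapper: list the dashboard pods together with the node
# each one runs on (the NODE column of the wide output).
dashboard_pod_nodes() {
  kubectl -n kubernetes-dashboard get pods -o wide
}

# Usage: run dashboard_pod_nodes on a node where kubectl is configured.
```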

After the creation you can login to the boxes and check the installs. On any of the nodes …

$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-172-1-198-125.us-west-2.compute.internal Ready control-plane 3m40s v1.26.3
ip-172-1-223-66.us-west-2.compute.internal Ready <none> 101s v1.26.3

$ kubectl get po --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system cloud-controller-manager-nf6rb 1/1 Running 0 2m7s
kube-system coredns-787d4945fb-b4qmn 1/1 Running 0 3m23s
kube-system coredns-787d4945fb-wr2xq 1/1 Running 0 3m23s
kube-system etcd-ip-172-1-198-125.us-west-2.compute.internal 1/1 Running 0 4m14s
kube-system kube-apiserver-ip-172-1-198-125.us-west-2.compute.internal 1/1 Running 0 3m55s
kube-system kube-controller-manager-ip-172-1-198-125.us-west-2.compute.internal 1/1 Running 0 3m56s
kube-system kube-flannel-ds-4z9lc 1/1 Running 0 2m16s
kube-system kube-flannel-ds-78nmq 1/1 Running 0 3m23s
kube-system kube-proxy-n4m6j 1/1 Running 0 3m23s
kube-system kube-proxy-wj65l 1/1 Running 0 2m16s
kube-system kube-scheduler-ip-172-1-198-125.us-west-2.compute.internal 1/1 Running 0 4m11s
kubernetes-dashboard dashboard-metrics-scraper-cdb6ffdf5-n6b9z 1/1 Running 0 3m23s
kubernetes-dashboard kubernetes-dashboard-5796d948b8-gcnmd 1/1 Running 0 3m23s

There is no manual intervention needed anywhere, and you have a working Kubernetes cluster with the cloud controller manager installed. Now you can go ahead and deploy your apps using helm or other k8s add-ons.

If you have any issues, you can open an issue on the GitHub repo page, contact me at rangapv@yahoo.com, or find me on Twitter @rangapv.

Addendum1: To create the key.pem file for Terraform ..

Works on Devops in Startups, reach me @rangapv on X/twitter or email: rangapv@gmail.com
