TABLE OF CONTENTS:
- Introduction to Kubernetes
- Types of Application Deployment
- Features of Kubernetes
- Docker Swarm vs. Kubernetes
- Kubernetes Cluster
- Master Node or Control Plane Components
- Worker Node Components
- Install Kubernetes cluster
- Initialize Kubernetes cluster
- Install a Pod network on the cluster
- Customizing the Control Plane
- Kubernetes namespaces
- Highly Available Cluster setup
Introduction to Kubernetes
Kubernetes is an open-source container orchestration engine for automating the deployment, scaling, and management of containerized applications. It was originally developed at Google and is now maintained as an open-source project.
Kubernetes does more than just manage containers: it keeps the load balanced across cluster nodes and provides a self-healing mechanism, service discovery, pluggable container runtimes, zero-downtime deployments, automatic rollbacks, and more.
Types of Application Deployment
Physical servers: Apps running directly on physical servers had resource-allocation issues: if you have 5 apps running on one physical server and 1 app consumes most of the CPU or memory, the other 4 apps suffer. The only fix was to buy more physical servers, which is too expensive.
Virtualization: With virtualization you can isolate applications. You can run several VMs on a single piece of hardware, each with its own OS and applications on top. This allows better utilization of resources, scales easily, and saves hardware cost. Many companies still use this technology today.
Container deployment: Following the virtualization era, containerization offers lightweight, portable deployments. Containers share the host OS, CPU, and memory but have their own file systems, which is why a deployment works the same way on a local machine as on cloud infrastructure.
Features of Kubernetes
Kubernetes is more than just container management because of the features it supports: load balancing across cluster nodes, a self-healing container mechanism, service discovery, pluggable container runtimes, zero-downtime deployments, automatic rollbacks, and more.
It can also scale workloads on demand, which is known as autoscaling. You can manage configuration such as Secrets and passwords, and automatically mount storage such as EFS when required.
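As a sketch of the autoscaling feature mentioned above, a HorizontalPodAutoscaler can grow and shrink a Deployment between replica bounds based on CPU usage. The target Deployment name and the thresholds below are illustrative values, not part of this tutorial's cluster:

```yaml
# Illustrative HorizontalPodAutoscaler; the "web" Deployment and the
# numbers here are example values.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web              # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # add replicas when average CPU exceeds 80%
```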
Docker Swarm v/s Kubernetes
|Use of YAML files and deploy nodes||Use of Pods and deployments|
|Users can encrypt data between nodes||All Pods can interact with each other|
|We don’t have any GUI||We have kubernetes dashboard|
|Cannot do Autoscaling||Can do autoscaling|
|Installation is easy but cluster is not too strong||Installation is difficult but cluster is very strong|
Kubernetes Cluster
When you deploy or install Kubernetes, you are ultimately creating a Kubernetes cluster that contains two main kinds of components: (1) master nodes and (2) worker nodes. Nodes are machines with their own Linux environment; they can be virtual machines or physical machines.
Applications and services are deployed in containers inside Pods. A Pod contains one or more containers, such as Docker containers. When a Pod runs multiple containers, the containers are managed as a single entity and share the Pod's resources.
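A sketch of such a multi-container Pod follows; the image names and the shared volume are example values. Both containers share the Pod's network namespace and mount the same emptyDir volume:

```yaml
# Illustrative Pod with two containers sharing the Pod's resources.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar
spec:
  volumes:
    - name: shared-logs
      emptyDir: {}          # scratch volume shared by both containers
  containers:
    - name: web
      image: nginx:1.25     # example image
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/nginx
    - name: log-tailer      # sidecar reading what the first container writes
      image: busybox:1.36
      command: ["sh", "-c", "tail -F /logs/access.log"]
      volumeMounts:
        - name: shared-logs
          mountPath: /logs
```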
Master Node or Control Plane Components
The master node, or control plane, controls the cluster: it holds the cluster state and the data inside the cluster, responds to cluster events, schedules new Pod deployments, and so on. It contains several components: the kube-apiserver, etcd storage, the kube-controller-manager, and the kube-scheduler. First, let's look at each of the components present on the master node.
Kube-apiserver: The core of Kubernetes' control plane is the API server. It exposes the Kubernetes API and acts as a gateway and authenticator. It communicates with the worker nodes and the other control plane components. The Kubernetes API lets you query and manipulate the state of API objects in Kubernetes (for example: Pods, Namespaces, ConfigMaps, and Events). Most operations can be performed through the kubectl command-line interface or other command-line tools, such as kubeadm, which in turn use the API.
etcd: It stores all cluster data in key-value format, just like a dictionary: cluster state, Pod state, and so on. For each resource, etcd holds two types of state, the desired state and the current state, and Kubernetes works to keep them in sync.
Kube-scheduler: It schedules new Pods and containers according to the health of the cluster and a Pod's resource demands, such as CPU or memory, before assigning the Pod to a worker node in the cluster.
Kube-controller-manager: It runs the controller processes of the control plane. It contains several controllers, but they all run as a single binary. It keeps checking the status and state of Pods; if anything is impacted, it works with the scheduler to schedule replacement Pods, which are then created by the kubelet.
- Node controller: Notices and responds when nodes go up or down.
- Replication controller: Maintains the correct number of Pods.
- Endpoint controller: Provides endpoints that join Services and Pods.
- Service account and token controllers: Create default accounts and API access tokens for new namespaces.
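The kube-scheduler described above places Pods partly by their declared resource demands. A minimal sketch of such a declaration (the image and numbers are example values):

```yaml
# Illustrative Pod declaring CPU/memory demands for the scheduler.
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
    - name: app
      image: nginx:1.25     # example image
      resources:
        requests:            # the scheduler uses these to pick a worker node
          cpu: "250m"
          memory: "128Mi"
        limits:              # enforced at runtime on the chosen node
          cpu: "500m"
          memory: "256Mi"
```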
Worker Node Components
The worker node is also part of a Kubernetes cluster and is used to run and manage containerized applications. The worker node performs actions when requests are triggered from the kube-apiserver, the Kubernetes API server that runs on the master node.
The worker node contains several components: a kubelet, kube-proxy, and a container runtime. These node components run on every worker node and maintain the details of all running Pods.
Kubelet: The kubelet is an agent that runs on each worker node and manages all the containers in its Pods. It also makes sure that each worker node communicates with the Kubernetes API server (and, through it, with etcd) so that the containers in a Pod are running and match the stored desired state.
Kube-proxy: Kube-proxy is a networking component that runs on each worker node in your Kubernetes cluster and forwards traffic, handling network communication both inside and outside the cluster.
Container runtime: This is the runtime responsible for running containers (for example: Docker, containerd, etc.).
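Kube-proxy's forwarding rules are driven by Service objects. A sketch of a Service (names and ports are example values) that load-balances traffic across all Pods labelled `app: web`:

```yaml
# Illustrative Service; kube-proxy on each node forwards traffic sent to
# the Service to one of the matching Pods.
apiVersion: v1
kind: Service
metadata:
  name: web-svc
spec:
  selector:
    app: web            # Pods with this label receive the traffic
  ports:
    - protocol: TCP
      port: 80          # port the Service exposes inside the cluster
      targetPort: 8080  # port the container actually listens on
```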
Install Kubernetes cluster
By now, you should have a good theoretical understanding of Kubernetes and the Kubernetes cluster. Let's understand it practically as well by installing a Kubernetes cluster.
- Two Linux machines, one for the master/control plane node and the other for the worker node.
- A container runtime such as Docker. If you don't have Docker, install it from https://automateinfra.com/docker-installation-on-ubuntu. Make sure to install Docker on both machines.
- There are three common container runtimes: Docker, containerd, and CRI-O. CRI-O is an implementation of the Kubernetes CRI (Container Runtime Interface) that enables using OCI (Open Container Initiative) compatible runtimes; it is a lightweight alternative to using Docker as the runtime for Kubernetes.
- Once Docker is set up on both machines, run the service docker status command to verify.
service docker status
- Next, on both Linux machines, open all inbound and outbound rules to the world, as this is only a demonstration. In a production environment you have to open the specific ports listed below on the control plane and worker nodes. For this tutorial, feel free to skip this step.
- For the master node: 6443, 10250, 10251, 10252, 2379, 2380 [all inbound]
- For the worker node: 30000-32767 [all inbound]
- Install the apt-transport-https and curl packages using the apt-get install command. The apt-transport-https package allows the use of repositories accessed via the HTTP Secure protocol, and curl lets you transfer data to or from a server.
sudo apt-get update && sudo apt-get install -y apt-transport-https curl
- Add the GPG key for the official Kubernetes repository to your system using curl.
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
- Add the Kubernetes repository to the APT sources and update the system.
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
sudo apt-get update
Install kubectl (which manages the cluster), kubeadm (which bootstraps the cluster), and kubelet (which manages Pods and containers) on both machines.
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
If you don't specify the runtime, kubeadm automatically detects the installed container runtime. For the Docker runtime, the path to the Unix socket is /var/run/docker.sock, and for containerd it is /run/containerd/containerd.sock.
Initialize Kubernetes cluster
Now you have Kubernetes installed on both your master node and worker node, but until you initialize it, it does nothing. Kubernetes is initialized on the master node; let's do it.
- Initialize your cluster using the kubeadm init command on the master node, i.e., the control plane node.
In the command below:
--apiserver-advertise-address is the address the API server advertises; this is your master node itself, so use the IP address of your master node.
--pod-network-cidr specifies the range of IP addresses for the Pod network. If set, the control plane will automatically allocate CIDRs for every node.
kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.111.4.79
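The same flags can equivalently be expressed in a kubeadm configuration file (a sketch reusing the IP and CIDR from the command above; adjust them for your environment) and passed with `kubeadm init --config`:

```yaml
# Sketch of a kubeadm config equivalent to the init flags above.
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.111.4.79   # was --apiserver-advertise-address
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
networking:
  podSubnet: 10.244.0.0/16        # was --pod-network-cidr
```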
- Once your master node, that is, the control plane, is initialized, run the commands below on the master node so you can use the Kubernetes cluster as a regular user.
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
- Next, join the worker node to the cluster by running, on the worker node, the command provided in the output of the kubeadm init command, as shown below.
kubeadm join IP:6443 --token:.......................................
- After running the command, the worker node joins the control plane successfully.
- Now, verify the nodes from your master node by running the kubectl command below.
kubectl get nodes
- You will notice that the status of both nodes is NotReady because no networking is configured between them yet. To check the network components, run the kubectl command shown below.
kubectl get pods --all-namespaces
- Below, you can see that the coredns Pod is Pending because the network between the nodes is not yet configured. For cluster networking to work, it must reach Running status.
To fix the networking issue, you need to install a Pod network on the cluster so that your Pods can talk to each other. Let's do that!
Install a Pod network on the cluster
To establish network connectivity between the two nodes, you need to deploy a Pod network on the master node. One of the most widely used Pod networks is Flannel; let's deploy it with the kubectl apply command.
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
- After running this command, you will see the output below.
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
- Now rerun the kubectl commands to verify that both nodes are in Ready status and the coredns Pod is Running.
kubectl get nodes
kubectl get pods --all-namespaces
Customizing the Control Plane
Now you have successfully installed and started a Kubernetes cluster. If you wish to start your cluster in a customized way, use extraArgs inside the configuration file that you pass when running the kubeadm init command. extraArgs overrides the default flags passed to control plane components such as the API server, controller manager, and scheduler.
- To customize the control plane's default configuration, create a config_file.yaml file as below.
# In case of the API server
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.16.0
apiServer:
  extraArgs:
    advertise-address: 192.168.0.103
    anonymous-auth: "false"
    enable-admission-plugins: AlwaysPullImages,DefaultStorageClass
    audit-log-path: /home/johndoe/audit.log

# In case of the scheduler
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.16.0
scheduler:
  extraArgs:
    config: /etc/kubernetes/scheduler-config.yaml
  extraVolumes:
    - name: schedulerconfig
      hostPath: /home/johndoe/schedconfig.yaml
      mountPath: /etc/kubernetes/scheduler-config.yaml
      readOnly: true
      pathType: "File"

# In case of the controller manager
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.16.0
controllerManager:
  extraArgs:
    cluster-signing-key-file: /home/johndoe/keys/ca.key
    bind-address: 0.0.0.0
    deployment-controller-sync-period: "50"
- After you create the YAML file, pass it to the kubeadm init command to customize the control plane's default configuration, as shown below.
kubeadm init --config config_file.yaml
Kubernetes namespaces
Kubernetes supports multiple virtual clusters backed by the same physical cluster. These virtual clusters are called namespaces. Namespaces provide a scope for names: names of resources need to be unique within a namespace, but not across namespaces. Kubernetes namespaces help different projects, teams, or customers share a Kubernetes cluster.
- To list the current namespaces in a cluster, use:
kubectl get namespaces
- To get detailed information about the namespaces, use:
kubectl describe namespaces
- To create a namespace, use:
kubectl create namespace namespace-name
- To delete a namespace, use:
kubectl delete namespaces namespace-name
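Instead of `kubectl create namespace`, a namespace can also be declared in a manifest and applied with `kubectl apply -f`; the name below is an example:

```yaml
# Illustrative declarative equivalent of "kubectl create namespace".
apiVersion: v1
kind: Namespace
metadata:
  name: dev-team   # example namespace name
```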
Highly Available Cluster setup
Now you have a good idea of and knowledge about cluster setup, but it is important to have a highly available cluster in your environment. There are two ways of setting one up.
WAY 1: etcd members are co-located with the control plane nodes (the stacked etcd topology).
WAY 2: etcd runs on nodes separate from the control plane nodes (the external etcd topology).
WAY 1: etcd co-located with the control plane (stacked etcd)
As you learned, a control plane node contains the API server, scheduler, controller manager, and etcd. When etcd is co-located with the control plane, each control plane node runs its own local etcd member, so each control plane gets a dedicated etcd. The stacked etcd is just another layer on each control plane node.
In this case, if a node goes down, both components on it are lost, i.e., the API server and the etcd member. To mitigate this, add more control plane nodes to keep the cluster highly available. This approach requires less infrastructure.
WAY 2: etcd on nodes separate from the control plane (external etcd)
In the second case, etcd runs on nodes separate from the control plane, and the three components, the API server, scheduler, and controller manager, communicate with the external etcd cluster over the network.
In this case, if a control plane node goes down, your etcd is not impacted, giving you an even more highly available environment than stacked etcd, but this approach requires more infrastructure.
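For the external etcd topology, kubeadm can be pointed at the external etcd cluster through the etcd.external section of the ClusterConfiguration. The endpoints and certificate paths below are placeholders for your own etcd nodes and PKI files:

```yaml
# Sketch of a kubeadm ClusterConfiguration using an external etcd cluster;
# addresses and file paths are placeholders.
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
etcd:
  external:
    endpoints:                    # placeholder addresses of the etcd nodes
      - https://10.0.0.10:2379
      - https://10.0.0.11:2379
      - https://10.0.0.12:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
```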