Kubernetes Series (PART-1)

TABLE OF CONTENTS:

  1. Introduction to Kubernetes
  2. Types of Application Deployment
  3. Features of kubernetes
  4. Docker Swarm v/s Kubernetes
  5. Kubernetes Cluster
  6. Master Node or Control Plane Components
  7. Worker Node Components
  8. Install Kubernetes cluster
  9. Initialize Kubernetes cluster
  10. Install a Pod network on the cluster
  11. Customizing the Control Plane
  12. Kubernetes namespaces
  13. Highly Available Cluster setup

Introduction to Kubernetes

Kubernetes is an open-source container orchestration engine for automating the deployment, scaling, and management of containerized applications. It is an open-source tool that originated at Google. It is also called k8s because there are eight letters between the “K” and the “s”.

Kubernetes is more than just container management: it keeps the load balanced between the cluster nodes, provides a self-healing mechanism (when a container goes down, it is replaced with a new healthy container), offers service discovery (exposing a container using a DNS name or its own IP address), works with a container runtime, supports zero-downtime deployments and automatic rollbacks, and can automatically provision storage such as local storage or storage from public cloud providers.

Types of Application Deployment

Physical Server to Virtualization to Containerization
  • Physical server: Apps running on a physical server had issues with resource allocation. If you had 5 apps running on one physical server and 1 app consumed a lot of CPU or memory, the other 4 apps would suffer. To fix this you would need more physical servers, but that approach is too expensive.
  • Virtualization: With virtualization you can isolate applications. You can run several VMs on a single piece of hardware and deploy an OS and applications on top of each VM. It allows better utilization of resources, scales easily, and saves hardware cost. This technology is still widely used by many companies today.
  • Container deployment: Coming after the virtualization era, containerization offers lightweight and portable deployments. Containers share the same OS kernel, CPU, and memory but have their own file systems, which is why deployments work the same way everywhere, whether on a local machine or on cloud infrastructure.

Features of Kubernetes

Kubernetes is more than just managing containers because of the features it supports, such as load balancing between the cluster nodes, a self-healing container mechanism, service discovery, integration with a container runtime, zero-downtime deployment capabilities, automatic rollback, and more.

It also has the ability to scale when needed, known as autoscaling. You can manage configuration such as secrets or passwords, and automatically mount storage such as EFS or other volumes when required, as sketched below.
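
As a minimal sketch of these features (the deployment name web and the secret name db-pass are hypothetical), autoscaling and secret management can be driven from kubectl:

# Scale the "web" deployment between 2 and 5 replicas based on CPU usage
kubectl autoscale deployment web --min=2 --max=5 --cpu-percent=80
# Store a password as a Kubernetes Secret instead of hard-coding it
kubectl create secret generic db-pass --from-literal=password='S3cr3t!'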

Docker Swarm v/s Kubernetes

  • Docker Swarm uses YAML files to deploy services on nodes; Kubernetes uses Pods and Deployments.
  • Docker Swarm lets users encrypt data between nodes; in Kubernetes all Pods can communicate with each other by default.
  • Docker Swarm has no built-in GUI; Kubernetes provides the Kubernetes Dashboard.
  • Docker Swarm cannot do autoscaling; Kubernetes can autoscale.
  • Docker Swarm is easy to install but the cluster is not as robust; Kubernetes is harder to install but the cluster is very robust.

Kubernetes Cluster

When you deploy/install Kubernetes, you are ultimately creating a Kubernetes cluster that contains mainly two kinds of components: (1) Master nodes and (2) Worker nodes. Nodes are machines with their own Linux environment, which could be either a virtual machine or a physical machine.

The applications and services are deployed in containers inside Pods on the Worker nodes. Pods contain one or more containers, such as Docker containers. When a Pod runs multiple containers, the containers are managed as a single entity and share the Pod’s resources, as in the minimal manifest sketched below.
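
As a minimal sketch (the names web-with-sidecar, web, and log-sidecar are hypothetical), a Pod manifest with two containers that share the Pod’s network and volumes might look like this:

apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar
spec:
  containers:
  # Both containers below run inside the same Pod and share its resources
  - name: web
    image: nginx
    ports:
    - containerPort: 80
  - name: log-sidecar
    image: busybox
    command: ["sh", "-c", "tail -f /dev/null"]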


Master Node or Control Plane Components

The Master Node, or Control Plane, controls the cluster; it holds the state of the cluster and the data inside it. It also responds to cluster events, schedules new Pod deployments, and so on. It contains various components, including the kube-apiserver, etcd storage, kube-controller-manager, and kube-scheduler. First, let’s learn about each of the components present in the Master Node.

  1. Kube-apiserver: The core of the Kubernetes control plane is the API server. It exposes the Kubernetes APIs and acts as a gateway and authenticator. It connects with the Worker nodes and the other control plane components. The Kubernetes API lets you query and manipulate the state of API objects in Kubernetes, such as Pods, Namespaces, ConfigMaps, and Events, backed by the etcd server. Most operations can be performed through the kubectl command-line interface or other command-line tools, such as kubeadm, which in turn use the API. The scheduler monitors the API server regularly.

Example: If a user runs a kubectl command, then User ➜ kube-apiserver (authenticator) ➜ etcd (reads the value) ➜ kube-apiserver

  2. etcd: It stores all the cluster data and information in key-value format, just like a dictionary. It stores cluster state, secrets, configs, pod states, etc. Etcd holds two types of state, the desired state and the current state, for all resources, and Kubernetes works to keep them in sync. When you run a kubectl get command it queries the etcd server (via the API server), and when you run a kubectl create or update command the change is written to the etcd server.
(etcd is a key-value store, not a tabular/relational database.)
  3. Kube-scheduler: It schedules new Pods and containers according to the health of the cluster and the Pods’ resource demands, such as CPU or memory, before allocating the Pods to worker nodes of the cluster. The scheduler monitors the API server regularly. Whenever the controller manager finds any discrepancy in the cluster, the request is forwarded to the scheduler via the kube-apiserver.

Example: If there is any change in a node, or a Pod is created without an assigned node, then: Scheduler ➜ kube-apiserver ➜ etcd (updates info) ➜ kube-apiserver ➜ kubelet (passes info) ➜ assigns the node to the Pod ➜ kube-apiserver ➜ etcd (updates info)

  4. Kube-controller-manager: It runs the controller processes of the control plane. Kubernetes comes with a set of built-in controllers that run inside the kube-controller-manager. These built-in controllers provide important core behaviors.
    • Node controller: Checks the status of nodes, noticing when nodes go up and down.
    • Replication controller: Ensures the correct number of Pods is running for each replication group.
    • Endpoints controller: Populates the Endpoints objects, joining Services and Pods.
    • Service account and token controllers: Create default accounts and API access tokens for new namespaces.

In Kubernetes, controllers are control loops that watch the state of your cluster, then make or request changes where needed. Each controller tries to move the current cluster state closer to the desired state.
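
On a kubeadm-provisioned cluster, the control plane components described above run as static Pods in the kube-system namespace, so you can see them with kubectl (a sketch; the exact Pod names depend on your hostname and setup):

# List the control plane components (kube-apiserver, etcd, kube-scheduler,
# kube-controller-manager) running as Pods in the kube-system namespace
kubectl get pods -n kube-system -o wide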

Worker Node Components

A Worker Node is also part of a Kubernetes cluster and is used to run and manage containerized applications. The worker node performs actions when a request is triggered from the kube-apiserver, which is part of the Master Node. Each node is managed by the control plane (Master Node) and contains the services necessary to run Pods.

The Worker node contains various components, including the kubelet, kube-proxy, and a container runtime. These node components run on every node and maintain the details of all running Pods.

  1. Kubelet: The kubelet is an agent that runs on each worker node and manages all containers in the Pods. It makes sure that each worker node communicates with the Kubernetes API server (and, through it, with etcd) to ensure that the containers in a Pod are running and match the state stored in etcd.
  2. Kube-proxy: Kube-proxy is a networking component that runs on each worker node in your Kubernetes cluster and forwards traffic, handling network communication both inside and outside the cluster.
  3. Container runtime: This is the runtime responsible for running the containers (for example dockerd, containerd, CRI-O, or rkt). You can check these components as sketched below.
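
As a quick sketch (assuming a systemd-based Linux distribution and a kubeadm setup where kube-proxy runs as a DaemonSet), you could verify these worker components like this:

# Check that the kubelet service is running on the node
systemctl status kubelet
# List the kube-proxy Pods that run on every node
kubectl get pods -n kube-system -o wide | grep kube-proxy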

Worker Nodes are added either by the kubelet service through self-registration or manually by a user. Let’s look at the creation of a Node object from the following JSON.

{
  "kind": "Node",
  "apiVersion": "v1",
  "metadata": {
    "name": "10.240.79.157",
    "labels": {
      "name": "my-first-k8s-node"
    }
  }
}
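
As a sketch, assuming the manifest above is saved as node.json (a hypothetical file name), the Node object can be created through the API server with kubectl; Kubernetes then typically health-checks it and reports it as not ready until a kubelet with that name actually registers:

# Create the Node object from the JSON manifest
kubectl apply -f node.json
# Inspect the newly created Node
kubectl describe node 10.240.79.157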

Connectivity between Master Node to Worker and Vice versa

Worker to Master Node

  • The apiserver is configured to listen for remote connections on a secure HTTPS port with client authentication enabled.
  • Worker Nodes should be provisioned with the public root certificate for the cluster such that they can connect securely to the apiserver along with valid client credentials.
  • The recommended approach is for the client credentials provided to the kubelet to be in the form of a client certificate.
  • Pods that wish to connect to the apiserver can do so securely by leveraging a service account, as sketched below.
  • The kubernetes service uses a virtual IP address that is redirected (via kube-proxy) to the HTTPS endpoint on the apiserver.
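
As a minimal sketch of the service-account path, from inside any Pod the mounted service account token and CA certificate can be used to call the apiserver through the kubernetes.default.svc virtual IP (whether the request is authorized depends on the RBAC rules bound to that service account):

# Read the service account token mounted into the Pod
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
# Call the API server securely using the cluster CA and the bearer token
curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
     -H "Authorization: Bearer $TOKEN" \
     https://kubernetes.default.svc/api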

Master Node to Worker

  • The first way is the apiserver-to-kubelet connection, which is used to fetch logs for Pods, attach to running Pods, and provide the kubelet’s port-forwarding functionality, as in the commands sketched below.
  • The second way is the apiserver connection to nodes, pods, and services, which can run over the HTTPS secure port.
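
The following everyday kubectl commands all travel over that apiserver-to-kubelet connection (replace <pod-name> with one of your Pods; the port numbers are only examples):

# Fetch logs for a Pod (apiserver ➜ kubelet ➜ container logs)
kubectl logs <pod-name>
# Attach an interactive shell to a running Pod
kubectl exec -it <pod-name> -- /bin/sh
# Forward local port 8080 to port 80 inside the Pod
kubectl port-forward <pod-name> 8080:80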

Install Kubernetes cluster

By now, you should have a good theoretical understanding of Kubernetes and the Kubernetes cluster. Let’s understand it practically as well by installing a Kubernetes cluster.

Prerequisites

  • Two Linux machines, one for the Master/Control Plane node and the other for the worker node.
  • A container runtime on both machines. There are three common container runtimes: Docker, containerd, and CRI-O. CRI-O is an implementation of the Kubernetes CRI (Container Runtime Interface) that enables using OCI (Open Container Initiative) compatible runtimes, and it is a lightweight alternative to using Docker as the runtime for Kubernetes.
  1. Once Docker is set up on both machines, run the service docker status command to verify it is running.
service docker status
  2. Next, on both Linux machines, make sure the inbound and outbound firewall rules are open to the world, as this is only a demonstration. In a production environment you have to open the specific ports for the control plane and worker nodes as mentioned below (for example with ufw, as sketched after this list). For this tutorial feel free to skip this step.
    • For the Master node: 6443, 10250, 10251, 10252, 2379, 2380 [All Inbound]
    • For the Worker node: 30000-32767 [All Inbound]
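
As a sketch, assuming ufw is the firewall in use on Ubuntu (cloud security groups would instead be configured in the provider’s console), the ports above could be opened like this:

# On the Master node
sudo ufw allow 6443/tcp
sudo ufw allow 2379:2380/tcp
sudo ufw allow 10250:10252/tcp
# On the Worker node
sudo ufw allow 10250/tcp
sudo ufw allow 30000:32767/tcp
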
  3. Install the apt-transport-https and curl packages using the apt-get install command. The apt-transport-https package allows the use of repositories accessed via the HTTP Secure protocol, and curl allows you to transfer data to or from a server, download files, etc.
sudo apt-get update && sudo apt-get install -y apt-transport-https curl
  4. Add the GPG key for the official Kubernetes repository to your system using the curl command.

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
  5. Add the Kubernetes repository to APT sources and update the system.
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
sudo apt-get update
  6. Now install kubectl (which manages the cluster), kubeadm (which bootstraps the cluster), and kubelet (which manages Pods and containers) on both machines, and hold them at their installed versions.
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

If you don’t specify the runtime, kubeadm automatically detects an installed container runtime. For the Docker runtime the path to the Unix socket is /var/run/docker.sock, and for containerd it is /run/containerd/containerd.sock. You can also point kubeadm at a runtime explicitly, as sketched below.
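
As a sketch, if you want to pin kubeadm to a specific runtime rather than rely on auto-detection, the --cri-socket flag can be added to kubeadm init or kubeadm join (depending on your kubeadm version, the unix:// scheme prefix may or may not be required):

# Explicitly select containerd instead of relying on auto-detection
sudo kubeadm init --cri-socket unix:///run/containerd/containerd.sock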

Initialize Kubernetes cluster

Now you have Kubernetes installed on both your master node and worker node, but unless you initialize it, it does nothing. Kubernetes is initialized on the master node, so let’s do that.

  • Initialize your cluster using the kubeadm init command on the Master node, i.e. the control plane node.

In the below command, --apiserver-advertise-address is the address the API server advertises, which is your master node itself, so use the IP address of your Master node. --pod-network-cidr specifies the range of IP addresses for the pod network; if set, the control plane will automatically allocate CIDRs for every node.

kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.111.4.79
MASTER NODE STARTS THE CLUSTER AND ASKS YOU TO JOIN YOUR WORKER NODE
  • Once your master node (the control plane) is initialized, run the below commands on the Master Node so a regular user can manage the Kubernetes cluster.
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
  • Next, you need to join the worker node by running the kubeadm join command provided in the output of the kubeadm init command, as shown below.
kubeadm join IP:6443 --token .......................................
  • After running the command, the worker node joins the control plane successfully.
WORKER NODE JOINS THE CLUSTER
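
If you lose the join command or the token expires (tokens are valid for 24 hours by default), you can generate a fresh one on the master node:

# Print a new, complete kubeadm join command for worker nodes
kubeadm token create --print-join-command
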
  • Now, verify the nodes on your master node by running the kubectl command as below.
kubectl get nodes
  • You will notice that the status of both nodes is NotReady because no networking is configured between the nodes yet. To check the cluster’s pods, including the networking-related ones, run the kubectl command shown below.
kubectl get pods --all-namespaces
  • Below, you can see that the coredns pods are in Pending status; they will stay Pending until a pod network add-on is installed, and they must reach Running status for cluster DNS between the nodes to work.

To fix the networking issue, you will need to install a Pod network on the cluster so that your Pods can talk to each other. Let’s do that!

Install a Pod network on the cluster

To establish network connectivity between the two nodes, you need to deploy a pod network from the Master node, and one of the most widely used pod networks is Flannel. Let’s deploy it with the kubectl apply command.

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
  • After running this command you will see the output below.
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
  • Now rerun the kubectl commands to verify that both nodes are in Ready status and the coredns pods are Running.
kubectl get nodes
kubectl get pods --all-namespaces
NETWORKING IS ALMOST SET NOW

Customizing the Control Plane

Now you have successfully installed and started a Kubernetes cluster, but if you wish to start your cluster in a customized way, use extraArgs inside the configuration file that you pass while running the kubeadm init command. These extraArgs override the default flags passed to control plane components such as the APIServer, ControllerManager, and Scheduler in the cluster.

  • To customize your control plane’s default configuration, create a config_file.yaml file as below.
# In case of APIServer

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.16.0
apiServer:
  extraArgs:
    advertise-address: 192.168.0.103
    anonymous-auth: "false"
    enable-admission-plugins: AlwaysPullImages,DefaultStorageClass
    audit-log-path: /home/johndoe/audit.log



# In case of Scheduler

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.16.0
scheduler:
  extraArgs:
    config: /etc/kubernetes/scheduler-config.yaml
  extraVolumes:
    - name: schedulerconfig
      hostPath: /home/johndoe/schedconfig.yaml
      mountPath: /etc/kubernetes/scheduler-config.yaml
      readOnly: true
      pathType: "File"

# In case of ControllerManager

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.16.0
controllerManager:
  extraArgs:
    cluster-signing-key-file: /home/johndoe/keys/ca.key
    bind-address: 0.0.0.0
    deployment-controller-sync-period: "50"

  • After you create the YAML file, pass it to the kubeadm init command to customize the control plane’s default configuration, as shown below.
kubeadm init --config config_file.yaml

Kubernetes namespaces

Kubernetes supports multiple virtual clusters backed by the same physical cluster. These virtual clusters are called namespaces. Namespaces provide a scope for names. Names of resources need to be unique within a namespace, but not across namespaces. Kubernetes namespaces help different projects, teams, or customers to share a Kubernetes cluster. Namespaces are a way to divide cluster resources between multiple users.

When you create a Service, it creates a corresponding DNS entry. This entry is of the form <service-name>.<namespace-name>.svc.cluster.local, which means that if a container only uses <service-name>, it will resolve to the Service that is local to its own namespace, as in the example below.
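
As a small sketch (assuming a hypothetical Service named db in a namespace named dev), name resolution from inside a Pod behaves like this:

# From a Pod in the "dev" namespace, the short name resolves locally
nslookup db
# From a Pod in any other namespace, use the fully qualified name
nslookup db.dev.svc.cluster.local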

Most Kubernetes resources (e.g. Pods, Services, replication controllers, and others) live in some namespace.

  • To list the current namespaces in a cluster:
kubectl get namespaces
  • To get detailed information about the namespaces:
kubectl describe namespaces
  • To create a namespace using the kubectl command:
kubectl create namespace namespace-name
  • To delete a namespace using the kubectl command:
kubectl delete namespaces namespace-name
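
A short sketch of using a namespace end to end (the names dev and nginx are hypothetical):

# Create a namespace, run a Pod inside it, and list the Pods in it
kubectl create namespace dev
kubectl run nginx --image=nginx -n dev
kubectl get pods -n dev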

Highly Available Cluster setup

Now you have a good idea of and knowledge about cluster setup, but it is also important to have a highly available cluster in your environment. There are two topologies for setting up a highly available cluster.

WAY 1: etcd members are co-located with the control plane nodes (stacked etcd topology).

WAY 2: etcd runs on separate nodes from the control plane nodes (external etcd topology).

WAY 1: etcd co-located with the control plane

As you learned, a control plane node contains the API server, scheduler, controller manager, and etcd. In the stacked topology, etcd is co-located with the control plane: each control plane node runs its own local etcd member, and the API server, scheduler, and controller manager on that node communicate with it. The stacked etcd is just like another layer on top of each control plane node.

In this case, if a node goes down, both components on it go down, i.e. the API server and the etcd member. To mitigate this, add more control plane nodes to make the cluster highly available. This approach requires less infrastructure.

WAY 2: etcd running on separate nodes from the control plane

In the second case, etcd runs on nodes separate from the control plane: the API server, scheduler, and controller manager communicate with an external etcd cluster.

In this case, if a control plane node goes down, your etcd is not impacted, so you have a more highly available environment than with stacked etcd, but this approach requires more infrastructure. A sketch of bootstrapping an HA control plane with kubeadm is shown below.
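
As a sketch (LOAD_BALANCER_DNS is a placeholder for the address of a load balancer that sits in front of your control plane nodes), kubeadm supports both topologies; the first control plane node is initialized with a shared endpoint and additional control plane nodes then join it:

# Initialize the first control plane node behind a shared load balancer endpoint
sudo kubeadm init --control-plane-endpoint "LOAD_BALANCER_DNS:6443" --upload-certs
# Additional control plane nodes then run the "kubeadm join ... --control-plane"
# command printed in the output above (token and certificate key omitted here).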
