Is Monitoring a Challenge in Kubernetes?

On 7th June 2014, Google Cloud announced a new application management technology called Kubernetes. Its design was heavily influenced by Google’s Borg system, a cluster manager on which many of the top contributors had previously worked. Eric Brewer, Vice President of Infrastructure at Google, talked about Kubernetes at DockerCon that same month, and the world soon took note.

In November 2014, Google Cloud launched an alpha of Google Kubernetes Engine (GKE), introducing managed Kubernetes. This spurred innovation around the orchestration system and led Google, Red Hat, and many others in the community to increase their investments in Kubernetes.

Why do we need Kubernetes?

Kubernetes is an open-source container-orchestration system used for automating application deployment, scaling, and management. It is platform-independent, so users can run it in the cloud, in hybrid data centers, or on-premises. It was originally designed by Google and is now maintained by the Cloud Native Computing Foundation.

Businesses today use containerized microservices to build small, interconnected application components rather than monolithic apps. Containerization makes it easier to update applications, in contrast to traditional applications that often required complete rewrites. However, each microservice needs to be individually configured, deployed, and monitored, and there are far more microservices than there were applications. Managing each one manually does not scale, so enterprises need to automate the work to keep it maintainable.

As adoption grew, many critical business applications and services moved to microservices, and managing this infrastructure became a huge challenge. Kubernetes then came into the picture, making it easier to create and manage containers and clusters, even at the scale of thousands of containers.

According to the Continuous Intelligence Report for 2019, the percentage of businesses adopting containers has grown to 30%. Some 20% of businesses adopted Kubernetes on AWS alone, while 59% of those are running on a combination of AWS and Google Cloud. For businesses running on any or all of these cloud providers, namely Azure, Amazon, and Google Cloud, the adoption of Kubernetes was up to 80%.

Kubernetes has helped businesses adopt and manage multi-cloud setups. Using Kubernetes, businesses can deploy the same container image across multiple platforms. It is a complete application-level infrastructure, but maintaining, monitoring, and managing it is a bigger challenge. Its nodes, pods, and even complete clusters can all be destroyed and rebuilt to meet the constant demands at the application level. Kubernetes monitoring is all the more crucial because of the rising instances of cloud sprawl. IT teams need to stay vigilant to maintain control over the infrastructure and the costs involved.

Challenges in monitoring

Kubernetes captures different forms of data. Log data from an application provides insight into the processes taking place, while metric data gives insight into the overall experience the application delivers.

Combining log and metric data should give a complete picture of the application, but the combination is not as easy as it may sound. It is often very hard for an organization to connect the metrics from a node to the logs from a pod on that node. The correlation is tricky because the metadata tagging of the collected data might be inconsistent: a metric might be tagged with the pod and cluster it was collected from, while a log might be categorized under a different name. These permutations and combinations are delicate and difficult to untangle.

The traditional monolithic environment only needed a search through a log. But microservices are a whole new ball game. There are one or more logs for each microservice, so one must search through a large number of them. Scanning through numerous logs from several services is time-consuming and often does not lead to the source of the problem.

Tracing headers help in troubleshooting microservices. However, adding these headers requires a code change, which is a task in itself. And even if we know the percentage or number of failed microservices, we still have to go through logs to discover why they failed.
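To make the tagging problem concrete, here is a minimal Python sketch of one way to line up metrics and logs that describe the same pod under different label names. The alias table, the record layout, and the sample data are all assumptions made up for illustration; real collectors will use their own names.

```python
# Hypothetical aliases: different collectors often name the same tag differently.
LABEL_ALIASES = {
    "pod": "pod",
    "pod_name": "pod",
    "kubernetes.pod.name": "pod",
    "cluster": "cluster",
    "cluster_name": "cluster",
}


def normalize_labels(labels):
    """Map collector-specific label names onto one canonical set."""
    return {LABEL_ALIASES.get(key, key): value for key, value in labels.items()}


def correlation_key(labels):
    """Build a (cluster, pod) key from the normalized labels."""
    normalized = normalize_labels(labels)
    return (normalized.get("cluster"), normalized.get("pod"))


def correlate(metrics, logs):
    """Group metric samples and log entries that share a (cluster, pod) key."""
    combined = {}
    for sample in metrics:
        bucket = combined.setdefault(correlation_key(sample["labels"]), {"metrics": [], "logs": []})
        bucket["metrics"].append(sample)
    for entry in logs:
        bucket = combined.setdefault(correlation_key(entry["labels"]), {"metrics": [], "logs": []})
        bucket["logs"].append(entry)
    return combined


# Made-up records: the metric and the log tag the same pod in different ways.
metrics = [{"labels": {"cluster_name": "prod", "pod_name": "web-1"}, "name": "cpu_usage", "value": 0.92}]
logs = [{"labels": {"cluster": "prod", "kubernetes.pod.name": "web-1"}, "message": "request timed out"}]

combined = correlate(metrics, logs)
```

Once both records are normalized they land under the same `("prod", "web-1")` key. The hard part in practice is building and maintaining the alias table, which is exactly the inconsistency described above.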
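The code change in question is usually small but has to be made in every service. As a rough sketch, assuming a hypothetical header name `X-Trace-Id` and plain dictionaries standing in for request headers, a service could reuse the caller's trace ID, pass it on to downstream calls, and stamp it on its own log lines:

```python
import uuid

# Hypothetical header name chosen for this sketch; real tracing systems define their own.
TRACE_HEADER = "X-Trace-Id"


def ensure_trace_id(incoming_headers):
    """Reuse the caller's trace ID, or start a new trace if none was sent.

    Note: this sketch matches the header name exactly, although HTTP header
    names are case-insensitive.
    """
    return incoming_headers.get(TRACE_HEADER) or uuid.uuid4().hex


def outgoing_headers(incoming_headers, extra=None):
    """Build headers for a downstream call that carry the same trace ID."""
    headers = dict(extra or {})
    headers[TRACE_HEADER] = ensure_trace_id(incoming_headers)
    return headers


def log_line(trace_id, message):
    """Prefix a log message with the trace ID so logs can be searched by it."""
    return f"trace_id={trace_id} {message}"


# A request that already carries a trace ID keeps it on the way out.
downstream = outgoing_headers({TRACE_HEADER: "abc123"}, extra={"Accept": "application/json"})
line = log_line(downstream[TRACE_HEADER], "calling inventory service")
```

With the same ID in every service's logs, a search for one trace ID replaces a scan through each log in turn. It still only tells you where a request went, not why it failed.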

Monitoring Kubernetes

Monitoring solutions for Kubernetes are not very different from traditional monitoring tools. Zipkin and Prometheus are among the most widely used APM tools for monitoring and tracing. These tools provide insights into the microservices’ resource consumption and the flow of transactions through the system.

Here are some of the reliable and popular open source tools that can be used while working with Kubernetes:

Kubelet

Kubelet connects the master (API server) and the nodes, where the pods run. It watches PodSpecs via the Kubernetes API server and collects resource utilization statistics along with the status of pods and events. The combined pod resource usage statistics are exposed via a REST API.

Container Advisor

Container Advisor, also known as cAdvisor, auto-discovers all the containers on a machine and collects data about memory, network usage, file system, and CPU. It is an agent that analyzes container resource usage and performance, has native support for Docker containers, and operates at the level of an individual node. It is a tool with limited functionality, so if you are planning complex monitoring, it may not meet your needs.

Kube-state-metrics

Kube-state-metrics listens to the Kubernetes API server and generates metrics about the state of numerous Kubernetes objects, such as configuration maps, nodes, and pods. The metrics are exposed unmodified, in contrast to kubectl, which uses the same Kubernetes API but applies heuristics to display a comprehensible message.
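Because the metrics come out unmodified, reading them is mostly a matter of parsing plain text. The Python sketch below handles a simplified subset of the Prometheus text format (no timestamps, no escaped quotes inside label values) and uses made-up sample lines for the `kube_pod_status_phase` metric; treat it as an illustration, not a full parser.

```python
import re

# Simplified patterns: metric name, optional {labels}, then a numeric value.
SAMPLE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(?P<key>[a-zA-Z_][a-zA-Z0-9_]*)="(?P<value>[^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comments and blanks."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser does not understand
        labels = {m.group("key"): m.group("value") for m in LABEL_RE.finditer(match.group("labels") or "")}
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def pods_in_phase(samples, phase):
    """List pods whose kube_pod_status_phase sample for the given phase is 1."""
    return sorted(
        labels["pod"]
        for name, labels, value in samples
        if name == "kube_pod_status_phase" and labels.get("phase") == phase and value == 1
    )


# Made-up scrape output for illustration.
SCRAPE = """
# TYPE kube_pod_status_phase gauge
kube_pod_status_phase{namespace="default",pod="web-1",phase="Running"} 1
kube_pod_status_phase{namespace="default",pod="web-1",phase="Pending"} 0
kube_pod_status_phase{namespace="default",pod="web-2",phase="Pending"} 1
"""

samples = parse_metrics(SCRAPE)
pending = pods_in_phase(samples, "Pending")
```

Here `pending` ends up listing only `web-2`. Note that nothing is interpreted for you: deciding that a long-pending pod is a problem is left to whatever consumes the metrics.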

Kubernetes dashboard

Kubernetes dashboard provides several features that let developers create and manage various workloads. It covers discovery and load balancing, configuration, storage, and monitoring, but it is only suited to small clusters.

Prometheus

Prometheus is one of the most popular monitoring tools used with Kubernetes, and it stores all its data as time series. This data is queried via the PromQL query language and can be viewed in a built-in expression browser. For richer visualization, the monitoring tool relies on Grafana. Prometheus can be installed directly on the host or as a Docker container. Prometheus Operator assists in installing Prometheus on Kubernetes.

Investing in the right tool is essential for complete monitoring and for keeping the system stable. Many open-source tools facilitate monitoring, managing, tracking, and troubleshooting. Observability has improved with the variety of monitoring tools available. Each of these tools provides insights, but all of them need to be combined to get a complete picture.

Automating the gathering of data and its correlation across multiple tools can help to understand the data better. A correlated dashboard that brings all the different sets of data together can provide a clearer picture of what is taking place in a specific container or pod. A business can get complete insights into the usage of its various microservices and even see how they are performing over time. Application conditions might be dynamic, but with these insights in place, resource utilization is continuously monitored.
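As a rough illustration of querying Prometheus from code, the Python sketch below builds a URL for the HTTP API's instant query endpoint (`/api/v1/query`) and reads an instant-vector response. The server address, the `pod` label, and the canned response are assumptions for the example; no request is actually sent.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def extract_vector(response_text):
    """Turn an instant-vector response into a list of (labels, value) pairs."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector result, got {data['resultType']}")
    # Each item carries its labels and a [timestamp, "value-as-string"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


# Per-pod CPU usage rate over five minutes; the address below is an assumption.
PROMQL = "sum by (pod) (rate(container_cpu_usage_seconds_total[5m]))"
url = build_query_url("http://localhost:9090/", PROMQL)

# Canned response in the shape the API returns, with made-up values.
CANNED = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {"pod": "web-1"}, "value": [1700000000.0, "0.25"]}],
    },
})

cpu_by_pod = extract_vector(CANNED)
```

In a real setup the URL would be fetched with an HTTP client and the response passed to `extract_vector`. The same PromQL expression can also be pasted into the expression browser or a Grafana panel.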

In a nutshell

Kubernetes is the most widely adopted solution for container deployment and management. What makes it more compelling is that it is a completely open-source solution that can be used with any vendor. The monitoring challenge can be reduced with the aforementioned tools, but pinpointing the root cause of a problem is still miles away. With the monitoring tools in place, it is easier to trace the origin of an error in the services. However, developers still need to discover the ‘why’ of the error.
