In this article, I will demonstrate how to build an elastic Airflow cluster that scales out under high load and scales in safely when the load drops below a threshold.

Scroll to setup if you want to test it out first.

Autoscaling in Kubernetes is supported via the Horizontal Pod Autoscaler (HPA). With HPA, scaling out is straightforward: HPA increases the replicas of a deployment, and additional workers are created to share the workload. Scaling in, however, is where the problem arises. The scale-in process selects the pods to terminate by ranking them based on their co-location on a node, so a worker that is still executing a task can be picked for termination.
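To make the scale-in problem concrete, here is a small toy sketch. It is not Kubernetes's actual ranking logic and not necessarily the approach taken in the article; the `Worker` record, its `busy` flag, and both selection functions are hypothetical names I made up for illustration. It only shows how a placement-based choice can pick a busy worker, while a workload-aware choice would not.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    node: str
    busy: bool  # hypothetical flag: True while the worker is running a task


def pick_by_colocation(workers, count):
    """Toy stand-in for placement-based ranking: prefer pods on crowded nodes."""
    per_node = Counter(w.node for w in workers)
    ranked = sorted(workers, key=lambda w: (-per_node[w.node], w.name))
    return [w.name for w in ranked[:count]]


def pick_idle_only(workers, count):
    """Workload-aware alternative: only idle workers are eligible for removal."""
    idle = [w for w in workers if not w.busy]
    return [w.name for w in idle[:count]]


workers = [
    Worker("worker-0", "node-a", busy=True),
    Worker("worker-1", "node-a", busy=False),
    Worker("worker-2", "node-b", busy=False),
]

print(pick_by_colocation(workers, 1))  # selects worker-0 even though it is busy
print(pick_idle_only(workers, 1))      # selects worker-1, which is idle
```

The placement-based choice never looks at `busy`, which is why a running task can be interrupted.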

In this article, I will walk you through a demo of a horizontally autoscaling Redis Cluster built with a Kubernetes HPA configuration.


  • I am using minikube for demo purposes, but the code and YAML can be used in a Kubernetes cluster with little to no modification.
  • This is not a production-ready setup. It is only a prototype.


Redis Cluster provides a way to shard data automatically across multiple nodes. This enables properties like horizontal scaling, high availability, and partition tolerance in a system. More info about Redis Cluster is here.
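As a rough illustration of how this sharding works: Redis Cluster divides the key space into 16384 hash slots and maps each key to a slot using CRC16 of the key modulo 16384, with the part between `{` and `}` (a hash tag) hashed instead when it is present and non-empty. The sketch below follows that rule as I understand it from the Redis Cluster specification; the function names `crc16_xmodem` and `key_slot` are my own.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (the XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots, honouring hash tags."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' is used.
        if end != -1 and end > start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


# Keys sharing a hash tag land in the same slot, and therefore on the same node.
print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```

Each node in the cluster owns a subset of the slots, so adding a node means moving some slots (and their keys) to it.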

The Horizontal Pod Autoscaler automatically scales the pods in a deployment, replication controller, replica set, or stateful set based on current metrics. Metrics can be CPU, memory, or custom metrics based on application properties. …
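For a feel of how the replica count follows a metric, the Kubernetes documentation describes the core calculation as the current replica count multiplied by the ratio of the current metric value to the target value, rounded up, with no action taken when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below is a simplified version of that rule; the function name is my own, and it leaves out the min/max replica bounds and stabilization behaviour.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Simplified HPA rule: ceil(current_replicas * current_metric / target_metric)."""
    ratio = current_metric / target_metric
    # Skip scaling when the metric is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


print(desired_replicas(4, current_metric=90, target_metric=60))  # scale out
print(desired_replicas(4, current_metric=30, target_metric=60))  # scale in
print(desired_replicas(4, current_metric=63, target_metric=60))  # within tolerance
```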

Redis is an in-memory key-value data store that is used as a database, cache, and message broker.

In this post, we will look at how to deploy a Redis HA cluster in Kubernetes.

Why High Availability?

Reliability is one of the key fundamentals of system design. It means that our system continues to work correctly even when a fault occurs.

One of the most common faults is a server fail-stop. The most intuitive way to tolerate this fault is to replicate the service over multiple servers and provide a coherent interface to clients, abstracting away the underlying complexity of the system. Therefore, if any server goes down, the request can be served by another server. …
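The idea of replicas behind one coherent interface can be sketched in a few lines. This is a toy model under my own assumptions, not how Redis HA is implemented: `Replica`, `ReplicaDown`, and `ReplicatedService` are hypothetical names, and a failed server is simulated with an `alive` flag.

```python
class ReplicaDown(Exception):
    """Raised when a simulated server has fail-stopped."""


class Replica:
    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive

    def handle(self, request):
        if not self.alive:
            raise ReplicaDown(self.name)
        return f"{self.name} served {request}"


class ReplicatedService:
    """Single interface for clients; hides which replica actually answers."""

    def __init__(self, replicas):
        self.replicas = replicas

    def handle(self, request):
        # Try each replica in turn and skip the ones that are down.
        for replica in self.replicas:
            try:
                return replica.handle(request)
            except ReplicaDown:
                continue
        raise RuntimeError("all replicas are down")


service = ReplicatedService([Replica("server-1", alive=False), Replica("server-2")])
print(service.handle("GET /health"))  # server-2 served GET /health
```

The client only ever calls `service.handle`, so the failure of `server-1` is invisible to it.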

We know that building a deep learning model for real-world problems requires lots of training data, and as problems grow in complexity, the model’s complexity and the amount of training data usually grow too. This means training time on a single machine will increase and become unacceptable when we want to quickly build and evaluate multiple models with different hyperparameters.

Adding more GPUs to a machine is always an option; however, there is a limit to scaling a machine vertically. There comes a point when scaling out horizontally makes more sense and gives more throughput.

In this article, I will set up and run a demo of PyTorch distributed training on a minikube cluster. …



As the name suggests, Natural Language Processing (NLP) is the processing of the languages we humans speak to communicate with each other. NLP is the practice of understanding and deriving knowledge from these languages so that a computer can perform tasks just by understanding what we speak or write, as if we were speaking to another human being.

Let’s say we are given the sentence “A cop is chasing a man in the streets”. In NLP, we generally perform the following steps to collect the varied meanings of this sentence:

  • Lexical Analysis: It is also called POS Tagging. This entails marking each word in a sentence as a noun, verb, adjective, etc. …
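To show what POS tagging produces for the example sentence, here is a deliberately tiny sketch. It uses a hand-written lookup table (`TOY_LEXICON`, a hypothetical name) with Universal POS style labels, which is only enough for this one sentence; real taggers rely on trained models and the surrounding context rather than a fixed dictionary.

```python
# Hand-written lexicon covering only the words of the example sentence.
TOY_LEXICON = {
    "a": "DET", "the": "DET",
    "cop": "NOUN", "man": "NOUN", "streets": "NOUN",
    "is": "AUX", "chasing": "VERB",
    "in": "ADP",
}


def tag(sentence):
    """Pair each whitespace-separated word with its tag, or UNK if unknown."""
    return [(word, TOY_LEXICON.get(word.lower(), "UNK")) for word in sentence.split()]


for word, pos in tag("A cop is chasing a man in the streets"):
    print(f"{word:8} {pos}")
```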


Sarwesh Suman

Senior Software Engineer @WalmartLabs
