Kubernetes Health Check and Auto Restart
Introduction
When you deploy an application to a production environment, various issues can cause it to stop working. These could be code bugs, database problems, or external service issues. Each problem requires a different solution. However, if you’re using Kubernetes to deploy your application and want it to automatically restart when an issue occurs, this article is for you.
Prerequisites
Before proceeding, ensure you have:
- A Kubernetes cluster set up. You can use Google Kubernetes Engine or set up a local Kubernetes cluster with Kind.
- Knowledge of Kubernetes, specifically how to create Deployments and Services.
Kubernetes Probes
In this article, I'll guide you through using three types of probes to check the status of your application:
1. Startup Probe
- As the name suggests, this probe runs when the application starts. It ensures the container has started successfully. Only after the Startup Probe succeeds do the Readiness and Liveness Probes execute.
2. Readiness Probe
- This probe is similar to the Startup Probe. While the Startup Probe ensures the container has started, it doesn’t mean the application is ready to use. The application might need a successful database connection or other services ready. The Readiness Probe checks these dependencies.
- It runs throughout the container’s lifecycle. Pods that don’t meet the conditions set by the Readiness Probe are removed from the service endpoint and won’t receive traffic. This probe helps direct traffic to ready Pods.
3. Liveness Probe
- This probe checks that the application is still running. It runs throughout the container’s lifecycle, and if it fails `failureThreshold` times in a row, the kubelet kills the container and restarts it according to the Pod’s restart policy.
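To make the Readiness Probe's dependency check more concrete, here is a minimal sketch of the logic a readiness endpoint might use. The `dependencies` object and the `readinessResponse` function are hypothetical names for illustration; it assumes the application records the state of each dependency (for example a database connection) in simple boolean flags.

```javascript
// Hypothetical dependency flags. In a real app these would be updated
// elsewhere, e.g. in database connect/disconnect event handlers.
const dependencies = { database: false, cache: false };

// Returns the HTTP status and body a readiness endpoint could send:
// 200 only when every dependency is up, otherwise 503.
function readinessResponse(deps) {
  const notReady = Object.entries(deps)
    .filter(([, ok]) => !ok)
    .map(([name]) => name);

  return notReady.length === 0
    ? { status: 200, body: "ready" }
    : { status: 503, body: `not ready: ${notReady.join(", ")}` };
}
```

Kubernetes treats any HTTP status from 200 to 399 as a success for an `httpGet` probe, so returning 503 while a dependency is down is enough to keep the Pod out of rotation.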
Example Usage
Below is a Node.js server that exposes the endpoints the probes need:
- /healthz: Used for the Startup probe, executed only when the container starts. After a successful run, it moves on to execute the next probes.
- /readiness: Used for the Readiness probe, executed throughout the application's lifecycle. If it fails, the Pod is not deleted; it is marked unready and removed from the Service's endpoints, so traffic goes only to ready Pods.
- /liveness: Used for the Liveness probe, executed throughout the application's lifecycle. If it fails, the container is restarted.
- /crash: When this API is called, the app will crash. This is used to test auto-restart.
The `setTimeout` call simulates a slow startup. In a real-world scenario, you wouldn't need it.
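A server along these lines could look like the sketch below. It is not the exact code from the original setup: the port `3000`, the 5-second delay, and the `route` helper are assumptions chosen for illustration. It uses only Node's built-in `http` module.

```javascript
const http = require("http");

const PORT = 3000;             // assumed port; must match the probe config
const STARTUP_DELAY_MS = 5000; // assumed delay that simulates a slow startup

// Maps a request path to a response. Kept as a plain function so the
// routing can be checked without starting the server.
function route(path) {
  switch (path) {
    case "/healthz":   // startup probe
    case "/readiness": // readiness probe
    case "/liveness":  // liveness probe
      return { status: 200, body: "ok", crash: false };
    case "/crash":     // used to test auto-restart
      return { status: 500, body: "crashing", crash: true };
    default:
      return { status: 404, body: "not found", crash: false };
  }
}

const server = http.createServer((req, res) => {
  const result = route(req.url.split("?")[0]);
  res.writeHead(result.status, { "Content-Type": "text/plain" });
  // Exit only after the response has been flushed to the client.
  res.end(result.body, () => {
    if (result.crash) process.exit(1);
  });
});

// Until this timer fires, the server is not listening, so every probe
// fails with a refused connection.
const startupTimer = setTimeout(() => {
  server.listen(PORT, () => console.log(`listening on port ${PORT}`));
}, STARTUP_DELAY_MS);
```

Because the process exits with a non-zero code on /crash, the container stops and Kubernetes restarts it, which is the behavior the test at the end of this article relies on.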
Next, create a file named `deployment.yml` with the following content:
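A minimal sketch of such a manifest is shown here. The names (`probe-demo`), the image (`probe-demo:latest`), the container port `3000`, and all threshold values are assumptions for illustration, so adjust them to your own application.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: probe-demo                 # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: probe-demo
  template:
    metadata:
      labels:
        app: probe-demo
    spec:
      containers:
        - name: probe-demo
          image: probe-demo:latest # hypothetical image
          ports:
            - containerPort: 3000  # assumed application port
          startupProbe:
            httpGet:
              path: /healthz
              port: 3000
            failureThreshold: 30
            periodSeconds: 2
          readinessProbe:
            httpGet:
              path: /readiness
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 3
            successThreshold: 1
          livenessProbe:
            httpGet:
              path: /liveness
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 3
---
apiVersion: v1
kind: Service
metadata:
  name: probe-demo
spec:
  type: LoadBalancer               # exposes an EXTERNAL-IP on cloud clusters
  selector:
    app: probe-demo
  ports:
    - port: 80
      targetPort: 3000
```

With these assumed values, the startup probe gives the application roughly 30 × 2 = 60 seconds to start before the container is restarted.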
You'll notice that the configurations for `startupProbe`, `readinessProbe`, and `livenessProbe` are quite similar. Each requires defining `httpGet` which includes the `path` (the API endpoint) and the `port` (the service port).
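- failureThreshold: The number of consecutive failures tolerated before the probe is considered failed. After a failed call, the probe retries after the `periodSeconds` interval. Once `failureThreshold` consecutive attempts have failed, the outcome depends on the probe: for `startupProbe` and `livenessProbe` the container is restarted, while for `readinessProbe` the Pod is marked unready and stops receiving traffic.
- periodSeconds: The interval, in seconds, between probe executions.
- initialDelaySeconds: The delay, in seconds, after the container starts before the probe first runs. Kubernetes accepts it on all three probes; this example sets it only on `readinessProbe` and `livenessProbe`.
- successThreshold: The number of consecutive successes required for the probe to count as passing again after a failure. For `readinessProbe`, the API must pass this many times in a row before the Pod is considered ready to use. For `startupProbe` and `livenessProbe`, the value must be 1.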
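The way `failureThreshold` and `successThreshold` count consecutive results can be illustrated with a small simulation. This is a simplified model for intuition only, not the kubelet's actual implementation; `simulateProbe` and its `initiallyPassing` option are hypothetical names.

```javascript
// Simplified model of probe state: the state flips to failing only after
// `failureThreshold` consecutive failures, and back to passing only after
// `successThreshold` consecutive successes.
function simulateProbe(results, { failureThreshold, successThreshold, initiallyPassing }) {
  let passing = initiallyPassing;
  let successes = 0;
  let failures = 0;
  const timeline = [];

  for (const ok of results) {
    if (ok) {
      successes += 1;
      failures = 0; // a success resets the failure streak
      if (!passing && successes >= successThreshold) passing = true;
    } else {
      failures += 1;
      successes = 0; // a failure resets the success streak
      if (passing && failures >= failureThreshold) passing = false;
    }
    timeline.push(passing);
  }
  return timeline;
}

// One success, three failures, then two successes.
const timeline = simulateProbe(
  [true, false, false, false, true, true],
  { failureThreshold: 3, successThreshold: 2, initiallyPassing: true }
);
console.log(timeline); // [ true, true, true, false, false, true ]
```

In this run, the first two failures are tolerated, the third flips the state to failing, and it takes two successes in a row to recover.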
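After setting up the configurations, apply them to create the resources with `kubectl apply -f deployment.yml`.
Then list the Pods and the Service with `kubectl get pods` and `kubectl get services`. The EXTERNAL-IP column of the Service shows the address to use in the next step.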
Use the EXTERNAL-IP to access the application. Make sure everything is running smoothly by calling the /healthz, /readiness, and /liveness APIs, for example with `curl`. Then call the /crash API to intentionally crash the app and check that the container restarts: the RESTARTS count shown by `kubectl get pods` should go up.
Once the container is running again, you can access the application as usual.
If you have any suggestions or questions about this article, please feel free to leave a comment below!