Kubernetes Health Checks: Liveness, Readiness, and Startup Probes Explained
Kubernetes health probes are one of the most misunderstood features in the ecosystem. Teams configure them cargo-cult style from Stack Overflow snippets and then wonder why their rolling deployments cause brief outages.
Probes keep bad pods from serving traffic, but misconfigured probes can cause more downtime than they prevent. Here is how to get them right.
The three probes and what they actually do
Liveness probe: answers "is this container stuck?" If it fails, the kubelet kills the container and restarts it according to the pod's restart policy. Use this for detecting deadlocks and other hangs the process cannot recover from on its own.
Readiness probe: answers "is this pod ready to receive traffic?" If it fails, the pod is removed from the Service endpoint list but not killed, and it is added back once the probe succeeds again.
Startup probe: answers "has this container finished initializing?" Liveness and readiness checks are held off until it succeeds. Use this for slow-starting applications.
The most common misconfiguration
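Using a liveness probe that checks external dependencies (database, Redis) is a mistake. If your database is down, every pod fails its liveness check and restarts in a loop, which adds load and delays recovery without fixing the underlying outage.
Liveness probes should only check internal health (is the event loop responsive? is a deadlock present?).
Readiness probes are the place to check external dependencies: a pod that cannot reach its database is taken out of rotation but left running, and it rejoins once the dependency recovers.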
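One way to express that split in a container spec is sketched below. The paths `/healthz` and `/ready` are the ones used in this article's examples; what each handler actually checks is an assumption about your application and is only described in the comments.

```yaml
# Sketch only: the checks behind each endpoint are assumptions about your app.
livenessProbe:
  httpGet:
    path: /healthz   # in-process check only: no database, Redis, or downstream calls
    port: 8080
readinessProbe:
  httpGet:
    path: /ready     # may verify the database and Redis connections needed to serve requests
    port: 8080
# Anti-pattern: pointing livenessProbe at /ready would restart every pod
# whenever the database is unreachable.
```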
Configuring probe thresholds
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 20
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5
  failureThreshold: 2
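Set `failureThreshold` high enough to absorb transient errors. Kubernetes acts only after that many consecutive failures, so with the values above a hung container is restarted after roughly 60 seconds (3 failures at a 20-second period), and a pod that stops being ready is removed from the Service after roughly 10 seconds (2 failures at a 5-second period).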
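For context, probes are set per container inside the pod template. The manifest below is a minimal sketch of where the fragment above would sit in a Deployment. The name `example-api`, the image reference, the replica count, and the `timeoutSeconds` values are placeholders chosen for illustration, not recommendations.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-api            # hypothetical name
spec:
  replicas: 3                  # assumed value
  selector:
    matchLabels:
      app: example-api
  template:
    metadata:
      labels:
        app: example-api
    spec:
      containers:
        - name: api
          image: registry.example.com/example-api:1.0.0   # placeholder image
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
            timeoutSeconds: 2      # assumed value; the default is 1 second
            failureThreshold: 3    # restart after roughly 3 x 20 = 60 s of failures
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            periodSeconds: 5
            timeoutSeconds: 2      # assumed value
            failureThreshold: 2    # out of rotation after roughly 2 x 5 = 10 s
```

Each probe attempt that exceeds `timeoutSeconds` counts as a failure, so a timeout that is too tight for a busy endpoint has the same effect as a low `failureThreshold`.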
External monitoring alongside probes
Kubernetes probes operate inside the cluster. Pair them with external uptime monitoring on AlertsDock to catch issues that Kubernetes cannot see — DNS failures, ingress misconfiguration, or cloud load balancer problems.
Startup probes for slow applications
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 10
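This gives the app up to 300 seconds to start (failureThreshold × periodSeconds = 30 × 10). If the probe has still not succeeded by then, the kubelet kills the container and it is restarted according to the pod's restart policy. Once the startup probe succeeds, liveness and readiness probes take over.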
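A startup probe is usually paired with a liveness probe on the same container. The fragment below is a sketch of that pairing, reusing the values from this article. Dropping `initialDelaySeconds` from the liveness probe is an assumption that the startup probe alone covers the slow boot.

```yaml
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 30     # startup budget: up to 30 x 10 = 300 s
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 20
  failureThreshold: 3      # only evaluated after the startup probe has succeeded
  # initialDelaySeconds omitted (assumption): the startup probe already
  # shields the container while it boots.
```

This keeps the liveness settings tight for steady-state operation without forcing them to also tolerate a long startup.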