Monitoring · 30 January 2025 · 5 min read

Database Monitoring: The Metrics That Actually Matter

Your application can be fully healthy while your database silently accumulates long-running queries, fills up its connection pool, or grows past the storage threshold. Database monitoring isn't optional — it's what catches the class of failures that application monitoring misses.


The 6 metrics that matter

1. Query latency (p95/p99): the percentile matters more than the average. A 500ms average can hide 10-second outliers.

2. Connection pool utilization: at 80%+ you're at risk of connection-refused errors. At 100% your app starts timing out.

3. Replication lag (if using replicas): lag >10s means reads from replicas are noticeably stale. Lag that keeps growing over time signals a problem.

4. Slow query count: the number of queries per minute that take longer than 1s. The trend matters more than the absolute count.

5. Disk usage %: alert at 70%, act before 85%.

6. Lock wait time: excessive lock contention is a sign of schema or query design problems.
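To make the first metric concrete, here is a minimal sketch of how an average can look tolerable while the tail is not. The latency samples are invented for illustration, and `percentile` is a hypothetical helper using the nearest-rank method, not a function from any particular monitoring tool.

```python
import math
import statistics

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Hypothetical latencies in milliseconds: 95 fast queries plus 5 ten-second outliers.
latencies_ms = [50] * 95 + [10_000] * 5

mean_ms = statistics.mean(latencies_ms)   # 547.5
p95_ms = percentile(latencies_ms, 95)     # 50
p99_ms = percentile(latencies_ms, 99)     # 10000

print(f"mean={mean_ms}ms p95={p95_ms}ms p99={p99_ms}ms")
```

With this made-up data the mean sits near 500ms and even p95 looks healthy, while p99 exposes the 10-second queries. That is one reason to track more than a single percentile.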

External database health endpoint

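Create a lightweight health endpoint that runs a simple query:

```python
from flask import Flask  # assumes a Flask app, as the route decorator suggests

app = Flask(__name__)
# `db` is assumed to be your application's existing database handle.

@app.route('/health/db')
def db_health():
    try:
        db.execute('SELECT 1')  # cheapest possible round trip to the database
        return {'status': 'ok'}, 200
    except Exception as e:
        return {'status': 'error', 'detail': str(e)}, 503
```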

Monitor this endpoint on AlertsDock with a 1-minute interval.

Monitoring database backups

Use a heartbeat cron monitor to verify backups run:

```bash
# In your backup script:
set -o pipefail  # fail the pipeline if pg_dump fails, not only if gzip does
pg_dump mydb | gzip > backup.gz && \
  curl -fsS https://alertsdock.com/ping/{uuid}
```

If the ping doesn't arrive, the backup either failed or never ran, and you'll know within the configured interval.

Query performance baselining

Record p99 query latency daily. When a deploy causes a regression, you have historical data to compare against and can roll back with confidence.
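One way to use that history is to compare the post-deploy p99 against the median of recent daily readings. The sketch below is an assumption about how such a check might look: `is_regression`, the 1.5x tolerance, and the sample readings are all hypothetical and should be tuned to your own workload.

```python
import statistics

def is_regression(baseline_p99_ms, current_p99_ms, tolerance=1.5):
    """Return True when the current p99 exceeds the baseline median by the tolerance factor."""
    if not baseline_p99_ms:
        raise ValueError("baseline needs at least one daily reading")
    baseline = statistics.median(baseline_p99_ms)
    return current_p99_ms > baseline * tolerance

# Hypothetical daily p99 readings (ms) from the week before a deploy.
last_week_p99_ms = [180, 175, 190, 185, 178, 182, 188]

regressed = is_regression(last_week_p99_ms, current_p99_ms=420)  # well above baseline
steady = is_regression(last_week_p99_ms, current_p99_ms=195)     # within tolerance

print(f"regressed={regressed} steady={steady}")
```

The median is used here rather than the mean so that a single bad day in the baseline window does not shift the comparison point.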

Storage growth monitoring

Storage failures give you warning signs. Watch the growth rate: if your database is growing 5% per week, you can estimate when you'll hit 85% capacity and provision more storage in advance.
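The projection can be sketched as a compound-growth calculation. This assumes the growth rate stays constant and that disk capacity is fixed. The function name `weeks_until_threshold` and the 60% starting point are illustrative, not taken from any tool.

```python
import math

def weeks_until_threshold(used_pct, weekly_growth, threshold_pct=85.0):
    """Weeks until usage reaches threshold_pct, assuming compound growth of weekly_growth per week."""
    if used_pct <= 0 or weekly_growth <= 0:
        raise ValueError("used_pct and weekly_growth must be positive")
    if used_pct >= threshold_pct:
        return 0.0
    # Solve used_pct * (1 + weekly_growth) ** weeks == threshold_pct for weeks.
    return math.log(threshold_pct / used_pct) / math.log(1 + weekly_growth)

# Hypothetical disk at 60% today, growing 5% per week.
weeks = weeks_until_threshold(used_pct=60.0, weekly_growth=0.05)
print(f"about {weeks:.1f} weeks until 85%")
```

Under these assumptions a disk at 60% reaches the 85% line in roughly seven weeks. Real growth is rarely perfectly steady, so it is worth re-running the estimate as new measurements come in.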


AlertsDock Team
