🧠 Top 20 PostgreSQL Health Check Scripts (with Explanations)

PostgreSQL performs beautifully — but only when it’s healthy.
This collection of 20 advanced health check scripts helps you monitor CPU usage, locks, replication, I/O, cache, and more.

Each script includes its skill level, filename, and usage tips.
You can automate or schedule them to run daily for continuous insights.

1. Database Size Overview 🟢 Beginner

Filename: db_size_overview.sql

Usage: List total size of all databases.

SELECT datname AS database_name,
       pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;

Explanation: Quickly identifies which databases consume the most storage.

2. Largest Tables 🟢 Beginner

Filename: largest_tables.sql

Usage: Detect tables growing too fast.

SELECT relname AS table_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 20;

Tip: Big tables may need partitioning or archiving.

3. Index Usage Efficiency 🟡 Intermediate

Filename: index_usage_efficiency.sql

Usage: Evaluate how often indexes are used.

SELECT relname AS table_name,
       100 * idx_scan / (seq_scan + idx_scan + 1) AS index_usage_pct,
       idx_scan, seq_scan
FROM pg_stat_user_tables
ORDER BY index_usage_pct ASC
LIMIT 20;

Insight: Low index usage (<30%) → add or tune indexes.

4. Unused Indexes 🟡 Intermediate

Filename: unused_indexes.sql

Usage: Find indexes that are never used.

SELECT schemaname, relname AS table_name, indexrelname AS index_name, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname;

Recommendation: Remove unused indexes to speed up writes.

5. Table Bloat Estimate 🟡 Intermediate

Filename: table_bloat_estimate.sql

Usage: Detect wasted space due to dead tuples.

SELECT schemaname, relname,
       pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) AS wasted_space
FROM pg_catalog.pg_statio_user_tables
ORDER BY (pg_total_relation_size(relid) - pg_relation_size(relid)) DESC
LIMIT 10;

Explanation: High bloat = vacuum/autovacuum misconfiguration.

6. Autovacuum Activity 🟢 Beginner

Filename: autovacuum_activity.sql

SELECT relname AS table_name, n_tup_ins, n_tup_upd, n_tup_del,
       n_dead_tup, last_autovacuum
FROM pg_stat_all_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

Tip: Old autovacuum timestamps = potential bloat risk.

7. Long-Running Queries 🟡 Intermediate

Filename: long_running_queries.sql

SELECT pid, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state = 'active' AND now() - query_start > interval '5 minutes'
ORDER BY duration DESC;

Explanation: Long queries block vacuum and slow down others.

8. Blocking Queries 🔴 Advanced

Filename: blocking_queries.sql

SELECT blocked.pid AS blocked_pid,
       blocked.query AS blocked_query,
       blocking.pid AS blocking_pid,
       blocking.query AS blocking_query,
       now() - blocked.query_start AS blocked_duration
FROM pg_locks blocked_locks
JOIN pg_stat_activity blocked ON blocked.pid = blocked_locks.pid
JOIN pg_locks blocking_locks
  ON blocking_locks.transactionid = blocked_locks.transactionid
 AND blocking_locks.pid != blocked_locks.pid
JOIN pg_stat_activity blocking ON blocking.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;

Insight: Identifies who’s blocking whom — great for debugging “hangs.”

9. Replication Lag 🟡 Intermediate

Filename: replication_lag.sql

SELECT application_name, state, sync_state,
       pg_current_wal_lsn() - replay_lsn AS replication_lag
FROM pg_stat_replication;

Tip: Monitor replica lag to prevent stale reads.

10. Cache Hit Ratio 🟢 Beginner

Filename: cache_hit_ratio.sql

SELECT sum(blks_hit) / nullif(sum(blks_hit) + sum(blks_read), 0) AS cache_hit_ratio
FROM pg_stat_database;

Ideal: >99% cache hit ratio = efficient memory usage.

11. CPU Usage per Backend 🔴 Advanced

Filename: cpu_usage_per_backend.sql

SELECT pid, usename, application_name,
       round((EXTRACT(EPOCH FROM (now() - backend_start))/60)::numeric, 2) AS runtime_min,
       state, query
FROM pg_stat_activity
WHERE now() - backend_start > interval '1 minute'
ORDER BY runtime_min DESC;

Note: PostgreSQL doesn’t directly expose CPU%, but long runtimes often correlate with CPU-heavy tasks.

12. OS CPU Load (Accurate via pg_stat_kcache)

Filename: os_cpu_load.sql

Level: 🔴 Advanced (requires pg_stat_kcache)

Requires: CREATE EXTENSION pg_stat_kcache;

SELECT datname,
       round(sum(cpu_user_time + cpu_sys_time) / 1000, 2) AS total_cpu_msec,
       round(sum(cpu_user_time) / 1000, 2) AS user_cpu_msec,
       round(sum(cpu_sys_time) / 1000, 2) AS sys_cpu_msec,
       sum(reads) AS total_reads,
       sum(writes) AS total_writes
FROM pg_stat_kcache
JOIN pg_database db ON db.oid = pg_stat_kcache.datid
GROUP BY datname
ORDER BY total_cpu_msec DESC
LIMIT 10;

Explanation:

cpu_user_time: → User-space CPU time used by backend queries.

cpu_sys_time: → Kernel-level CPU time used by PostgreSQL processes.

reads/writes: → I/O operations. This gives a real CPU usage breakdown per database, not an approximation.

Usage Tip: To interpret total CPU percentage, compare with your system’s total cores × clock time over the same interval.

13. Connection Saturation 🟡 Intermediate

Filename: connection_saturation.sql

SELECT count(*) AS total_connections,
       count(*) FILTER (WHERE state = 'active') AS active_connections,
       setting::int AS max_connections,
       round(100.0 * count(*) / setting::int, 2) AS pct_used
FROM pg_stat_activity, pg_settings
WHERE name = 'max_connections'
GROUP BY setting;

Alert: 80% usage → tune max_connections or add pooling.

14. Full I/O Cache Efficiency (tables + indexes + toast) 🔴 Advanced

Filename: io_wait_monitor.sql

SELECT relname AS table_name,
       heap_blks_hit + idx_blks_hit + toast_blks_hit + tidx_blks_hit AS blks_hit,
       heap_blks_read + idx_blks_read + toast_blks_read + tidx_blks_read AS blks_read,
       round(
         100 * (heap_blks_hit + idx_blks_hit + toast_blks_hit + tidx_blks_hit)
         / nullif(
             (heap_blks_hit + idx_blks_hit + toast_blks_hit + tidx_blks_hit)
             + (heap_blks_read + idx_blks_read + toast_blks_read + tidx_blks_read), 0
           ),
         2
       ) AS io_cache_efficiency
FROM pg_statio_user_tables
ORDER BY io_cache_efficiency ASC
LIMIT 10;

Explanation:

heap_blks_hit/read → data blocks

idx_blks_hit/read → index blocks

toast_blks_hit/read → TOAST (large-value) blocks

tidx_blks_hit/read → TOAST index blocks

This version gives a true total I/O cache efficiency across all block types. A value close to 100% means most reads are served from memory (shared buffers), not disk.

15. Deadlocks 🔴 Advanced

Filename: deadlocks.sql

SELECT pid, usename, query, state_change
FROM pg_stat_activity
WHERE wait_event_type = 'Lock' AND wait_event = 'deadlock';

🚨 Tip: Track with log_lock_waits = on for continuous monitoring.

16. Top CPU-Intensive Queries (requires pg_stat_statements) 🔴 Advanced

Filename: top_cpu_queries.sql

SELECT query, total_exec_time, calls, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

🔥 Insight: Helps identify resource-hungry SQL statements.

17. Load Average from Extensions (pg_stat_kcache) 🔴 Advanced

Filename: pg_stat_kcache_load.sql

SELECT round(sum(cpu_user_time + cpu_sys_time) / 1000, 2) AS cpu_msec_total
FROM pg_stat_kcache;

📊 Note: Requires pg_stat_kcache. Great for tracking real CPU time per query.

18. Temporary Files Usage 🟡 Intermediate

Filename: temp_file_usage.sql

SELECT datname, temp_files, temp_bytes,
       pg_size_pretty(temp_bytes) AS pretty_size
FROM pg_stat_database
ORDER BY temp_bytes DESC;

🧊 Tip: High temp usage → low work_mem.

19. Checkpoint Frequency 🟡 Intermediate

Filename: checkpoint_frequency.sql

SELECT checkpoints_timed, checkpoints_req, stats_reset
FROM pg_stat_bgwriter;

📈 Observation: Too frequent checkpoints = I/O stress.

20. WAL Growth Rate 🟡 Intermediate

Filename: wal_growth_rate.sql

SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS wal_growth
FROM pg_replication_slots;

📦 Explanation: Tracks WAL growth for replication and disk usage tuning.

✅ Conclusion

These 20 scripts form a comprehensive PostgreSQL health check toolkit. They cover CPU, I/O, replication, locks, memory, and more.

⚙️ Use them manually or integrate with your observability platform.

🚀 Or try pghealth.io for automated, real-time PostgreSQL health monitoring — with scoring, alerts, and insights.

👉 Start Free Trial

Top 20 PostgreSQL Health Check Scripts

20 powerful PostgreSQL health check scripts for CPU, load, performance, replication, and storage diagnostics. Includes advanced SQL examples and usage tips.