Top 20 PostgreSQL Health Check Scripts

20 powerful PostgreSQL health check scripts for CPU, load, performance, replication, and storage diagnostics. Includes advanced SQL examples and usage tips.

Β·7 minutes reading
Cover Image for Top 20 PostgreSQL Health Check Scripts

🧠 Top 20 PostgreSQL Health Check Scripts (with Explanations)

PostgreSQL performs beautifully β€” but only when it’s healthy.
This collection of 20 advanced health check scripts helps you monitor CPU usage, locks, replication, I/O, cache, and more.

Each script includes its skill level, filename, and usage tips.
You can automate or schedule them to run daily for continuous insights.


1. Database Size Overview 🟒 Beginner

Filename: db_size_overview.sql

Usage: List total size of all databases.

SELECT datname AS database_name,
       pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;

Explanation: Quickly identifies which databases consume the most storage.

2. Largest Tables 🟒 Beginner

Filename: largest_tables.sql

Usage: Detect tables growing too fast.

SELECT relname AS table_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 20;

Tip: Big tables may need partitioning or archiving.

3. Index Usage Efficiency 🟑 Intermediate

Filename: index_usage_efficiency.sql

Usage: Evaluate how often indexes are used.

SELECT relname AS table_name,
       100 * idx_scan / (seq_scan + idx_scan + 1) AS index_usage_pct,
       idx_scan, seq_scan
FROM pg_stat_user_tables
ORDER BY index_usage_pct ASC
LIMIT 20;

Insight: Low index usage (<30%) β†’ add or tune indexes.

4. Unused Indexes 🟑 Intermediate

Filename: unused_indexes.sql

Usage: Find indexes that are never used.

SELECT schemaname, relname AS table_name, indexrelname AS index_name, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname;

Recommendation: Remove unused indexes to speed up writes.

5. Table Bloat Estimate 🟑 Intermediate

Filename: table_bloat_estimate.sql

Usage: Detect wasted space due to dead tuples.

SELECT schemaname, relname,
       pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) AS wasted_space
FROM pg_catalog.pg_statio_user_tables
ORDER BY (pg_total_relation_size(relid) - pg_relation_size(relid)) DESC
LIMIT 10;

Explanation: High bloat = vacuum/autovacuum misconfiguration.

6. Autovacuum Activity 🟒 Beginner

Filename: autovacuum_activity.sql

SELECT relname AS table_name, n_tup_ins, n_tup_upd, n_tup_del,
       n_dead_tup, last_autovacuum
FROM pg_stat_all_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

Tip: Old autovacuum timestamps = potential bloat risk.

7. Long-Running Queries 🟑 Intermediate

Filename: long_running_queries.sql

SELECT pid, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state = 'active' AND now() - query_start > interval '5 minutes'
ORDER BY duration DESC;

Explanation: Long queries block vacuum and slow down others.

8. Blocking Queries πŸ”΄ Advanced

Filename: blocking_queries.sql

SELECT blocked.pid AS blocked_pid,
       blocked.query AS blocked_query,
       blocking.pid AS blocking_pid,
       blocking.query AS blocking_query,
       now() - blocked.query_start AS blocked_duration
FROM pg_locks blocked_locks
JOIN pg_stat_activity blocked ON blocked.pid = blocked_locks.pid
JOIN pg_locks blocking_locks
  ON blocking_locks.transactionid = blocked_locks.transactionid
 AND blocking_locks.pid != blocked_locks.pid
JOIN pg_stat_activity blocking ON blocking.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;

Insight: Identifies who’s blocking whom β€” great for debugging β€œhangs.”

9. Replication Lag 🟑 Intermediate

Filename: replication_lag.sql

SELECT application_name, state, sync_state,
       pg_current_wal_lsn() - replay_lsn AS replication_lag
FROM pg_stat_replication;

Tip: Monitor replica lag to prevent stale reads.

10. Cache Hit Ratio 🟒 Beginner

Filename: cache_hit_ratio.sql

SELECT sum(blks_hit) / nullif(sum(blks_hit) + sum(blks_read), 0) AS cache_hit_ratio
FROM pg_stat_database;

Ideal: >99% cache hit ratio = efficient memory usage.

11. CPU Usage per Backend πŸ”΄ Advanced

Filename: cpu_usage_per_backend.sql

SELECT pid, usename, application_name,
       round((EXTRACT(EPOCH FROM (now() - backend_start))/60)::numeric, 2) AS runtime_min,
       state, query
FROM pg_stat_activity
WHERE now() - backend_start > interval '1 minute'
ORDER BY runtime_min DESC;

Note: PostgreSQL doesn’t directly expose CPU%, but long runtimes often correlate with CPU-heavy tasks.

12. OS CPU Load (Accurate via pg_stat_kcache)

Filename: os_cpu_load.sql

Level: πŸ”΄ Advanced (requires pg_stat_kcache)

Requires: CREATE EXTENSION pg_stat_kcache;

SELECT datname,
       round(sum(cpu_user_time + cpu_sys_time) / 1000, 2) AS total_cpu_msec,
       round(sum(cpu_user_time) / 1000, 2) AS user_cpu_msec,
       round(sum(cpu_sys_time) / 1000, 2) AS sys_cpu_msec,
       sum(reads) AS total_reads,
       sum(writes) AS total_writes
FROM pg_stat_kcache
JOIN pg_database db ON db.oid = pg_stat_kcache.datid
GROUP BY datname
ORDER BY total_cpu_msec DESC
LIMIT 10;

Explanation:

cpu_user_time: β†’ User-space CPU time used by backend queries.

cpu_sys_time: β†’ Kernel-level CPU time used by PostgreSQL processes.

reads/writes: β†’ I/O operations. This gives a real CPU usage breakdown per database, not an approximation.

Usage Tip: To interpret total CPU percentage, compare with your system’s total cores Γ— clock time over the same interval.

13. Connection Saturation 🟑 Intermediate

Filename: connection_saturation.sql

SELECT count(*) AS total_connections,
       count(*) FILTER (WHERE state = 'active') AS active_connections,
       setting::int AS max_connections,
       round(100.0 * count(*) / setting::int, 2) AS pct_used
FROM pg_stat_activity, pg_settings
WHERE name = 'max_connections'
GROUP BY setting;

Alert: 80% usage β†’ tune max_connections or add pooling.

14. Full I/O Cache Efficiency (tables + indexes + toast) πŸ”΄ Advanced

Filename: io_wait_monitor.sql

SELECT relname AS table_name,
       heap_blks_hit + idx_blks_hit + toast_blks_hit + tidx_blks_hit AS blks_hit,
       heap_blks_read + idx_blks_read + toast_blks_read + tidx_blks_read AS blks_read,
       round(
         100 * (heap_blks_hit + idx_blks_hit + toast_blks_hit + tidx_blks_hit)
         / nullif(
             (heap_blks_hit + idx_blks_hit + toast_blks_hit + tidx_blks_hit)
             + (heap_blks_read + idx_blks_read + toast_blks_read + tidx_blks_read), 0
           ),
         2
       ) AS io_cache_efficiency
FROM pg_statio_user_tables
ORDER BY io_cache_efficiency ASC
LIMIT 10;

Explanation:

heap_blks_hit/read β†’ data blocks

idx_blks_hit/read β†’ index blocks

toast_blks_hit/read β†’ TOAST (large-value) blocks

tidx_blks_hit/read β†’ TOAST index blocks

This version gives a true total I/O cache efficiency across all block types. A value close to 100% means most reads are served from memory (shared buffers), not disk.

15. Deadlocks πŸ”΄ Advanced

Filename: deadlocks.sql

SELECT pid, usename, query, state_change
FROM pg_stat_activity
WHERE wait_event_type = 'Lock' AND wait_event = 'deadlock';

🚨 Tip: Track with log_lock_waits = on for continuous monitoring.

16. Top CPU-Intensive Queries (requires pg_stat_statements) πŸ”΄ Advanced

Filename: top_cpu_queries.sql

SELECT query, total_exec_time, calls, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

πŸ”₯ Insight: Helps identify resource-hungry SQL statements.

17. Load Average from Extensions (pg_stat_kcache) πŸ”΄ Advanced

Filename: pg_stat_kcache_load.sql

SELECT round(sum(cpu_user_time + cpu_sys_time) / 1000, 2) AS cpu_msec_total
FROM pg_stat_kcache;

πŸ“Š Note: Requires pg_stat_kcache. Great for tracking real CPU time per query.

18. Temporary Files Usage 🟑 Intermediate

Filename: temp_file_usage.sql

SELECT datname, temp_files, temp_bytes,
       pg_size_pretty(temp_bytes) AS pretty_size
FROM pg_stat_database
ORDER BY temp_bytes DESC;

🧊 Tip: High temp usage β†’ low work_mem.

19. Checkpoint Frequency 🟑 Intermediate

Filename: checkpoint_frequency.sql

SELECT checkpoints_timed, checkpoints_req, stats_reset
FROM pg_stat_bgwriter;

πŸ“ˆ Observation: Too frequent checkpoints = I/O stress.

20. WAL Growth Rate 🟑 Intermediate

Filename: wal_growth_rate.sql

SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS wal_growth
FROM pg_replication_slots;

πŸ“¦ Explanation: Tracks WAL growth for replication and disk usage tuning.

βœ… Conclusion

These 20 scripts form a comprehensive PostgreSQL health check toolkit. They cover CPU, I/O, replication, locks, memory, and more.

βš™οΈ Use them manually or integrate with your observability platform.

πŸš€ Or try pghealth.io for automated, real-time PostgreSQL health monitoring β€” with scoring, alerts, and insights.

πŸ‘‰ Start Free Trial