Replication / Sending Servers

A guide to PostgreSQL's replication sender parameters: recommended settings for replication slots, WAL senders, WAL retention, and commit timestamp tracking for robust replication.

max_replication_slots

  • What it does: Sets the maximum number of replication slots the server can support at one time. Changing it requires a server restart.
  • Why it matters: Replication slots prevent premature removal of WAL files that standbys or logical decoding clients still need. Each logical replication subscription uses one slot on the publisher, and each physical standby configured to use a slot needs one as well. If the limit is too low, new slots cannot be created, so new replicas or subscriptions fail to set up. Setting it higher than needed costs little, because unused slots consume very little memory.
  • Ideal value & Best Practice: Set to at least the number of physical standbys plus logical replication subscribers, plus 2-3 spare slots for overhead and future growth. For example, with 2 physical standbys and 1 logical subscriber, use 6 (3 in use plus 3 spare). Monitor slot usage and adjust as your replication topology expands.
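
The sizing rule above is simple arithmetic, and a minimal Python sketch makes it explicit. The function name and the default headroom of 3 are illustrative assumptions, not part of PostgreSQL; adjust the headroom to your own growth plans.

```python
def recommended_replication_slots(physical_standbys: int,
                                  logical_subscribers: int,
                                  headroom: int = 3) -> int:
    """Slots in use today plus spare slots for growth.

    headroom=3 is an assumed default taken from the "plus 2-3" guideline.
    """
    if min(physical_standbys, logical_subscribers, headroom) < 0:
        raise ValueError("counts must be non-negative")
    return physical_standbys + logical_subscribers + headroom


if __name__ == "__main__":
    slots = recommended_replication_slots(physical_standbys=2, logical_subscribers=1)
    # Prints a line that can be pasted into postgresql.conf
    print(f"max_replication_slots = {slots}")  # max_replication_slots = 6
```

With 2 physical standbys and 1 logical subscriber this reproduces the value of 6 from the example.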

max_slot_wal_keep_size

  • What it does: Sets the maximum amount of WAL that replication slots are allowed to retain in pg_wal. When a slot falls behind by more than this amount, the WAL it needs can be removed and the slot is invalidated, so the standby or subscriber using it can no longer continue from that slot.
  • Why it matters: This parameter protects the primary server from excessive disk usage when replicas fall significantly behind or stay disconnected for extended periods. Without this limit, a disconnected replica could cause unbounded WAL accumulation on the primary and eventually fill the disk. However, setting it too low may break replication during temporary network issues or maintenance windows, after which the affected replica has to catch up from a WAL archive or be rebuilt.
  • Ideal value & Best Practice: Default -1 (unlimited) is reasonable for most environments that monitor slot lag and disk usage. For space-constrained systems, set a value such as 100GB or 200GB based on your available disk space and replication requirements. Make sure the limit covers the WAL generated during the longest replica outage you expect to tolerate.
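
One way to pick a concrete limit is to estimate how much WAL accumulates during the longest outage you want to survive, then cap that by a share of the WAL volume. The following Python sketch does this under stated assumptions: the WAL rate, outage length, disk size, and the 50% disk share are all hypothetical inputs you would measure or choose yourself.

```python
import math


def slot_wal_cap_gb(wal_rate_mb_per_s: float,
                    tolerated_outage_hours: float,
                    disk_size_gb: float,
                    max_disk_fraction: float = 0.5) -> tuple[int, bool]:
    """Return (cap_in_gb, covers_outage).

    needed = WAL generated while a replica is away for the tolerated outage.
    budget = share of the WAL volume we are willing to give to slots.
    The cap is the smaller of the two, rounded up to whole GB.
    covers_outage is False when the disk budget is the limiting factor.
    """
    needed_gb = wal_rate_mb_per_s * 3600 * tolerated_outage_hours / 1024
    budget_gb = disk_size_gb * max_disk_fraction
    cap_gb = math.ceil(min(needed_gb, budget_gb))
    return cap_gb, needed_gb <= budget_gb


if __name__ == "__main__":
    # Assumed workload: 4 MB/s of WAL, 8-hour outage, 500 GB WAL volume
    cap, covers = slot_wal_cap_gb(4, 8, 500)
    print(f"max_slot_wal_keep_size = {cap}GB")  # max_slot_wal_keep_size = 113GB
    if not covers:
        print("warning: disk budget is smaller than the tolerated outage needs")
```

If the second return value is False, the limit protects the disk but will not carry a replica through the full outage, which is a signal to add disk, shorten the tolerated outage, or rely on a WAL archive.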

max_wal_senders

  • What it does: Sets the maximum number of simultaneously running WAL sender processes, that is, concurrent replication connections from standby servers, logical replication subscribers, and streaming base backup clients such as pg_basebackup. Changing it requires a server restart.
  • Why it matters: Each streaming connection uses one WAL sender process. Setting this too low prevents additional replicas or backups from connecting, which can affect high availability or scalability. Setting it higher than needed reserves some shared memory, though idle sender slots use minimal resources.
  • Ideal value & Best Practice: The default is 10. Set to the number of expected simultaneous replication connections plus 2-3 for overhead. For example, with 3 standbys, use at least 6. Consider future scaling needs and remember that base backups also consume sender connections while they run.
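
A short Python sketch of this count follows. It assumes that pg_basebackup is run with WAL streaming (`-X stream`), in which case each backup uses two replication connections; the function name, parameters, and the headroom of 3 are illustrative choices, not PostgreSQL settings.

```python
def recommended_wal_senders(standbys: int,
                            logical_subscribers: int = 0,
                            concurrent_base_backups: int = 0,
                            headroom: int = 3) -> int:
    """Peak number of replication connections plus spare capacity."""
    if min(standbys, logical_subscribers, concurrent_base_backups, headroom) < 0:
        raise ValueError("counts must be non-negative")
    # A base backup that streams WAL alongside the data uses two connections.
    connections = standbys + logical_subscribers + 2 * concurrent_base_backups
    return connections + headroom


if __name__ == "__main__":
    print(f"max_wal_senders = {recommended_wal_senders(standbys=3)}")  # max_wal_senders = 6
```

Adding one concurrent streaming base backup to the same topology raises the result to 8, which is why backup windows are worth including in the estimate.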

track_commit_timestamp

  • What it does: Controls whether PostgreSQL records the commit time of each transaction so that it can be looked up later, for example with pg_xact_commit_timestamp(). Changing it requires a server restart.
  • Why it matters: Commit timestamps let you find out when a specific transaction committed, which is useful for troubleshooting, auditing, and conflict handling in some logical replication setups. Point-in-time recovery does not depend on this setting. Enabling it adds a small amount of overhead to transaction processing and requires additional storage.
  • Ideal value & Best Practice: Default off is appropriate unless you specifically need commit timestamps. Enable (on) if you use features or tools that require them, such as timestamp-based conflict resolution in logical replication or commit-time auditing queries. Test the performance impact in your environment before enabling in production.

wal_keep_size

  • What it does: Sets the minimum amount of past WAL to keep in pg_wal so that standby servers can fetch it, regardless of replication slot status.
  • Why it matters: This provides a safety net for replicas that don't use replication slots, or extra protection alongside slots. It gives standbys a window in which to reconnect before WAL segments are removed. If a standby without a slot falls behind by more than this amount and no WAL archive covers the gap, the primary may remove a segment the standby still needs and replication will stop. This is particularly important where replicas may experience temporary connectivity issues.
  • Ideal value & Best Practice: Default 0 (no additional retention) is sufficient when using replication slots. For replicas without slots, or as extra protection against replica lag, set a value such as 1GB or 5GB. Balance protection against disk space usage based on your network reliability and replica performance.
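
To judge whether 1GB or 5GB is enough, it helps to convert the size into a reconnect window. The Python sketch below does that conversion; the 16 MB segment size is the PostgreSQL default, while the WAL rate of 2 MB/s is a made-up figure you would replace with your own measurement.

```python
MB_PER_GB = 1024
DEFAULT_SEGMENT_MB = 16  # default WAL segment size


def segments_retained(wal_keep_size_gb: int,
                      segment_mb: int = DEFAULT_SEGMENT_MB) -> int:
    """Number of WAL segments that fit in wal_keep_size."""
    return wal_keep_size_gb * MB_PER_GB // segment_mb


def reconnect_window_seconds(wal_keep_size_gb: int,
                             wal_rate_mb_per_s: float) -> float:
    """How long a standby can stay away before retained WAL runs out."""
    if wal_rate_mb_per_s <= 0:
        raise ValueError("WAL rate must be positive")
    return wal_keep_size_gb * MB_PER_GB / wal_rate_mb_per_s


if __name__ == "__main__":
    assumed_rate = 2.0  # MB/s, hypothetical workload
    for size_gb in (1, 5):
        window = reconnect_window_seconds(size_gb, assumed_rate)
        print(f"wal_keep_size = {size_gb}GB: "
              f"{segments_retained(size_gb)} segments, {window:.0f} s of WAL")
```

At the assumed 2 MB/s, 1GB covers 512 seconds (about eight and a half minutes) and 5GB covers 2560 seconds, so a busier primary needs a proportionally larger value for the same window.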

wal_sender_timeout

  • What it does: Terminates replication connections that have been inactive for longer than the specified time. A value of 0 disables the timeout.
  • Why it matters: This timeout prevents WAL sender processes from hanging indefinitely when standbys become unresponsive due to network issues, standby crashes, or performance problems. Proper tuning ensures timely detection of replication issues while avoiding false positives during temporary network latency.
  • Ideal value & Best Practice: Default 60s is reasonable for most LAN environments. Increase to 120s or 300s for WAN connections with higher latency. Decrease to 30s for environments requiring rapid failure detection. Monitor replication status and adjust based on observed network conditions.
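
The guidance above can be captured as a small lookup that emits a configuration line. This Python sketch is only an illustration: the profile names are invented for the example, and the WAN entry uses the lower of the two suggested values.

```python
# Hypothetical profile names mapped to the timeouts suggested in the text.
TIMEOUTS_S = {
    "fast_failover": 30,  # rapid failure detection
    "lan": 60,            # PostgreSQL default
    "wan": 120,           # higher-latency links; 300 is the upper suggestion
}


def wal_sender_timeout_line(profile: str) -> str:
    """Render a postgresql.conf line for the given network profile."""
    try:
        seconds = TIMEOUTS_S[profile]
    except KeyError:
        raise ValueError(f"unknown profile: {profile!r}") from None
    return f"wal_sender_timeout = {seconds}s"


if __name__ == "__main__":
    print(wal_sender_timeout_line("wan"))  # wal_sender_timeout = 120s
```

Treat the output as a starting point and adjust it after observing how often connections are dropped in practice.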
