Query Tuning / Genetic Query Optimizer

Master PostgreSQL GEQO parameters for optimizing complex query planning. Learn genetic algorithm settings, join ordering, and threshold configuration for large query optimization.

geqo

  • What it does: Enables or disables the Genetic Query Optimizer (GEQO) algorithm for planning queries with many joins.
  • Why it matters: GEQO uses a genetic algorithm to find efficient join orders for complex queries without exhaustively searching all possibilities. This dramatically reduces planning time for queries involving many tables, preventing the exponential growth of planning time that occurs with traditional exhaustive search methods. For queries with many joins, GEQO can mean the difference between milliseconds and hours of planning time.
  • Ideal value & Best Practice: Default on is recommended. Disable only if you experience suboptimal plans for complex queries and want to force exhaustive search. For most workloads with queries involving 12+ tables, GEQO provides significantly better planning performance with minimal plan quality trade-offs.

geqo_effort

  • What it does: Controls the overall effort GEQO puts into query planning by influencing other GEQO parameters.
  • Why it matters: This master parameter provides a simple way to balance planning time against plan quality. Higher values increase both planning time and the likelihood of finding better plans. It automatically adjusts geqo_pool_size, geqo_generations, and other parameters to provide a balanced configuration.
  • Ideal value & Best Practice: Default 5 provides a good balance. Increase to 7-10 for data warehouse environments where planning time is less critical than execution time. Decrease to 2-3 for OLTP systems where fast planning is essential. Monitor planning time versus execution time to find the optimal balance.

geqo_generations

  • What it does: Sets the number of iterations (generations) the genetic algorithm will run.
  • Why it matters: More generations allow the algorithm to explore more potential solutions, potentially finding better plans but increasing planning time. Each generation represents a complete cycle of selection, crossover, and mutation in the genetic algorithm.
  • Ideal value & Best Practice: Default 0 (auto-set based on geqo_effort). If setting manually, start with 100-200 generations. Increase for very complex queries (500+) if planning time is acceptable. Monitor the trade-off between additional planning time and improved execution time.

geqo_pool_size

  • What it does: Sets the number of individuals (potential solutions) in the genetic algorithm population.
  • Why it matters: Larger populations provide more genetic diversity and increase the chance of finding good solutions, but also increase memory usage and planning time. The population size represents the number of different join orders considered in each generation.
  • Ideal value & Best Practice: Default 0 (auto-set based on geqo_effort). If setting manually, use 50-200 individuals. Larger values (300-500) may help for extremely complex queries but significantly increase planning time. Scale based on the number of tables in your typical complex queries.

geqo_seed

  • What it does: Sets the random seed used by the genetic algorithm's random number generator.
  • Why it matters: The random seed affects the initial population and random decisions during evolution. Using a fixed seed can make GEQO behavior deterministic, which is valuable for reproducing and debugging plan changes. Different seeds may produce different plans for the same query.
  • Ideal value & Best Practice: Default 0 (uses current time, making behavior non-deterministic). Set to a fixed value (e.g., 0.5) when debugging plan instability issues. For production, the default is usually best as it provides variety in plan exploration.

geqo_selection_bias

  • What it does: Controls the selective pressure in the genetic algorithmβ€”how strongly better solutions are favored during selection.
  • Why it matters: Higher bias favors fitter individuals more aggressively, potentially converging faster but risking premature convergence on suboptimal solutions. Lower bias maintains more diversity but may take longer to converge. This parameter affects the balance between exploration and exploitation in the search process.
  • Ideal value & Best Practice: Default 2.0 is generally effective. Increase to 3.0-4.0 for faster convergence when you trust the initial population quality. Decrease to 1.5-2.0 for more diverse exploration in complex search spaces.

geqo_threshold

  • What it does: Sets the minimum number of FROM items (tables) required before GEQO is used instead of exhaustive search.
  • Why it matters: This threshold determines when PostgreSQL switches from traditional planning to genetic algorithm planning. Setting it too low may use GEQO for simple queries where exhaustive search is better. Setting it too high may cause very long planning times for complex queries.
  • Ideal value & Best Practice: Default 12 is reasonable for most workloads. For OLTP systems with simple queries, consider increasing to 15-20. For data warehouses with complex analytics, decrease to 8-10 to engage GEQO earlier. Monitor planning times for queries near your threshold.

Try pghealth Free Today πŸš€

Start your journey toward a healthier PostgreSQL with pghealth.
You can explore all features immediately with a free trial β€” no installation required.

πŸ‘‰ Start Free Trial