PostgreSQL is a powerful relational database, but out-of-the-box settings rarely deliver optimal performance for production workloads. This comprehensive guide covers proven techniques to tune your PostgreSQL database for speed, efficiency, and scalability.
Why PostgreSQL Performance Matters
Database performance directly impacts user experience, infrastructure costs, and developer productivity. A well-tuned PostgreSQL instance can:
- Handle 10x more concurrent users with the same hardware
- Reduce query response times from seconds to milliseconds
- Lower server costs by better utilizing existing resources
- Prevent outages caused by connection exhaustion or lock contention
- Improve developer velocity with faster test suite execution
Understanding PostgreSQL Architecture
Before tuning, it is essential to understand how PostgreSQL manages resources:
Memory Architecture
- Shared buffers: Cache for frequently accessed table and index data
- Work memory: Per-operation memory for sorting and hashing
- Maintenance work memory: Memory for VACUUM, CREATE INDEX, etc.
- Effective cache size: Hint to planner about total available memory
- WAL buffers: Write-Ahead Log buffer for transaction commits
Process Model
PostgreSQL uses a process-per-connection model. Each connection spawns a new backend process, which has implications for memory usage and connection pooling.
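A quick way to see this in practice is to query pg_stat_activity, where every client connection shows up as its own backend process with an operating-system PID (a minimal example; column names are from the standard view):
-- Each client connection is a separate backend process (one PID per row)
SELECT pid, usename, application_name, state, backend_start
FROM pg_stat_activity
WHERE backend_type = 'client backend'
ORDER BY backend_start;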
Essential Configuration Tuning
Memory Settings
The most impactful settings for query performance:
# postgresql.conf
# For a server with 16GB RAM:
shared_buffers = 4GB # 25% of total RAM
effective_cache_size = 12GB # 75% of total RAM
work_mem = 64MB # Per operation memory
maintenance_work_mem = 1GB # For index creation, VACUUM
wal_buffers = 16MB # WAL buffer size
# For 32GB RAM:
shared_buffers = 8GB
effective_cache_size = 24GB
work_mem = 128MB
maintenance_work_mem = 2GB
wal_buffers = 32MB
# For 64GB RAM:
shared_buffers = 16GB
effective_cache_size = 48GB
work_mem = 256MB
maintenance_work_mem = 4GB
wal_buffers = 64MB
Important: Set shared_buffers to 25% of RAM for dedicated database servers. Higher values can hurt performance due to double buffering with the OS cache.
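To sanity-check what is actually in effect (and whether a value came from postgresql.conf, ALTER SYSTEM, or the built-in default), you can query pg_settings:
-- Verify the memory settings currently in effect and where they came from
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('shared_buffers', 'effective_cache_size', 'work_mem',
               'maintenance_work_mem', 'wal_buffers');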
Checkpoint and WAL Settings
Prevent checkpoint spikes that cause write latency:
# Spread checkpoints over time
checkpoint_completion_target = 0.9
wal_compression = on
# Increase WAL size for write-heavy workloads
max_wal_size = 4GB
min_wal_size = 1GB
# Archive settings (for replication/PITR)
archive_mode = on
archive_command = 'gzip < %p > /var/lib/postgresql/wal_archive/%f.gz'
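To confirm these settings are working, compare timed checkpoints against requested ones; a high share of requested checkpoints usually means max_wal_size is still too small. On PostgreSQL 16 and earlier these counters live in pg_stat_bgwriter (newer releases move them to pg_stat_checkpointer):
-- Share of checkpoints forced by WAL volume rather than by checkpoint_timeout
SELECT checkpoints_timed, checkpoints_req,
       ROUND(checkpoints_req * 100.0 /
             NULLIF(checkpoints_timed + checkpoints_req, 0), 2) AS pct_requested
FROM pg_stat_bgwriter;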
Query Planner Settings
# Help the planner make better decisions
random_page_cost = 1.1 # For SSD storage (default is 4.0)
effective_io_concurrency = 200 # Concurrent I/O requests the storage can handle
default_statistics_target = 100 # ANALYZE sample size (100 is the default; raise for skewed data)
# Enable parallel query execution
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
max_worker_processes = 8
SSD Tuning: Lower random_page_cost to 1.1-1.5 for SSDs (vs 4.0 for spinning disks) to encourage index usage.
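Most of these planner settings can be applied without editing the config file; only max_worker_processes requires a restart. A minimal example using ALTER SYSTEM plus a reload:
-- Apply planner settings without editing postgresql.conf
-- (max_worker_processes still requires a restart)
ALTER SYSTEM SET random_page_cost = 1.1;
ALTER SYSTEM SET effective_io_concurrency = 200;
SELECT pg_reload_conf();
-- Confirm the new value
SHOW random_page_cost;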
Connection Settings
# Max connections (use a connection pooler instead of high values)
max_connections = 200
# Query and idle-transaction timeouts (values in milliseconds)
statement_timeout = 30000 # Cancel queries running longer than 30 seconds
idle_in_transaction_session_timeout = 60000 # 1 minute
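A cluster-wide statement_timeout can also cancel legitimate long-running work such as migrations or manual maintenance. A common alternative is to scope the timeouts to the application role and relax them per session when needed; a sketch assuming a role named app_user (placeholder name):
-- Scope timeouts to the application role rather than the whole cluster
-- (app_user is a placeholder role name)
ALTER ROLE app_user SET statement_timeout = '30s';
ALTER ROLE app_user SET idle_in_transaction_session_timeout = '1min';
-- Override for a single session, e.g. a long migration
SET statement_timeout = 0; -- 0 disables the timeout for this session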
Indexing Strategies
Proper indexing is the single most impactful optimization technique.
B-tree Indexes (Default)
Best for equality and range queries:
-- Single column index
CREATE INDEX idx_users_email ON users(email);
-- Composite index (order matters!)
CREATE INDEX idx_orders_user_created ON orders(user_id, created_at DESC);
-- Covering index (include additional columns)
CREATE INDEX idx_users_email_covering ON users(email) INCLUDE (name, created_at);
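Column order matters because a composite B-tree index is only useful when its leading column is constrained. With idx_orders_user_created above, filtering on user_id (optionally sorting by created_at) can use the index, while filtering on created_at alone generally cannot; EXPLAIN makes the difference visible:
-- Can use idx_orders_user_created: the leading column is constrained
EXPLAIN SELECT * FROM orders
WHERE user_id = 42
ORDER BY created_at DESC
LIMIT 10;
-- Usually cannot use it efficiently: the leading column is not constrained
EXPLAIN SELECT * FROM orders
WHERE created_at > '2025-01-01';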
Partial Indexes
Index only rows that match a condition (smaller, faster):
-- Only index active users
CREATE INDEX idx_users_active_email ON users(email) WHERE status = 'active';
-- Only index recent orders
CREATE INDEX idx_orders_recent ON orders(created_at)
WHERE created_at > '2025-01-01';
-- Only index non-null values
CREATE INDEX idx_users_deleted_at ON users(deleted_at)
WHERE deleted_at IS NOT NULL;
Expression Indexes
Index computed values for faster queries:
-- Index lowercase email for case-insensitive search
CREATE INDEX idx_users_lower_email ON users(LOWER(email));
-- Query using the indexed expression
SELECT * FROM users WHERE LOWER(email) = LOWER('user@example.com');
-- Index for full-text search
CREATE INDEX idx_articles_search ON articles USING GIN(to_tsvector('english', title || ' ' || content));
GIN and GiST Indexes
Specialized indexes for complex data types:
-- GIN for full-text search
CREATE INDEX idx_posts_fts ON posts USING GIN(to_tsvector('english', content));
-- GIN for JSONB queries
CREATE INDEX idx_users_metadata ON users USING GIN(metadata jsonb_path_ops);
-- GiST for geometric queries
CREATE INDEX idx_locations_point ON locations USING GIST(coordinates);
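Note that jsonb_path_ops only accelerates the containment operator @>, so queries must be written with it to benefit. The examples below use illustrative field values against the indexes defined above:
-- jsonb_path_ops indexes support the @> containment operator
SELECT id, email
FROM users
WHERE metadata @> '{"plan": "pro"}';
-- Full-text query that can use idx_posts_fts
SELECT id
FROM posts
WHERE to_tsvector('english', content) @@ to_tsquery('english', 'postgres & tuning');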
Index Maintenance
-- Find unused indexes
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelname NOT LIKE '%_pkey';
-- Find duplicate indexes
SELECT pg_size_pretty(SUM(pg_relation_size(idx))::BIGINT) AS size,
(array_agg(idx))[1] AS idx1, (array_agg(idx))[2] AS idx2
FROM (
SELECT indexrelid::regclass AS idx, indrelid, indkey::text
FROM pg_index
) sub
GROUP BY indrelid, indkey HAVING COUNT(*) > 1;
-- Rebuild bloated indexes
REINDEX INDEX CONCURRENTLY idx_users_email;
Query Optimization
Using EXPLAIN ANALYZE
Understand query execution plans:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2025-01-01'
GROUP BY u.id, u.name
ORDER BY order_count DESC
LIMIT 10;
-- Output shows:
-- - Actual execution time
-- - Rows processed vs estimated
-- - Index usage (Seq Scan vs Index Scan)
-- - Buffer hits (cache efficiency)
-- - Sort/Hash operations
Common Query Anti-Patterns
Problem: SELECT * instead of specific columns
-- Bad: Fetches unnecessary data
SELECT * FROM users WHERE email = 'user@example.com';
-- Good: Only fetch needed columns
SELECT id, email, name FROM users WHERE email = 'user@example.com';
Problem: N+1 queries
-- Bad: Query in a loop (N+1 problem)
-- Fetch users, then loop and query orders for each user
-- Good: Use JOIN or subquery
SELECT u.*,
(SELECT COUNT(*) FROM orders o WHERE o.user_id = u.id) as order_count
FROM users u
WHERE u.status = 'active';
Problem: OFFSET for pagination
-- Bad: Slow for large offsets
SELECT * FROM orders ORDER BY created_at DESC OFFSET 100000 LIMIT 20;
-- Good: Cursor-based pagination
SELECT * FROM orders
WHERE created_at < '2025-12-01 10:30:00'
ORDER BY created_at DESC
LIMIT 20;
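If created_at is not unique, cursoring on it alone can skip or repeat rows that share a timestamp. A more robust variant is keyset pagination on (created_at, id) using a row comparison, which an index on those two columns can serve; the literal values below stand in for the last row of the previous page:
-- Keyset pagination with a tie-breaker column to handle equal timestamps
SELECT id, user_id, total, created_at
FROM orders
WHERE (created_at, id) < ('2025-12-01 10:30:00', 987654)
ORDER BY created_at DESC, id DESC
LIMIT 20;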
Optimizing JOINs
-- Prefer EXISTS over IN for large subqueries (the planner often treats both as a
-- semi-join, but EXISTS avoids NULL surprises, especially compared to NOT IN)
-- Bad:
SELECT * FROM users WHERE id IN (SELECT user_id FROM orders WHERE total > 1000);
-- Good:
SELECT * FROM users u WHERE EXISTS (
SELECT 1 FROM orders o WHERE o.user_id = u.id AND o.total > 1000
);
-- Use appropriate JOIN types
-- INNER JOIN: Only matching rows
-- LEFT JOIN: All left rows + matching right rows
-- Avoid CROSS JOIN unless intentional
Common Table Expressions (CTEs)
-- Break complex queries into readable parts
WITH active_users AS (
SELECT id, email FROM users WHERE status = 'active'
),
recent_orders AS (
SELECT user_id, SUM(total) as total_spent
FROM orders
WHERE created_at > NOW() - INTERVAL '30 days'
GROUP BY user_id
)
SELECT u.email, COALESCE(o.total_spent, 0) as spent
FROM active_users u
LEFT JOIN recent_orders o ON o.user_id = u.id
ORDER BY spent DESC
LIMIT 100;
Connection Pooling with PgBouncer
Reduce connection overhead and support more concurrent users:
Why Connection Pooling?
- Each PostgreSQL connection uses 10-20MB of RAM
- Connection creation takes 5-10ms
- Most applications need 100-1000+ concurrent connections
- PgBouncer uses < 2KB per pooled connection
PgBouncer Installation
# Install PgBouncer
sudo apt-get install pgbouncer
# Configure /etc/pgbouncer/pgbouncer.ini
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp_production
[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
reserve_pool_size = 5
reserve_pool_timeout = 3
# Start PgBouncer
sudo systemctl enable pgbouncer
sudo systemctl start pgbouncer
Pool Modes
| Mode | When Released | Use Case |
|---|---|---|
| Session | Client disconnects | Apps using temp tables, prepared statements |
| Transaction | Transaction ends (commit or rollback) | Most web applications (recommended) |
| Statement | Every statement | Read-only, simple queries |
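Once PgBouncer is running, its admin console (a pseudo-database named pgbouncer, reachable over the normal PostgreSQL protocol) is the quickest way to watch pool health. Connect as a user listed in admin_users or stats_users and run the SHOW commands:
-- Connect with: psql -h 127.0.0.1 -p 6432 -U <admin user> pgbouncer
SHOW POOLS;   -- active/waiting clients and server connections per pool
SHOW STATS;   -- query and transaction counts, average times
SHOW CLIENTS; -- individual client connections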
Application Configuration
# Change your app connection string
# From:
DATABASE_URL=postgresql://user:pass@localhost:5432/myapp
# To:
DATABASE_URL=postgresql://user:pass@localhost:6432/myapp
VACUUM and Autovacuum Tuning
PostgreSQL uses MVCC, which creates dead tuples that must be cleaned up:
Autovacuum Settings
# postgresql.conf
autovacuum = on
autovacuum_max_workers = 4
autovacuum_naptime = 15s # Check for work every 15 seconds
# Aggressive settings for write-heavy tables
autovacuum_vacuum_scale_factor = 0.05 # Vacuum when 5% of rows are dead
autovacuum_analyze_scale_factor = 0.02 # Analyze when 2% changed
# Let autovacuum do more work before throttling itself
autovacuum_vacuum_cost_limit = 2000
autovacuum_vacuum_cost_delay = 2ms # 2ms is the default since PostgreSQL 12; higher values slow vacuum down
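The global scale factors apply to every table. For a handful of hot, write-heavy tables it is usually better to override them per table with storage parameters; the table name below is illustrative:
-- Per-table autovacuum overrides for a write-heavy table
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.01,   -- vacuum at 1% dead rows
    autovacuum_analyze_scale_factor = 0.005,
    autovacuum_vacuum_cost_limit = 2000
);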
Manual VACUUM
-- Regular vacuum (can run concurrently)
VACUUM ANALYZE users;
-- Full vacuum (locks table, reclaims space)
VACUUM FULL users;
-- Check table bloat
SELECT schemaname, relname,
pg_size_pretty(pg_total_relation_size(relid)) AS size,
n_dead_tup, n_live_tup,
ROUND(n_dead_tup * 100.0 / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
Monitoring and Metrics
Essential pg_stat Views
-- Find slow queries
SELECT query, calls, total_exec_time, mean_exec_time, max_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 20;
-- Check cache hit ratio (should be > 99%)
SELECT
SUM(heap_blks_read) as heap_read,
SUM(heap_blks_hit) as heap_hit,
ROUND(SUM(heap_blks_hit) * 100.0 / NULLIF(SUM(heap_blks_hit) + SUM(heap_blks_read), 0), 2) AS cache_hit_ratio
FROM pg_statio_user_tables;
-- Check index usage
SELECT schemaname, relname,
indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC;
-- Active connections
SELECT state, COUNT(*)
FROM pg_stat_activity
GROUP BY state;
Enable pg_stat_statements
# Add to postgresql.conf (requires a restart)
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
pg_stat_statements.max = 10000
# Restart PostgreSQL
sudo systemctl restart postgresql
-- Then create the extension in your database
CREATE EXTENSION pg_stat_statements;
Key Metrics to Monitor
- Cache hit ratio: Should be > 99% (< 1% disk reads)
- Checkpoint frequency: Not more than once per 5 minutes
- Transaction wraparound: Monitor age of oldest transaction
- Replication lag: For read replicas (should be < 1 second)
- Connection count: Should be well below max_connections
- Lock contention: Waiting queries indicate locking issues (queries for this, wraparound age, and replication lag are shown below)
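The less obvious items on this list can be checked with short queries against the standard statistics views (PostgreSQL 10+ naming):
-- Oldest transaction ID age per database (wraparound risk rises toward ~2 billion)
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
-- Replication lag per standby, measured from the primary
SELECT client_addr, state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;
-- Sessions currently waiting on locks
SELECT pid, wait_event_type, wait_event, state, query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';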
DanubeData Managed PostgreSQL
All these optimizations are pre-configured in DanubeData managed PostgreSQL instances:
What is Included
- Auto-tuned configuration: Optimal settings based on instance size
- Built-in PgBouncer: Connection pooling enabled by default
- pg_stat_statements: Pre-installed for query monitoring
- Automated VACUUM: Aggressive autovacuum for high-traffic tables
- Read replicas: One-click replication for read scaling
- Point-in-time recovery: Restore to any point in the last 7 days
- Automated backups: Daily snapshots with offsite storage
- Monitoring dashboard: Real-time metrics for cache hits, slow queries, connections
Pricing
| Plan | vCPU | RAM | Storage | Price/Month |
|---|---|---|---|---|
| Starter | 1 | 2GB | 20GB | $8.99 |
| Growth | 2 | 4GB | 50GB | $17.99 |
| Business | 4 | 8GB | 100GB | $35.99 |
| Enterprise | 8 | 16GB | 200GB | $71.99 |
Performance Testing
pgbench for Benchmarking
# Initialize test database
pgbench -i -s 50 myapp_test
# Run benchmark (10 clients, 10 threads, 60 seconds)
pgbench -c 10 -j 10 -T 60 myapp_test
# Custom SQL script
cat > bench.sql << EOF
\set user_id random(1, 1000000)
SELECT * FROM users WHERE id = :user_id;
EOF
pgbench -c 50 -j 4 -T 300 -f bench.sql myapp_production
Load Testing
Use realistic workloads to test performance:
- Simulate production query patterns
- Test with production data volumes
- Monitor metrics during load tests
- Identify bottlenecks before they hit production
Quick Wins Checklist
Start with these high-impact optimizations:
- Set shared_buffers to 25% of RAM (restart required)
- Lower random_page_cost to 1.1 for SSDs
- Enable pg_stat_statements to find slow queries
- Add indexes on foreign keys and WHERE clause columns
- Deploy PgBouncer for connection pooling
- Configure autovacuum for write-heavy tables
- Use EXPLAIN ANALYZE on your slowest queries
- Check cache hit ratio (should be > 99%)
- Remove unused indexes to speed up writes
- Set statement_timeout to prevent runaway queries
Get Started with Optimized PostgreSQL
Skip the manual tuning process. DanubeData provides fully optimized PostgreSQL instances with all best practices pre-configured:
- Create your free account and deploy PostgreSQL in under 60 seconds
- Choose from PostgreSQL 16, 15, or 14 with optimized settings
- Built-in monitoring dashboard with query performance insights
- One-click read replicas for horizontal scaling
- Automated backups with point-in-time recovery
- Start at $8.99/month with 2GB RAM and 20GB storage
Questions? Check our PostgreSQL documentation or contact support for help optimizing your database.