Redis Replicas

Redis replicas provide high availability, improved read performance, and automatic failover for your managed Redis instances. This guide covers replica configuration, management, and best practices.

Overview

Redis replicas create read-only copies of your primary Redis instance:

  • High Availability: Automatic failover if primary fails
  • Read Scaling: Distribute read operations across replicas
  • Disaster Recovery: Maintain standby instance for recovery
  • Zero Data Loss: Synchronous replication available
  • Automatic Promotion: Replica automatically becomes primary on failure

How Redis Replication Works

Replication Architecture

┌─────────────┐        Async/Sync Replication      ┌─────────────┐
│   Primary   │ ────────────────────────────────> │   Replica   │
│ (Read/Write)│                                    │ (Read-Only) │
└─────────────┘                                    └─────────────┘
       │                                                    │
       │                                                    │
  Write Ops                                           Read Ops

Replication Process

  1. Initial Sync: Replica requests full copy from primary
  2. RDB Transfer: Primary sends snapshot to replica
  3. Command Stream: Primary streams write commands to replica
  4. Apply Commands: Replica applies commands in order
  5. Stay Synchronized: Continuous replication of all writes

Replication Lag

  • Asynchronous: Typically < 10ms lag
  • Synchronous: No lag for acknowledged writes, since writes wait for replica confirmation (available for critical data)
  • Monitoring: Lag displayed in dashboard (or measured directly; see the sketch below)
  • Automatic Catch-up: Replicas automatically sync after disconnection
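
If you want to measure lag directly rather than relying on the dashboard, one option is to compare replication offsets yourself. A minimal redis-py sketch, using the placeholder endpoints and credentials from this guide:

import redis

# Sketch: estimate replication lag in bytes by comparing offsets.
primary = redis.Redis(host='redis-123456.danubedata.com', port=6379,
                      password='password', ssl=True)
replica = redis.Redis(host='redis-replica-123456-01.danubedata.com', port=6379,
                      password='password', ssl=True)

primary_offset = primary.info('replication')['master_repl_offset']
replica_offset = replica.info('replication')['slave_repl_offset']

print(f"Replication lag: {primary_offset - replica_offset} bytes")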

Creating Replicas

Prerequisites

  • Existing Redis instance (primary)
  • Instance must be in healthy state
  • Sufficient account limits for additional resources

Via Dashboard

  1. Navigate to your Redis instance
  2. Click Replicas tab
  3. Click Create Replica
  4. Configure replica:
    • Name: Descriptive name
    • Region: Same or different data center
    • Profile: Match or differ from primary
    • Replication Mode: Asynchronous or Synchronous
  5. Click Create

The replica will be ready within 2-5 minutes.
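
Once the replica shows as ready, a quick sanity check is to connect and confirm it reports the replica role with an active link to the primary. A minimal sketch, assuming the placeholder endpoint and credentials used throughout this guide:

import redis

# Sketch: verify a newly created replica is reachable and replicating.
replica = redis.Redis(host='redis-replica-123456-01.danubedata.com', port=6379,
                      password='password', ssl=True)

info = replica.info('replication')
print(info['role'])                # expected: 'slave'
print(info['master_link_status'])  # expected: 'up'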

Replica Configuration

Same-Region Replicas

Best for:

  • High availability within region
  • Read scaling
  • Minimal replication lag (< 10ms)
  • Lower cost

Cross-Region Replicas

Best for:

  • Disaster recovery
  • Geographic distribution
  • Compliance requirements
  • Serving users in different regions

Note: Expect higher replication lag (~50-100 ms, depending on distance).

Connecting to Replicas

Connection Endpoints

Each replica has its own endpoint:

Primary:   redis-123456.danubedata.com:6379
Replica 1: redis-replica-123456-01.danubedata.com:6379
Replica 2: redis-replica-123456-02.danubedata.com:6379

Read-Only Access

Replicas are read-only by default:

import redis

# Primary - read/write
primary = redis.Redis(
    host='redis-123456.danubedata.com',
    port=6379,
    password='password',
    ssl=True
)

# Replica - read-only
replica = redis.Redis(
    host='redis-replica-123456-01.danubedata.com',
    port=6379,
    password='password',
    ssl=True
)

# Writes go to primary
primary.set('key', 'value')  # ✓ Works

# Reads can use replica
value = replica.get('key')  # ✓ Works

# Writes to replica will fail
replica.set('key', 'value')  # ✗ Error: READONLY
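
If a write does reach a replica (for example, through a misrouted connection), redis-py raises a ReadOnlyError for the READONLY response. A minimal fallback sketch, reusing the primary and replica connections created above:

from redis.exceptions import ReadOnlyError

# Sketch: retry a write on the primary if it accidentally hits a replica.
def safe_set(key, value):
    try:
        return replica.set(key, value)  # deliberately wrong target, for illustration
    except ReadOnlyError:
        return primary.set(key, value)  # redirect the write to the primary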

Application Configuration

Python with Read/Write Splitting

from redis import Redis
import random

class RedisClient:
    def __init__(self, primary_host, replica_hosts, **conn_kwargs):
        # conn_kwargs: shared connection options such as port and password
        self.primary = Redis(host=primary_host, ssl=True, **conn_kwargs)
        self.replicas = [Redis(host=host, ssl=True, **conn_kwargs) for host in replica_hosts]
    
    def get_replica(self):
        """Get random replica for load balancing"""
        return random.choice(self.replicas) if self.replicas else self.primary
    
    def get(self, key):
        """Read from replica"""
        return self.get_replica().get(key)
    
    def set(self, key, value, **kwargs):
        """Write to primary"""
        return self.primary.set(key, value, **kwargs)
    
    def delete(self, key):
        """Delete from primary"""
        return self.primary.delete(key)

# Usage
redis_client = RedisClient(
    primary_host='redis-123456.danubedata.com',
    replica_hosts=[
        'redis-replica-123456-01.danubedata.com',
        'redis-replica-123456-02.danubedata.com',
    ],
    password='password',  # shared options forwarded to each connection
)

# Writes to primary
redis_client.set('user:1000', 'John')

# Reads from replica
user = redis_client.get('user:1000')

Laravel Configuration

// config/database.php
'redis' => [
    'client' => env('REDIS_CLIENT', 'phpredis'),
    
    'options' => [
        'cluster' => env('REDIS_CLUSTER', 'redis'),
        'prefix' => env('REDIS_PREFIX', Str::slug(env('APP_NAME', 'laravel'), '_').'_database_'),
    ],

    'default' => [
        'url' => env('REDIS_URL'),
        'host' => env('REDIS_HOST', 'redis-123456.danubedata.com'),
        'password' => env('REDIS_PASSWORD', null),
        'port' => env('REDIS_PORT', '6379'),
        'database' => env('REDIS_DB', '0'),
        'read_write_timeout' => 60,
        'context' => [
            'stream' => [
                'verify_peer' => true,
                'verify_peer_name' => true,
            ],
        ],
    ],

    'replica' => [
        'url' => env('REDIS_REPLICA_URL'),
        'host' => env('REDIS_REPLICA_HOST', 'redis-replica-123456-01.danubedata.com'),
        'password' => env('REDIS_PASSWORD', null),
        'port' => env('REDIS_PORT', '6379'),
        'database' => env('REDIS_DB', '0'),
        'read_write_timeout' => 60,
        'context' => [
            'stream' => [
                'verify_peer' => true,
                'verify_peer_name' => true,
            ],
        ],
    ],
],

// Usage
use Illuminate\Support\Facades\Redis;

// Write to primary
Redis::connection('default')->set('key', 'value');

// Read from replica
$value = Redis::connection('replica')->get('key');

Node.js with Failover

const Redis = require('ioredis');

const primary = new Redis({
  host: 'redis-123456.danubedata.com',
  port: 6379,
  password: 'password',
  tls: {}
});

const replica = new Redis({
  host: 'redis-replica-123456-01.danubedata.com',
  port: 6379,
  password: 'password',
  tls: {},
  retryStrategy(times) {
    // Failover to primary after 3 attempts
    if (times > 3) {
      return null; // Stop retrying
    }
    return Math.min(times * 50, 2000);
  }
});

class RedisManager {
  async get(key) {
    try {
      return await replica.get(key);
    } catch (error) {
      console.warn('Replica read failed, falling back to primary:', error.message);
      return await primary.get(key);
    }
  }
  
  async set(key, value) {
    return await primary.set(key, value);
  }
}

module.exports = new RedisManager();

High Availability Configuration

Automatic Failover

Enable automatic failover for production instances:

  1. Navigate to your Redis instance
  2. Click Settings > High Availability
  3. Enable Automatic Failover
  4. Set Failover Timeout (default: 45 seconds)
  5. Click Save

Failover Process

When primary fails:

  1. Detection: Health check detects primary failure (within the configured failover timeout, 45 seconds by default)
  2. Verification: Multiple checks confirm failure
  3. Promotion: Best replica promoted to primary
  4. DNS Update: Primary endpoint redirected to new primary
  5. Notification: Email alert sent to account owners
  6. Reconnection: Applications automatically reconnect

Expected Downtime: 30-60 seconds for automatic failover
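
Because failover is completed by repointing the primary endpoint's DNS record, clients recover fastest when they abandon dead connections quickly and reconnect (reconnecting performs a fresh DNS lookup). A sketch of helpful client-side settings with redis-py; the timeout values are illustrative, not requirements:

import redis

# Sketch: fail fast on a dead primary so reconnection (and the DNS update)
# is picked up quickly after failover.
client = redis.Redis(
    host='redis-123456.danubedata.com',  # primary endpoint from this guide
    port=6379,
    password='password',
    ssl=True,
    socket_connect_timeout=5,   # don't wait long for an unreachable host
    socket_timeout=5,           # don't hang on a dead connection
    retry_on_timeout=True,      # retry the command once after a timeout
    health_check_interval=30,   # periodically verify idle connections
)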

Manual Failover

Trigger manual failover for maintenance:

  1. Navigate to your Redis instance
  2. Click Replicas tab
  3. Select replica to promote
  4. Click Promote to Primary
  5. Confirm promotion

Manual failover completes within seconds.

Handling Failover in Applications

Connection Retry Logic

import redis
import time
from redis.exceptions import ConnectionError

def redis_operation_with_retry(func, *args, max_retries=3, **kwargs):
    """Execute Redis operation with retry logic"""
    for attempt in range(max_retries):
        try:
            return func(*args, **kwargs)
        except ConnectionError as e:
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)  # Exponential backoff
            # Drop stale connections so the next attempt reconnects
            # (func is a bound method of a redis.Redis client)
            func.__self__.connection_pool.disconnect()

# Usage
try:
    redis_operation_with_retry(redis_client.get, 'key')
except ConnectionError:
    # Handle permanent failure
    pass

Circuit Breaker Pattern

class CircuitBreaker {
  constructor(redis, options = {}) {
    this.redis = redis;
    this.failures = 0;
    this.threshold = options.threshold || 5;
    this.timeout = options.timeout || 60000;
    this.state = 'CLOSED'; // CLOSED, OPEN, HALF_OPEN
    this.openedAt = null;  // set when the breaker opens
  }
  
  async execute(operation) {
    if (this.state === 'OPEN') {
      if (Date.now() - this.openedAt > this.timeout) {
        this.state = 'HALF_OPEN';
      } else {
        throw new Error('Circuit breaker is OPEN');
      }
    }
    
    try {
      const result = await operation();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }
  
  onSuccess() {
    this.failures = 0;
    if (this.state === 'HALF_OPEN') {
      this.state = 'CLOSED';
    }
  }
  
  onFailure() {
    this.failures++;
    if (this.failures >= this.threshold) {
      this.state = 'OPEN';
      this.openedAt = Date.now();
    }
  }
}

Monitoring Replicas

Key Metrics

Monitor these metrics for replicas:

  • Replication Lag: Time/bytes behind primary
  • Connected Replicas: Number of connected replicas
  • Replication Offset: Bytes replicated
  • Replica Health: Overall replica status
  • Connection Count: Client connections per replica
  • Memory Usage: Memory consumption per replica

Redis Commands

Check replication status:

# On primary
INFO replication

# Output:
# role:master
# connected_slaves:2
# slave0:ip=10.0.1.5,port=6379,state=online,offset=1234567,lag=0
# slave1:ip=10.0.2.5,port=6379,state=online,offset=1234567,lag=0

# On replica
INFO replication

# Output:
# role:slave
# master_host:redis-123456.danubedata.com
# master_port:6379
# master_link_status:up
# master_last_io_seconds_ago:0
# master_sync_in_progress:0

Monitoring Script

import redis

def check_replication_health(primary_host, replica_hosts, **conn_kwargs):
    """Check replication health for all replicas"""
    primary = redis.Redis(host=primary_host, **conn_kwargs)
    
    # Get primary info
    primary_info = primary.info('replication')
    print(f"Primary: {primary_info['role']}")
    print(f"Connected replicas: {primary_info['connected_slaves']}")
    
    # Check each replica
    for i, replica_host in enumerate(replica_hosts):
        try:
            replica = redis.Redis(host=replica_host, **conn_kwargs)
            info = replica.info('replication')
            
            print(f"\nReplica {i+1}:")
            print(f"  Status: {info['master_link_status']}")
            print(f"  Lag: {info['master_last_io_seconds_ago']}s")
            print(f"  Sync in progress: {info['master_sync_in_progress']}")
        except Exception as e:
            print(f"\nReplica {i+1}: ERROR - {e}")

# Run check
check_replication_health(
    'redis-123456.danubedata.com',
    ['redis-replica-123456-01.danubedata.com', 
     'redis-replica-123456-02.danubedata.com'],
    password='password',
    ssl=True,
)

Replica Management

Scaling Read Capacity

Add more replicas to scale reads:

  1. Create additional replicas
  2. Update application configuration with new endpoints
  3. Implement load balancing across all replicas (see the sketch below)
  4. Monitor distribution of read traffic
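
A minimal load-balancing sketch, assuming the two placeholder replica endpoints from this guide, is to rotate reads round-robin across replica connections:

import itertools
import redis

# Sketch: distribute reads evenly across replicas with a round-robin cycle.
replica_hosts = [
    'redis-replica-123456-01.danubedata.com',
    'redis-replica-123456-02.danubedata.com',
]
replicas = [redis.Redis(host=h, port=6379, password='password', ssl=True)
            for h in replica_hosts]
next_replica = itertools.cycle(replicas)

def read(key):
    """Send each read to the next replica in turn."""
    return next(next_replica).get(key)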

Resizing Replicas

Change replica resource profile:

  1. Navigate to replica in dashboard
  2. Click Resize
  3. Select new profile
  4. Confirm resize

Note: A replica can have a different profile than the primary

Removing Replicas

Delete unused replicas:

  1. Navigate to replica in dashboard
  2. Click Delete
  3. Confirm deletion
  4. Update application configuration to remove endpoint

Synchronous Replication

For critical data requiring zero data loss:

Enabling Synchronous Replication

  1. Navigate to your Redis instance
  2. Click Settings > Replication
  3. Enable Synchronous Replication
  4. Set Minimum Replicas (e.g., 1)
  5. Click Save

How It Works

With synchronous replication:

  • Write operations wait for acknowledgment from replicas
  • Guarantees zero data loss on failover
  • Higher latency for write operations (typically +5-10ms)
  • Write fails if minimum replicas not available
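
At the application level, Redis also exposes the WAIT command, which blocks until preceding writes have been acknowledged by a given number of replicas. A minimal redis-py sketch (the replica count and timeout are illustrative); note that WAIT confirms propagation to replicas, not durability to disk:

import redis

# Sketch: require at least 1 replica to acknowledge the preceding write,
# waiting at most 100 ms. WAIT returns the number of replicas that acked.
client = redis.Redis(host='redis-123456.danubedata.com', port=6379,
                     password='password', ssl=True)

client.set('balance:1000', '250.00')
acked = client.wait(1, 100)  # (num_replicas, timeout_ms)
if acked < 1:
    raise RuntimeError('write not yet acknowledged by any replica')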

Trade-offs

Pros:

  • Zero data loss guarantee
  • Strong consistency
  • Perfect for financial/critical data

Cons:

  • Increased write latency
  • Reduced write throughput
  • Availability depends on replica health

Best Practices

Application Design

  1. Separate Connections: Use different connections for primary and replicas
  2. Read from Replicas: Route all read traffic to replicas when possible
  3. Write to Primary: Always write to primary
  4. Handle Failover: Implement retry logic with exponential backoff
  5. Connection Pooling: Use pools for both primary and replica connections
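
A minimal pooling sketch with redis-py, assuming the placeholder endpoints from this guide (pool sizes are illustrative):

import redis

# Sketch: one connection pool per endpoint, created once and shared
# across the application. The rediss:// scheme enables TLS.
primary_pool = redis.ConnectionPool.from_url(
    'rediss://:password@redis-123456.danubedata.com:6379/0', max_connections=20)
replica_pool = redis.ConnectionPool.from_url(
    'rediss://:password@redis-replica-123456-01.danubedata.com:6379/0', max_connections=50)

primary = redis.Redis(connection_pool=primary_pool)
replica = redis.Redis(connection_pool=replica_pool)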

Scaling Strategy

  1. Start with One Replica: Provide high availability
  2. Add Replicas for Reads: Scale horizontally as needed
  3. Cross-Region Replica: Add for disaster recovery
  4. Monitor Load: Watch primary and replica utilization
  5. Load Balance: Distribute reads evenly across replicas

High Availability

  1. Enable Auto-Failover: Critical for production
  2. Multiple Replicas: At least 2 for redundancy
  3. Cross-AZ Deployment: Replicas in different availability zones
  4. Regular Testing: Test failover procedures monthly
  5. Monitoring and Alerts: Set up alerts for replication lag
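
As a starting point for lag alerts, a small check like the following can run from cron or a monitoring agent and exit non-zero when a replica looks unhealthy. The threshold and endpoint are illustrative:

import sys
import redis

# Sketch: flag a replica whose last primary I/O is older than a threshold.
LAG_THRESHOLD_SECONDS = 5

replica = redis.Redis(host='redis-replica-123456-01.danubedata.com', port=6379,
                      password='password', ssl=True)
info = replica.info('replication')

if info['master_link_status'] != 'up' or info['master_last_io_seconds_ago'] > LAG_THRESHOLD_SECONDS:
    print(f"ALERT: replica lagging or disconnected "
          f"({info['master_last_io_seconds_ago']}s since last primary I/O)")
    sys.exit(1)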

Performance

  1. Connection Pooling: Reuse connections efficiently
  2. Pipeline Commands: Batch operations when possible (see the sketch after this list)
  3. Monitor Lag: Keep replication lag under 100ms
  4. Right-Size Resources: Ensure replicas have adequate resources
  5. Network Proximity: Place replicas close to application servers
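
For example, a redis-py pipeline batches many commands into a single round trip; the key names below are illustrative:

import redis

# Sketch: send 100 writes in one round trip instead of 100.
primary = redis.Redis(host='redis-123456.danubedata.com', port=6379,
                      password='password', ssl=True)

with primary.pipeline(transaction=False) as pipe:
    for user_id in range(1000, 1100):
        pipe.set(f'user:{user_id}:active', '1')
    pipe.execute()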

Troubleshooting

High Replication Lag

Symptoms: Replica falling behind primary

Causes:

  • High write load on primary
  • Network bandwidth limitations
  • Undersized replica resources
  • Large bulk operations

Solutions:

  • Upgrade replica to larger profile
  • Optimize write operations on primary
  • Split large operations into smaller batches
  • Check network connectivity
  • Monitor primary CPU/memory usage

Replica Connection Failures

Symptoms: Cannot connect to replica

Solutions:

  • Check replica status in dashboard
  • Verify connection details and credentials
  • Test with redis-cli
  • Check firewall rules
  • Review application logs for errors

Replica Out of Sync

Symptoms: INFO replication shows master_sync_in_progress:1 for an extended period, or master_link_status:down

Solutions:

  • Check primary and replica health
  • Verify network connectivity
  • Review replication logs in dashboard
  • Rebuild replica if necessary
  • Contact support if issue persists

Failover Not Working

Symptoms: Primary fails but replica not promoted

Causes:

  • Auto-failover not enabled
  • No healthy replicas available
  • Replication lag too high
  • Network partitioning

Solutions:

  • Verify auto-failover is enabled
  • Check replica health status
  • Ensure replicas are online and synced
  • Manually promote replica if needed
  • Review failover logs

Related Documentation