Files
Compass/docs/CONNECTION_POOL_FIXES.md
2026-03-06 15:27:49 +01:00

3.3 KiB

Connection Pooling Fixes Summary

Issue Identified

The Compass application was experiencing database connection pool exhaustion with the error:

MaxClientsInSessionMode: max clients reached - in Session mode max clients are limited to pool_size

This was causing API endpoints like /get-channel-messages to fail with generic "Error getting messages" errors.

Root Causes

  1. Insufficient Pool Size: Original configuration had max: 20 connections which was inadequate for production load
  2. Poor Timeout Configuration: Query timeout was set to 1 hour which could cause connections to hang indefinitely
  3. Generic Error Handling: Database connection errors were masked with generic error messages
  4. Missing Pool Configuration: Idle timeout and connection timeout settings were not properly configured

Fixes Implemented

1. Enhanced Error Handling

// In get-private-messages.ts
if (error) {
  console.error('Error getting messages:', error)
  // If it's a connection pool error, provide more specific error message
  if (error.message && error.message.includes('MaxClientsInSessionMode')) {
    throw APIErrors.serviceUnavailable(
      'Service temporarily unavailable due to high demand. Please try again in a moment.',
    )
  }
  throw APIErrors.internalServerError('Error getting messages', {
    field: 'database',
    context: error.message || 'Unknown database error',
  })
}

2. Optimized Pool Configuration

// In init.ts - Main connection pool
const client = newClient({
  query_timeout: 30000, // Reduced from 1 hour to 30 seconds
  max: 30, // Increased from 20 to 30 connections
})

// Pool configuration
pool.idleTimeoutMillis = 30000 // Close idle connections after 30 seconds
pool.connectionTimeoutMillis = 10000 // Timeout for acquiring connection (10 seconds)

3. Separate Short Timeout Pool

// Dedicated pool for operations that should complete quickly
const shortTimeoutPgpClient = newClient({
  query_timeout: 30000,
  max: 10, // Smaller pool for specialized use cases
})

pool.idleTimeoutMillis = 15000 // Shorter idle timeout
pool.connectionTimeoutMillis = 5000 // Shorter connection timeout

Monitoring Improvements

Enhanced connection pool monitoring with metrics:

  • pg/connections_established - Track new connections
  • pg/connections_terminated - Track connection closures
  • pg/connections_acquired - Track connection acquisition
  • pg/connections_released - Track connection release
  • pg/pool_connections - Monitor pool state (waiting, idle, expired, total)

Best Practices Documented

Created comprehensive documentation covering:

  • Connection pooling fundamentals
  • Configuration tuning for different environments
  • Error handling patterns
  • Monitoring and alerting strategies
  • Common pitfalls to avoid

Impact

These changes should significantly reduce:

  • Connection pool exhaustion errors
  • API response times under load
  • Resource consumption
  • User-visible errors due to database issues

Validation

To verify the fixes are working:

  1. Monitor logs for the absence of "MaxClientsInSessionMode" errors
  2. Check metrics for healthy pool utilization
  3. Test high-concurrency scenarios
  4. Verify API response times remain consistent under load

Implemented: March 2026