mirror of
https://github.com/CompassConnections/Compass.git
synced 2026-03-24 17:41:27 -04:00
109 lines
3.3 KiB
Markdown
109 lines
3.3 KiB
Markdown
# Connection Pooling Fixes Summary
|
|
|
|
## Issue Identified
|
|
|
|
The Compass application was experiencing database connection pool exhaustion with the error:
|
|
|
|
```
|
|
MaxClientsInSessionMode: max clients reached - in Session mode max clients are limited to pool_size
|
|
```
|
|
|
|
This was causing API endpoints like `/get-channel-messages` to fail with generic "Error getting messages" errors.
|
|
|
|
## Root Causes
|
|
|
|
1. **Insufficient Pool Size**: Original configuration had `max: 20` connections which was inadequate for production load
|
|
2. **Poor Timeout Configuration**: Query timeout was set to 1 hour which could cause connections to hang indefinitely
|
|
3. **Generic Error Handling**: Database connection errors were masked with generic error messages
|
|
4. **Missing Pool Configuration**: Idle timeout and connection timeout settings were not properly configured
|
|
|
|
## Fixes Implemented
|
|
|
|
### 1. Enhanced Error Handling
|
|
|
|
```typescript
|
|
// In get-private-messages.ts
|
|
if (error) {
|
|
console.error('Error getting messages:', error)
|
|
// If it's a connection pool error, provide more specific error message
|
|
if (error.message && error.message.includes('MaxClientsInSessionMode')) {
|
|
throw APIErrors.serviceUnavailable(
|
|
'Service temporarily unavailable due to high demand. Please try again in a moment.',
|
|
)
|
|
}
|
|
throw APIErrors.internalServerError('Error getting messages', {
|
|
field: 'database',
|
|
context: error.message || 'Unknown database error',
|
|
})
|
|
}
|
|
```
|
|
|
|
### 2. Optimized Pool Configuration
|
|
|
|
```typescript
|
|
// In init.ts - Main connection pool
|
|
const client = newClient({
|
|
query_timeout: 30000, // Reduced from 1 hour to 30 seconds
|
|
max: 30, // Increased from 20 to 30 connections
|
|
})
|
|
|
|
// Pool configuration
|
|
pool.idleTimeoutMillis = 30000 // Close idle connections after 30 seconds
|
|
pool.connectionTimeoutMillis = 10000 // Timeout for acquiring connection (10 seconds)
|
|
```
|
|
|
|
### 3. Separate Short Timeout Pool
|
|
|
|
```typescript
|
|
// Dedicated pool for operations that should complete quickly
|
|
const shortTimeoutPgpClient = newClient({
|
|
query_timeout: 30000,
|
|
max: 10, // Smaller pool for specialized use cases
|
|
})
|
|
|
|
pool.idleTimeoutMillis = 15000 // Shorter idle timeout
|
|
pool.connectionTimeoutMillis = 5000 // Shorter connection timeout
|
|
```
|
|
|
|
## Monitoring Improvements
|
|
|
|
Enhanced connection pool monitoring with metrics:
|
|
|
|
- `pg/connections_established` - Track new connections
|
|
- `pg/connections_terminated` - Track connection closures
|
|
- `pg/connections_acquired` - Track connection acquisition
|
|
- `pg/connections_released` - Track connection release
|
|
- `pg/pool_connections` - Monitor pool state (waiting, idle, expired, total)
|
|
|
|
## Best Practices Documented
|
|
|
|
Created comprehensive documentation covering:
|
|
|
|
- Connection pooling fundamentals
|
|
- Configuration tuning for different environments
|
|
- Error handling patterns
|
|
- Monitoring and alerting strategies
|
|
- Common pitfalls to avoid
|
|
|
|
## Impact
|
|
|
|
These changes should significantly reduce:
|
|
|
|
- Connection pool exhaustion errors
|
|
- API response times under load
|
|
- Resource consumption
|
|
- User-visible errors due to database issues
|
|
|
|
## Validation
|
|
|
|
To verify the fixes are working:
|
|
|
|
1. Monitor logs for the absence of "MaxClientsInSessionMode" errors
|
|
2. Check metrics for healthy pool utilization
|
|
3. Test high-concurrency scenarios
|
|
4. Verify API response times remain consistent under load
|
|
|
|
---
|
|
|
|
_Implemented: March 2026_
|