3 min read

Designing a Database for Thousands of Users at the Same Time

When thousands of users access a database simultaneously, contention becomes the primary challenge. Multiple users reading and writing the same data at the same time can lead to inconsistent results, deadlocks, and performance degradation. This article explores how databases handle concurrent access.

Key sources: "Database Internals" by Alex Petrov, PostgreSQL documentation, "High Performance MySQL" by Baron Schwartz.


The Concurrency Problem

Consider a ticket booking system. Two users try to buy the last ticket simultaneously:

User A: Check availability → 1 ticket left
User B: Check availability → 1 ticket left
User A: Book ticket → Success
User B: Book ticket → Success (but only 1 ticket existed!)

Without proper concurrency control, both users believe they got the last ticket. This is a classic race condition.


Isolation Levels

Databases use isolation levels to control how transactions interact. The SQL standard defines four levels:

| Isolation Level | Dirty Read | Non-Repeatable Read | Phantom Read | |----------------|-----------|---------------------|--------------| | Read Uncommitted | Possible | Possible | Possible | | Read Committed | Prevented | Possible | Possible | | Repeatable Read | Prevented | Prevented | Possible | | Serializable | Prevented | Prevented | Prevented |

Read Uncommitted

A transaction can read data written by another uncommitted transaction. If the other transaction rolls back, the reader has seen data that never existed (dirty read).

Use case: Rarely used. Only for approximate data where accuracy does not matter.

Read Committed (PostgreSQL default)

A transaction only reads data that has been committed. This prevents dirty reads. However, two reads in the same transaction may return different results if another transaction commits between them (non-repeatable read).

-- Transaction A
BEGIN;
SELECT balance FROM accounts WHERE id = 1;  -- Returns $100
-- Transaction B commits a $50 deduction
SELECT balance FROM accounts WHERE id = 1;  -- Returns $50 (different!)
COMMIT;

Use case: Default for most applications. Good balance of performance and consistency.

Repeatable Read

A transaction sees a consistent snapshot of the data as of the start of the transaction. Repeated reads return the same result, even if other transactions commit changes.

-- Transaction A
BEGIN;
SELECT balance FROM accounts WHERE id = 1;  -- Returns $100
-- Transaction B commits a $50 deduction
SELECT balance FROM accounts WHERE id = 1;  -- Still returns $100 (same!)
COMMIT;

Use case: Reporting, long-running reads, applications that need consistent snapshots.

Serializable

Transactions are executed as if they ran one at a time. This is the strongest isolation level. It prevents all anomalies but has the highest performance cost.

Use case: Financial transactions, inventory management, any system where correctness is critical.


Locking Mechanisms

Row-Level Locking

The database locks individual rows to prevent concurrent modifications:

BEGIN;
SELECT * FROM tickets WHERE id = 100 FOR UPDATE;
-- Other transactions cannot modify or read this row with FOR UPDATE
UPDATE tickets SET sold = true WHERE id = 100;
COMMIT;
-- Lock is released

The SELECT ... FOR UPDATE acquires an exclusive lock on the row. Other transactions that try to lock or modify the same row wait until the lock is released.

Deadlocks

A deadlock occurs when two transactions each hold a lock that the other needs:

Transaction A                      Transaction B
  Lock row 1                         Lock row 2
  Request lock row 2 ← BLOCKED       Request lock row 1 ← BLOCKED

Both transactions wait forever. Databases detect deadlocks and abort one of the transactions, allowing the other to proceed.

Prevention: Always access rows in the same order. If both transactions lock row 1 first, then row 2, deadlocks cannot occur.

Optimistic Locking

Instead of locking rows, optimistic locking assumes conflicts are rare. Each row has a version number:

-- Read the current version
SELECT id, balance, version FROM accounts WHERE id = 1;
-- Returns: id=1, balance=100, version=5

-- Update only if version matches
UPDATE accounts
SET balance = 150, version = 6
WHERE id = 1 AND version = 5;
-- If another transaction updated this row, version is no longer 5
-- The UPDATE affects 0 rows → conflict detected

This approach works well when conflicts are rare (read-heavy workloads) and retries are acceptable.


Connection Pooling

Each database connection consumes memory (typically 5-10 MB). With thousands of users, opening a connection per user is not feasible. Connection pooling reuses a fixed set of connections:

Application Thread ─→ Connection Pool ─→ Database
                                    │
                              ┌─────┴─────┐
                              │ Connection │
                              │ Connection │
                              │ Connection │
                              └───────────┘

A pool of 50-100 connections can handle thousands of concurrent requests because:

  1. Most requests spend time waiting for I/O (network, disk)
  2. A connection is used briefly for the database query, then returned to the pool
  3. The pool queues requests when all connections are busy

Read Replicas

For read-heavy workloads, read replicas distribute the load:

Writes → Primary Database
Reads  → Replica 1, Replica 2, Replica 3 (load balanced)

This separates read traffic from write traffic. The primary handles writes and replicates to replicas asynchronously. Applications send read queries to replicas and write queries to the primary.

Caveat: Asynchronous replication means replicas may be slightly behind the primary (replication lag). A user who just submitted a form may not see their change when reading from a replica.


Sharding

When a single database instance cannot handle the load, data is sharded (partitioned) across multiple instances:

User ID 1-1000 → Database Server A
User ID 1001-2000 → Database Server B
User ID 2001-3000 → Database Server C

Each shard is an independent database. Sharding distributes both storage and query load. The application must know which shard to query (or use a proxy that routes queries).


Key Takeaways

  1. Concurrent database access requires isolation levels to prevent race conditions.
  2. Read Committed is the default for most applications. Serializable is for correctness-critical operations.
  3. Row-level locking prevents concurrent modifications but can cause deadlocks.
  4. Optimistic locking works well for read-heavy workloads with rare conflicts.
  5. Connection pooling allows a small number of database connections to serve many concurrent users.
  6. Read replicas distribute read traffic. Sharding distributes both reads and writes.

Design principle: Choose the weakest isolation level that meets your correctness requirements. Higher isolation has a performance cost.