15 May 2026 1 min read system-design

Building a Centralized Logging System for Multiple Applications

When you run one application, logs are simple: you read the file. When you run fifty microservices across hundreds of servers, logs are scattered everywhere, and finding a specific error means connecting to multiple servers and grepping multiple files. A centralized logging system solves this.

Key sources: ELK Stack documentation, "Site Reliability Engineering" (Beyer, Jones, Petoff, Murphy).

The Problem

A microservice application running on 50 servers generates gigabytes of logs daily. When a user reports an error, the support team needs to:

Find which service generated the error
Find which server the request ran on
Find the relevant log entries
Correlate them with logs from other services

Without centralized logging, this can take hours. With it, a single search usually takes seconds.

Architecture

Services → Log Shipper → Buffer → Indexer → Storage → Search API
Optional: Enrichment (add metadata, parse structured fields)

Log Shipper (Filebeat, Fluentd, Vector)

Installed on every server. It tails log files and forwards new entries to the central system. It handles:

Reading new log entries as they appear
Parsing structured log formats (JSON, logfmt)
Adding metadata (server name, service name, timestamp)
Buffering locally if the central system is unavailable
Compressing and sending data efficiently
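The responsibilities above can be sketched in a few lines of Python. This is a toy illustration, not how Filebeat, Fluentd, or Vector are implemented: the `LogShipper` class and the `send` callable are hypothetical names, compression is left out, and the local buffer lives only in memory.

```python
import json
import socket
from collections import deque
from datetime import datetime, timezone


class LogShipper:
    """Toy shipper: reads new lines, parses them, adds metadata, buffers on failure."""

    def __init__(self, service_name, send, max_buffered=10_000):
        self.service_name = service_name
        self.send = send          # hypothetical callable that forwards one record
        self.offset = 0           # byte offset of the next unread line
        self.pending = deque(maxlen=max_buffered)  # oldest records drop when full

    def read_new_lines(self, path):
        """Return complete lines appended to the file since the previous call."""
        with open(path, "rb") as f:
            f.seek(self.offset)
            chunk = f.read()
        end = chunk.rfind(b"\n") + 1   # leave a trailing partial line for next time
        self.offset += end
        lines = chunk[:end].decode("utf-8").splitlines()
        return [line for line in lines if line.strip()]

    def enrich(self, line):
        """Parse JSON if possible, then attach host, service, and timestamp."""
        try:
            record = json.loads(line)
            if not isinstance(record, dict):
                record = {"message": line}
        except json.JSONDecodeError:
            record = {"message": line}   # keep unstructured lines as plain messages
        record.setdefault("timestamp", datetime.now(timezone.utc).isoformat())
        record["host"] = socket.gethostname()
        record["service"] = self.service_name
        return record

    def ship(self, path):
        """Queue new entries, then forward as many as the central system accepts."""
        for line in self.read_new_lines(path):
            self.pending.append(self.enrich(line))
        while self.pending:
            try:
                self.send(self.pending[0])
            except ConnectionError:
                break                    # central system unavailable: retry later
            self.pending.popleft()
```

Calling `ship(path)` on a timer gives the tailing behavior: each call picks up only the bytes written since the last one, and records that could not be sent stay in `pending` until a later call succeeds.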

Buffer (Kafka, Redis)

A message queue that decouples log shippers from the indexer. If the indexer goes down, logs accumulate in the buffer. When the indexer recovers, it works through the backlog.
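The decoupling can be shown with an in-memory queue standing in for Kafka or Redis. This is a minimal sketch under that assumption: `LogBuffer` and `FlakyIndexer` are hypothetical names, and a real buffer would persist records to disk rather than hold them in process memory.

```python
from collections import deque


class LogBuffer:
    """In-memory stand-in for a durable queue such as a Kafka topic or a Redis list."""

    def __init__(self):
        self._queue = deque()

    def publish(self, record):
        # Shippers only ever talk to the buffer, never to the indexer directly.
        self._queue.append(record)

    def consume(self, handler, max_records=100):
        """Hand records to the indexer; stop at the first failure and keep the rest."""
        processed = 0
        while self._queue and processed < max_records:
            try:
                handler(self._queue[0])
            except ConnectionError:
                break                  # indexer is down: the record stays queued
            self._queue.popleft()
            processed += 1
        return processed

    def __len__(self):
        return len(self._queue)


class FlakyIndexer:
    """Hypothetical indexer that can be switched off to simulate an outage."""

    def __init__(self):
        self.up = True
        self.indexed = []

    def index(self, record):
        if not self.up:
            raise ConnectionError("indexer unavailable")
        self.indexed.append(record)


buffer = LogBuffer()
indexer = FlakyIndexer()

indexer.up = False
for seq in range(3):
    buffer.publish({"event": "payment_processed", "seq": seq})
buffer.consume(indexer.index)    # outage: nothing indexed, nothing lost
backlog_during_outage = len(buffer)

indexer.up = True
buffer.consume(indexer.index)    # recovery: the backlog is processed in order
```

A record is removed from the queue only after the handler returns without error, so an outage in the middle of a batch cannot drop entries.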

Indexer (Logstash, Elasticsearch)

Parses, transforms, and indexes log entries. Each field becomes searchable: service_name, severity, timestamp, trace_id, message.
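As a rough illustration of the transform step, the function below normalizes one record before it is indexed. It is a sketch, not Logstash's behavior: the `normalize` function, the `SEVERITY_ALIASES` table, and the rule that naive timestamps are treated as UTC are all assumptions made for this example.

```python
from datetime import datetime, timezone

# Hypothetical alias table so that every service ends up with the same severity names.
SEVERITY_ALIASES = {"warn": "warning", "err": "error", "fatal": "critical"}


def normalize(record):
    """Return a copy of the record with consistent field names and types."""
    doc = dict(record)

    # Accept either "level" or "severity" from services and store it as "severity".
    severity = str(doc.pop("level", doc.get("severity", "info"))).lower()
    doc["severity"] = SEVERITY_ALIASES.get(severity, severity)

    # Parse ISO 8601 timestamps and convert them to UTC so they sort consistently.
    ts = datetime.fromisoformat(doc["timestamp"])
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)   # assumption: naive timestamps are UTC
    doc["timestamp"] = ts.astimezone(timezone.utc).isoformat()

    # Make sure the searchable fields exist on every document.
    for field in ("service_name", "trace_id", "message"):
        doc.setdefault(field, None)
    return doc


doc = normalize({
    "level": "WARN",
    "timestamp": "2026-05-15T10:00:00+02:00",
    "service_name": "payment",
    "message": "card declined",
})
```

Normalizing at this stage means a query for one severity value matches entries from every service, regardless of how each service spelled the level.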

Storage and Search (Elasticsearch)

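Elasticsearch builds an inverted index over the fields of each log entry, mapping every term to the documents that contain it. Because a query looks up terms instead of scanning every entry, searches stay fast even on terabytes of logs: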

Search: service_name:"payment" AND severity:"error"
Result: all error logs from the payment service within the selected time range (for example, the last 24 hours)
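The idea behind an inverted index fits in a short sketch. This is a simplified model, not Elasticsearch's implementation: the `InvertedIndex` class is hypothetical, it matches whole field values only, and it does no text analysis or tokenization.

```python
from collections import defaultdict


class InvertedIndex:
    """Maps (field, value) pairs to the IDs of the documents that contain them."""

    def __init__(self):
        self.docs = []
        self.postings = defaultdict(set)

    def add(self, doc):
        doc_id = len(self.docs)
        self.docs.append(doc)
        for field, value in doc.items():
            self.postings[(field, str(value).lower())].add(doc_id)
        return doc_id

    def search(self, **terms):
        """AND query: return the documents that match every field=value term."""
        if not terms:
            return []
        id_sets = [
            self.postings.get((field, str(value).lower()), set())
            for field, value in terms.items()
        ]
        matching = set.intersection(*id_sets)
        return [self.docs[i] for i in sorted(matching)]


index = InvertedIndex()
index.add({"service_name": "payment", "severity": "error", "message": "card declined"})
index.add({"service_name": "payment", "severity": "info", "message": "payment ok"})
index.add({"service_name": "checkout", "severity": "error", "message": "cart timeout"})

hits = index.search(service_name="payment", severity="error")
```

The query touches two posting sets and intersects them, so its cost depends on the number of matching entries, not on the total number of stored logs.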

Structured Logging

The foundation of good centralized logging is structured logs. Instead of printing unstructured text:

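```python
print(f"User {user_id} paid ${amount}")
```

Use a structured format that the log shipper can parse:

```python
import structlog

logger = structlog.get_logger()
# Key-value pairs become separate fields in the rendered log entry.
logger.info("payment_processed", user_id=user_id, amount=amount)
```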

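Output (assuming structlog is configured with a JSON renderer and a timestamp processor, and `service` is bound to the logger):

```json
{"event": "payment_processed", "user_id": 123, "amount": 99.99, "timestamp": "...", "service": "payment"}
```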

Every field is filterable and searchable.
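The same kind of output can be produced with only the standard library's `logging` module instead of structlog. The sketch below is one possible approach: the `JsonFormatter` class, the `fields` key passed through `extra`, and the `trace_id` value are hypothetical choices for this example, and an in-memory stream stands in for the log file a shipper would tail.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def __init__(self, service):
        super().__init__()
        self.service = service

    def format(self, record):
        payload = {
            "event": record.getMessage(),
            "severity": record.levelname.lower(),
            "timestamp": datetime.fromtimestamp(record.created, timezone.utc).isoformat(),
            "service": self.service,
        }
        # Fields passed via extra={"fields": {...}} become top-level keys.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


stream = io.StringIO()           # stands in for the log file a shipper would tail
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter(service="payment"))

logger = logging.getLogger("payment")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False         # keep the example output out of the root logger

logger.info(
    "payment_processed",
    extra={"fields": {"user_id": 123, "amount": 99.99, "trace_id": "abc123"}},
)
logger.debug("cache_lookup")     # below INFO, so it is never written

entry = json.loads(stream.getvalue().splitlines()[0])
```

Because the severity and the trace ID are ordinary fields, the central system can filter on them the same way it filters on `service` or `user_id`.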

Key Takeaways

Centralized logging aggregates logs from all services into a single searchable system.
The architecture: log shipper → buffer → indexer → storage → search.
Structured logging (JSON) makes logs machine-parseable and searchable.
Log levels (debug, info, warn, error) enable filtering by severity.
Correlation IDs trace a request across multiple services in logs.
