Building a Centralized Logging System for Multiple Applications
When you run one application, logs are simple — read the file. When you run fifty microservices across hundreds of servers, logs are scattered everywhere. Finding a specific error requires connecting to multiple servers and grepping multiple files. A centralized logging system solves this.
Key sources: ELK Stack documentation, "Site Reliability Engineering" (Beyer, Jones, Petoff, Murphy).
The Problem
A microservice application running on 50 servers generates gigabytes of logs daily. When a user reports an error, the support team needs to:
- Find which service generated the error
- Find which server the request ran on
- Find the relevant log entries
- Correlate them with logs from other services
Without centralized logging, this can take hours of manual searching across servers. With it, a single query typically returns the answer in seconds.
Architecture
Services → Log Shipper → Buffer → Indexer → Storage → Search API
Optional stage: Enrichment (add metadata, parse structured fields)
Log Shipper (Filebeat, Fluentd, Vector)
Installed on every server. It tails log files and forwards them to the central system. It handles:
- Reading new log entries as they appear
- Parsing structured log formats (JSON, logfmt)
- Adding metadata (server name, service name, timestamp)
- Buffering locally if the central system is unavailable
- Compressing and sending data efficiently
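The responsibilities listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Filebeat, Fluentd, or Vector are implemented. The function names `enrich` and `ship_batch`, and the metadata field names `service` and `host`, are assumptions made for this example.

```python
import gzip
import json
import socket
from typing import Iterable, Iterator


def enrich(lines: Iterable[str], service: str, host: str = "") -> Iterator[dict]:
    """Parse JSON log lines and attach shipper-side metadata."""
    host = host or socket.gethostname()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
            if not isinstance(record, dict):
                # Valid JSON but not an object (e.g. a bare number): wrap it.
                record = {"message": line}
        except json.JSONDecodeError:
            # Unstructured line: keep the raw text rather than dropping it.
            record = {"message": line}
        # Do not overwrite fields the application already set.
        record.setdefault("service", service)
        record.setdefault("host", host)
        yield record


def ship_batch(records: Iterable[dict]) -> bytes:
    """Serialize records as newline-delimited JSON and gzip them for transport."""
    payload = "\n".join(json.dumps(r, sort_keys=True) for r in records)
    return gzip.compress(payload.encode("utf-8"))
```

A real shipper would also track its read position in each file and retry failed sends; both are omitted here to keep the sketch short.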
Buffer (Kafka, Redis)
A message queue that decouples log shippers from the indexer. If the indexer goes down, shippers keep writing and logs accumulate in the buffer. When the indexer recovers, it processes the backlog.
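The decoupling can be illustrated with a small in-memory stand-in for the buffer. This is a sketch under simplifying assumptions: `LogBuffer` and its methods are hypothetical names, and a real deployment would use a durable system such as Kafka rather than a Python `deque`.

```python
from collections import deque


class LogBuffer:
    """In-memory stand-in for a durable queue such as a Kafka topic."""

    def __init__(self) -> None:
        self._queue = deque()

    def publish(self, record: dict) -> None:
        # Shippers only append; they never wait for the indexer.
        self._queue.append(record)

    def drain(self, handler, max_records: int = 100) -> int:
        """Hand up to max_records to the indexer, oldest first."""
        processed = 0
        while self._queue and processed < max_records:
            handler(self._queue[0])  # may raise if the indexer is down
            self._queue.popleft()    # remove only after successful handling
            processed += 1
        return processed

    def __len__(self) -> int:
        return len(self._queue)
```

Because a record is removed only after the handler returns, a failing indexer leaves the backlog intact, and the next successful `drain` call picks up where the previous one stopped.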
Indexer (Logstash, Elasticsearch)
Parses, transforms, and indexes log entries. Each field becomes searchable: service_name, severity, timestamp, trace_id, message.
Storage and Search (Elasticsearch)
Elasticsearch stores each log entry as a document and builds inverted indexes over its fields, so queries stay fast even across terabytes of logs:
Search: service_name:"payment" AND severity:"error"
Result: All error logs from the payment service within the selected time range (for example, the last 24 hours)
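To make the idea of an inverted index concrete, here is a toy version in Python. It is a deliberately simplified sketch, not Elasticsearch's actual data structure: `TinyIndex` is a hypothetical class that indexes whole field values only (no text analysis, scoring, or time ranges) and assumes all field values are hashable.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: maps (field, value) to the IDs of matching documents."""

    def __init__(self) -> None:
        self.docs = []
        self.postings = defaultdict(set)

    def add(self, doc: dict) -> int:
        doc_id = len(self.docs)
        self.docs.append(doc)
        for field, value in doc.items():
            self.postings[(field, value)].add(doc_id)
        return doc_id

    def search(self, **terms) -> list:
        """Return documents matching ALL field=value terms (a logical AND)."""
        if not terms:
            return list(self.docs)
        id_sets = [self.postings.get((f, v), set()) for f, v in terms.items()]
        matching = set.intersection(*id_sets)
        return [self.docs[i] for i in sorted(matching)]
```

The search never scans documents. It looks up one set of IDs per term and intersects them, which is why an AND query over two fields stays cheap as the number of stored logs grows.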
Structured Logging
The foundation of good centralized logging is structured logs. Instead of printing unstructured text:
python
print(f"User {user_id} paid ${amount}")
Use a structured format that the log shipper can parse:
python
import structlog
logger = structlog.get_logger()  # the logger must be created before use
logger.info("payment_processed", user_id=user_id, amount=amount)
Output (assuming structlog is configured to render JSON and to add the timestamp and service fields):
json
{"event": "payment_processed", "user_id": 123, "amount": 99.99, "timestamp": "...", "service": "payment"}
Every field is filterable and searchable.
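To show where fields such as timestamp and service can come from, here is a sketch that produces a similar JSON line using only the standard library's `logging` module instead of structlog. The names `JsonFormatter` and `build_logger`, the `severity` field, and the convention of passing structured fields through `extra={"fields": ...}` are assumptions for this example.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def __init__(self, service: str) -> None:
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "event": record.getMessage(),
            "severity": record.levelname.lower(),
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "service": self.service,
        }
        # Structured fields arrive via extra={"fields": {...}}.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)


def build_logger(service: str, stream) -> logging.Logger:
    logger = logging.getLogger(f"example.{service}")
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service))
    logger.addHandler(handler)
    return logger


# Usage: write to an in-memory stream so the output can be inspected.
stream = io.StringIO()
logger = build_logger("payment", stream)
logger.info("payment_processed", extra={"fields": {"user_id": 123, "amount": 99.99}})
parsed = json.loads(stream.getvalue())
```

In production the stream would be a file or stdout that the log shipper tails. The in-memory stream is used here only so the result can be parsed back and checked.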
Key Takeaways
- Centralized logging aggregates logs from all services into a single searchable system.
- The architecture: log shipper → buffer → indexer → storage → search.
- Structured logging (JSON) makes logs machine-parseable and searchable.
- Log levels (debug, info, warn, error) enable filtering by severity.
- Correlation IDs trace a request across multiple services in logs.
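The last takeaway can be illustrated with a short sketch: once every service writes the same `trace_id` into its structured logs, reconstructing a request is a matter of grouping and sorting. The function name `group_by_trace` is hypothetical, and the example assumes ISO 8601 timestamps in a single timezone.

```python
from collections import defaultdict


def group_by_trace(records):
    """Group log records by trace_id and order each group by timestamp."""
    traces = defaultdict(list)
    for record in records:
        trace_id = record.get("trace_id")
        if trace_id is not None:
            traces[trace_id].append(record)
    for entries in traces.values():
        # ISO 8601 timestamps in the same timezone sort correctly as strings.
        entries.sort(key=lambda r: r.get("timestamp", ""))
    return dict(traces)
```

In a centralized system the same result comes from a single search on the `trace_id` field, with no need to know in advance which services or servers handled the request.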