Dec 13, 2024

CQRS and Event Sourcing

CQRS

In a traditional distributed system, you have a set of clients interacting with a set of services and a backend data store. The interactions with the data store are generally referred to as CRUD operations (Create, Read, Update, Delete).

The backend services, which contain the business logic, define the data models required for the data store. Services use the same model for both reading from and writing to the data store.

Issues with the traditional approach

  • Traditionally, data models are designed to be write optimized. Reads then often require many joins and aggregations to get the data into the format we need, and these are resource-intensive operations that can cause performance issues.
  • In read-heavy applications, we have to scale the entire system to handle the read traffic, which can be expensive and inefficient.
  • With a single model, the data modification logic can get intertwined with the data retrieval logic. This makes the application difficult to maintain and scale, and the codebase hard to test and debug.

CQRS Pattern

The "command" in CQRS refers to operations that modify data, such as creating, updating, or deleting records. These operations are typically handled by a separate component, often called a "command handler."

The "query" in CQRS refers to operations that retrieve data, such as reading records from a database. These operations are typically handled by a separate component, often called a "query handler."

Notice how the write database and the read database are separated in the diagram above. This separation lets us scale read and write operations independently.
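To make the split concrete, here is a minimal sketch in Python. The command handler writes only to a write-optimized store and the query handler reads only from a denormalized read store; the names (CreateProduct, handle_command, handle_query) and the in-memory dicts are illustrative assumptions, not part of any specific framework.

```python
# A minimal CQRS sketch, assuming in-memory dicts stand in for the write and
# read databases. All names here are illustrative, not from a real framework.
from dataclasses import dataclass

write_db = {}   # normalized, write-optimized store
read_db = {}    # denormalized, read-optimized store (one "view" per query)

@dataclass
class CreateProduct:            # a command: an intent to change state
    product_id: str
    name: str
    price: float

def handle_command(cmd: CreateProduct) -> None:
    """Command handler: validates and writes to the write model only."""
    if cmd.price < 0:
        raise ValueError("price must be non-negative")
    write_db[cmd.product_id] = {"name": cmd.name, "price": cmd.price}
    # In a real system a sync mechanism (see below) would update the read model;
    # here we project it inline for simplicity.
    read_db[cmd.product_id] = {"display": f"{cmd.name} - ${cmd.price:.2f}"}

def handle_query(product_id: str) -> dict | None:
    """Query handler: reads only from the read model, never the write model."""
    return read_db.get(product_id)

handle_command(CreateProduct("p1", "Keyboard", 49.99))
print(handle_query("p1"))       # {'display': 'Keyboard - $49.99'}
```

In practice the write and read stores would be separate databases kept in sync by one of the mechanisms described later in this post.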

Benefits of CQRS

  1. Scalability: By separating the write and read operations, you can scale them independently. For example, you can scale the read operations by adding more read replicas or using a caching layer.

  2. Performance: By separating the write and read operations, you can optimize each operation for its specific needs. For example, you can use a write-optimized database for write operations and a read-optimized database for read operations.

  3. Flexibility: By separating the write and read operations, you can use different technologies for each operation. For example, you can use a relational database for write operations and a document-based database for read operations.

  4. Simplicity: By separating the write and read operations, each side of the codebase stays focused. Command-side code deals only with validation and state changes, while query-side code deals only with shaping data for its consumers.

Quick Comparison

CRUD vs. CQRS Architecture Comparison
| Aspect | Traditional CRUD | CQRS |
| --- | --- | --- |
| Database | Single shared database | Separate write/read databases |
| Schema | Compromise for reads + writes | Write: normalized, Read: denormalized |
| Performance | Reads block writes (and vice versa) | Independent scaling |
| Use case | Low-traffic apps | High-scale systems |
| Complexity | Simple but inflexible | More complex but adaptable |
| Data consistency | Strong consistency | Eventual consistency |
| Maintenance | Easier to understand | Requires sync mechanisms |

Sync mechanisms

The CQRS pattern requires a way to synchronize data between the write and read databases. There are several ways to achieve this synchronization:

  1. Change Data Capture (CDC)
  • Tracks database changes (inserts/updates/deletes) at the transaction log level.
  • Propagates these changes to read models (a sketch of applying a change event to a read model follows this list).
  • Tools: Debezium (Kafka Connect), SQL Server CDC, PostgreSQL Logical Decoding

Pros:

  • Near real-time sync
  • Low impact on write DB
  • Works with legacy systems

Cons:

  • Complex to set up
  • Database-specific
  2. Event-driven messaging
  • Command side publishes events (e.g., ProductPriceUpdated) to a message broker.
  • Query side subscribes and updates read models.
  • Tools: Kafka, RabbitMQ, Azure Event Hubs

Pros:

  • Decouples components
  • Scalable (multiple consumers)
  • Replayable events

Cons:

  • Requires event versioning
  • Eventual consistency lag
  3. Batch Synchronization
  • Periodic jobs (e.g., hourly/daily) rebuild read models from source data.
  4. Transactional Outbox Pattern
  • Write changes to an outbox table in the same transaction as the business data.
  • A separate process polls the outbox and publishes events (a sketch of the outbox pattern also follows this list).
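As a rough illustration of the CDC approach, here is a minimal sketch that applies a Debezium-style change event envelope (with op, before, and after fields) to an in-memory read model. The read_model dict and the apply_change function are assumptions for illustration, not part of Debezium itself.

```python
# A minimal sketch of projecting CDC change events onto a read model,
# assuming a Debezium-style payload with "op", "before" and "after" fields.
read_model = {}  # e.g. a denormalized "product summary" view keyed by id

def apply_change(event: dict) -> None:
    """Project one change event from the write DB's transaction log onto the read model."""
    op = event["op"]                       # "c" = insert, "u" = update, "d" = delete
    if op in ("c", "u", "r"):              # "r" = row read during an initial snapshot
        row = event["after"]
        read_model[row["id"]] = {"name": row["name"], "price": row["price"]}
    elif op == "d":
        read_model.pop(event["before"]["id"], None)

# Example: events as they might arrive from the broker, oldest first.
apply_change({"op": "c", "before": None,
              "after": {"id": 1, "name": "Keyboard", "price": 49.99}})
apply_change({"op": "u", "before": {"id": 1, "name": "Keyboard", "price": 49.99},
              "after": {"id": 1, "name": "Keyboard", "price": 39.99}})
print(read_model)   # {1: {'name': 'Keyboard', 'price': 39.99}}
```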
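And here is a minimal sketch of the transactional outbox pattern, using SQLite so that the business table and the outbox table are updated in one transaction. The relay_outbox function stands in for the separate process that polls the outbox and publishes to a broker such as Kafka or RabbitMQ; the table and function names are assumptions for illustration.

```python
# A minimal sketch of the transactional outbox pattern using SQLite.
# The orders/outbox tables and relay_outbox are illustrative, not a library API.
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, total REAL)")
db.execute("CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT, "
           "event_type TEXT, payload TEXT, published INTEGER DEFAULT 0)")

def place_order(order_id: str, total: float) -> None:
    """Write the business row and the outgoing event in the SAME transaction."""
    with db:  # commits both inserts together, or rolls both back on error
        db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total))
        db.execute("INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
                   ("OrderPlaced", json.dumps({"order_id": order_id, "total": total})))

def relay_outbox(publish) -> None:
    """Separate poller: publish unsent events to the broker, then mark them sent."""
    rows = db.execute("SELECT seq, event_type, payload FROM outbox "
                      "WHERE published = 0 ORDER BY seq").fetchall()
    for seq, event_type, payload in rows:
        publish(event_type, json.loads(payload))   # e.g. send to Kafka/RabbitMQ
        db.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    db.commit()

place_order("o-42", 99.50)
relay_outbox(lambda topic, event: print("published", topic, event))
```

Because the event is only recorded if the business write commits, the outbox avoids the classic "wrote to the database but failed to publish" inconsistency.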

Real-world examples

  • Uber: Uses Kafka for event streaming between write and read models.
  • Netflix: Combines CDC (for DB changes) with Kafka (for business events).
  • Banking Systems: Often use transactional outbox for auditability.

Trade-offs

| Scenario | Traditional CRUD | CQRS | Tradeoffs |
| --- | --- | --- | --- |
| Simple CRUD apps | Best fit (e.g., admin dashboards) | Overkill | CRUD: simpler code. CQRS: unnecessary complexity. |
| High-read workloads | Struggles (e.g., analytics) | Optimized (e.g., product catalogs) | CRUD: reads block writes. CQRS: independent read scaling. |
| Complex business logic | Mixed validation/reads | Isolates rules (e.g., orders) | CRUD: hard to maintain. CQRS: needs sync logic. |
| Audit/compliance needs | Manual logging | Built-in history (with ES) | CQRS+ES: automatic audit trail. CRUD: extra effort required. |
| Microservices | Shared DB coupling | Decouples via events | CRUD: hard to scale. CQRS: needs Kafka/RabbitMQ. |
| Real-time dashboards | Slow joins | Pre-computed views | CRUD: poor latency. CQRS: eventual consistency lag. |
| Legacy systems | Easy migration | High rewrite cost | CRUD: lower risk. CQRS: requires full redesign. |
| Team skill level | Junior-friendly | Needs DDD/ES expertise | CRUD: faster onboarding. CQRS: steep learning curve. |

Event Sourcing

The concept of event sourcing shifts the focus from storing the current state of the system (in the database) to recording the series of events that occurred over time to reach that state.

Here's how it works:

  1. Every change to the system is captured as an event. Events are immutable and can't be changed.
  2. Events are persisted in an event store. This store is append-only in nature.
  3. The current state of an entity is derived by replaying the sequence of events that have occurred for that entity.
  4. For entities with a long history of events, replaying all events every time the current state is needed can be inefficient. A common remedy is to periodically store a snapshot of the state, so only events after the latest snapshot have to be replayed.

Here's a sample flow:
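As a rough illustration, here is a minimal event-sourcing sketch in Python: an append-only event store for a bank account, with the current balance derived by replaying the account's events. The event names and the EventStore class are assumptions for illustration, not a specific event store product.

```python
# A minimal event-sourcing sketch. The event shapes and EventStore class are
# illustrative; real systems add versioning, snapshots and optimistic concurrency.
from dataclasses import dataclass, field

@dataclass(frozen=True)               # frozen -> events are immutable
class Event:
    entity_id: str
    type: str                         # e.g. "Deposited", "Withdrawn"
    amount: float

@dataclass
class EventStore:
    events: list[Event] = field(default_factory=list)   # append-only log

    def append(self, event: Event) -> None:
        self.events.append(event)     # past events are never updated or deleted

    def stream(self, entity_id: str) -> list[Event]:
        return [e for e in self.events if e.entity_id == entity_id]

def current_balance(store: EventStore, account_id: str) -> float:
    """Derive the current state by replaying the account's events in order."""
    balance = 0.0
    for event in store.stream(account_id):
        if event.type == "Deposited":
            balance += event.amount
        elif event.type == "Withdrawn":
            balance -= event.amount
    return balance

store = EventStore()
store.append(Event("acct-1", "Deposited", 100.0))
store.append(Event("acct-1", "Withdrawn", 30.0))
print(current_balance(store, "acct-1"))   # 70.0
```

Note that the log itself is never modified; correcting a mistake means appending a compensating event rather than editing history.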

Why use event sourcing?

  1. Auditability: The event store provides a complete history of all changes made to the system, making it easier to track and audit changes.
  2. Immutability: Events are immutable; once an event is recorded, it cannot be changed or deleted. This makes it easier to maintain a consistent and accurate history of changes.
  3. Scalability: Writes are simple appends to the log, which are cheap and easy to partition by entity, so the system can absorb large volumes of data and transactions.
  4. Fault tolerance: Because the event log is the source of truth, state and read models can be rebuilt by replaying events after a failure, helping the system recover and continue to operate correctly.