Saga Pattern in Microservices: An In-depth Overview #171
Let's take the Travel Booking System scenario for an in-depth breakdown and implementation.
System Architecture
Microservices: FlightBookingService, HotelBookingService, CarRentalService, and BookingOrchestratorService.
Implementation Steps:
Step 1: Set Up Spring Boot Microservices
Use Spring Initializr or your IDE to create a Spring Boot application for each of the services.
Step 2: Configure Data Source and JPA
This step is pretty standard for each service. Since we're using H2, each service would have configuration like this in its application.properties:
    spring.datasource.url=jdbc:h2:mem:flightdb
    spring.datasource.driver-class-name=org.h2.Driver
    spring.jpa.hibernate.ddl-auto=update
(The datasource URL differs per service.)
Step 3: Design Entities
Step 4: Design Repositories
For each entity, create a repository interface:
    import java.util.Optional;
    import org.springframework.data.repository.CrudRepository;

    public interface FlightBookingRepository extends CrudRepository<FlightBooking, Long> {
        // Derived query: finds the booking that belongs to a given user
        Optional<FlightBooking> findByUserId(String userId);
    }
Step 5: Implement Business Logic
This is where it gets interesting. The main business logic of each service is to manage its respective bookings. However, if one service fails to book, it should trigger a compensating transaction to "undo" the bookings made by the previous services.
Example: If CarRentalService fails, it should notify HotelBookingService and FlightBookingService to cancel their respective bookings.
Step 6: Introduce Messaging for Inter-service Communication
We use Spring Cloud Stream with RabbitMQ/Kafka. Here's a brief setup:
Example for HotelBookingService:
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.cloud.stream.annotation.EnableBinding;
    import org.springframework.cloud.stream.annotation.Input;
    import org.springframework.cloud.stream.annotation.StreamListener;
    import org.springframework.messaging.SubscribableChannel;

    // Note: this annotation-based model was deprecated in Spring Cloud Stream 3.1
    // and removed in 4.0; newer versions use functional (java.util.function) bindings.
    public interface HotelBookingChannels {
        @Input("cancelHotelBookingChannel")
        SubscribableChannel cancelHotelBooking();
    }

    @EnableBinding(HotelBookingChannels.class)
    public class HotelBookingListener {
        @Autowired
        private HotelBookingRepository repository;

        @StreamListener("cancelHotelBookingChannel")
        public void listenForCancelBooking(String userId) {
            // Logic to cancel booking
        }
    }
Step 7: Implement the Orchestrator
BookingOrchestratorService coordinates the steps of the saga. It calls each service in sequence, checks the response, and decides whether to proceed or to start compensating transactions.
Step 8: Handle Failures and Compensation
Whenever a service detects a failure, it sends a message to the RabbitMQ/Kafka topic that the other services are listening to. Those services then start their compensating transactions based on the message.
Conclusion:
The Saga pattern introduces a different way of thinking about system reliability. Instead of relying on a centralized transaction manager to keep everything in sync, each service in a saga is responsible for its local transaction and for helping keep the system as a whole consistent. This requires a shift in mindset and can be more complex to implement, but it allows long-running, flexible, and collaborative transactions.
@akash-coded : https://github.com/arghyagiri/microservice-e2/blob/main/e-commerce-microservices-event-driven-saga/Read.md
Hi @akash-coded
Saga Pattern in Microservices: An In-depth Overview
What is the Saga Pattern?
When you shift from a monolithic to a microservices architecture, one of the first challenges you face is maintaining data consistency across services. In a monolithic system, you'd simply use a transaction. In microservices, due to the distributed nature, this isn't straightforward. That's where the saga pattern comes in.
A saga is a sequence of local transactions. Each transaction updates data within a single service and publishes an event to trigger the next transaction in the saga. If one transaction fails, the saga executes compensating transactions to undo the impact of the preceding transactions.
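To make the definition concrete, here is a minimal, framework-free sketch of a saga as an ordered list of steps, each paired with a compensating action. The class names `SagaStep` and `Saga` are hypothetical and exist only for illustration; a real saga would run its steps in different services and persist its progress.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** One saga step: a local transaction plus the action that undoes it. */
final class SagaStep {
    final String name;
    final Runnable action;        // the local transaction
    final Runnable compensation;  // undoes the action if a later step fails

    SagaStep(String name, Runnable action, Runnable compensation) {
        this.name = name;
        this.action = action;
        this.compensation = compensation;
    }
}

/** Runs steps in order; on failure, compensates completed steps in reverse. */
final class Saga {
    private final List<SagaStep> steps = new ArrayList<>();

    Saga step(String name, Runnable action, Runnable compensation) {
        steps.add(new SagaStep(name, action, compensation));
        return this;
    }

    /** Returns true if every step succeeded, false if the saga was compensated. */
    boolean run() {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.action.run();
                completed.push(step); // most recently completed step is on top
            } catch (RuntimeException e) {
                // The failed step did not complete, so only earlier steps are undone.
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }
}
```

Compensations run in reverse order of completion, so the most recent change is undone first. The step that failed is not compensated, on the assumption that its own local transaction rolled back.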
Why Use the Saga Pattern?
Maintaining Data Consistency: Microservices should be loosely coupled and have their own databases. Saga helps maintain consistency across these databases.
Long-Running Transactions: Some operations are inherently long-running. You don't want to lock resources for the entire duration. Saga offers a solution.
Types of Saga Patterns:
Choreography: Every local transaction publishes domain events. Other local transactions consume those events and execute, potentially emitting new events.
Orchestration: One service (the coordinator) is in charge of directing the saga. It tells other services to execute local transactions.
Advantages:
Resilience: Failure in one service doesn't bring down other services.
Loose Coupling: Services act based on events, not direct requests from other services.
Scalability: Asynchronous nature of sagas allows for better scalability.
Disadvantages:
Complexity: Implementing sagas can be complex due to compensating transactions and coordination.
Debugging and Tracing: A saga's asynchronous and potentially distributed nature can make debugging more challenging.
Alternatives:
Implementing Saga in Spring Boot:
For this, we'll use an example of an e-commerce platform where an order placement involves: checking stock, deducting stock, charging payment, and notifying the user.
1. Service Setup:
Imagine four services:
OrderService, InventoryService, PaymentService, and NotificationService.
2. Saga Flow:
OrderService receives a request to place an order.
OrderService checks with InventoryService if the stock is available.
InventoryService deducts the stock and emits a StockDeductedEvent.
PaymentService listens to StockDeductedEvent, charges the payment, and emits a PaymentChargedEvent.
NotificationService listens to PaymentChargedEvent and notifies the user.
3. Implementing with Spring Boot and Kafka:
Compensating Transactions:
For any failure, compensating actions must be triggered. For instance, if the payment fails, the stock should be reverted.
Best Practices:
Idempotency: Ensure operations can be retried without adverse effects.
Logging and Monitoring: Given the complexity of sagas, strong logging and monitoring are crucial.
Event Versioning: If the structure of an event changes, it can break services. Version your events to handle changes gracefully.
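As one possible illustration of the idempotency practice, the sketch below deduplicates by event ID so that a redelivered event is applied only once. `IdempotentStockHandler`, its method names, and the idea of a string event ID are assumptions made for this example; in a real service the set of processed IDs would typically be stored in the same database transaction as the stock update.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical inventory handler that ignores redelivered events. */
final class IdempotentStockHandler {
    private final Map<String, Integer> stock = new HashMap<>();
    private final Set<String> processedEventIds = new HashSet<>();

    IdempotentStockHandler(String sku, int quantity) {
        stock.put(sku, quantity);
    }

    /**
     * Deducts stock once per event ID. Returns true if the event was applied,
     * false if it was a duplicate delivery and therefore skipped.
     */
    synchronized boolean onOrderPlaced(String eventId, String sku, int quantity) {
        if (!processedEventIds.add(eventId)) {
            return false; // already handled: a retry is a no-op
        }
        stock.merge(sku, -quantity, Integer::sum);
        return true;
    }

    synchronized int available(String sku) {
        return stock.getOrDefault(sku, 0);
    }
}
```

With this guard in place, a broker that delivers at least once can safely redeliver the same event without deducting stock twice.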
Summary:
The saga pattern addresses data consistency across microservices but comes with its own challenges. Properly implemented, it can bring resilience and scalability to a distributed system.
Let's further explore the practical intricacies of the Saga Pattern, particularly when diving into a Spring Boot application.
Transaction Boundaries in Sagas:
The Saga pattern challenges the traditional way we think about transactions, especially in a microservice architecture. Each step in a saga performs its own local transaction, which then triggers the next step. If any step fails, compensating transactions in previously successful steps can be initiated.
Saga Coordination Types in Spring:
Choreography:
Orchestration:
Saga Implementations with Spring:
Choreography-based Saga with Kafka:
Kafka’s topic and partitioning mechanism naturally fits with domain events.
Orchestration-based Saga with RESTful API calls:
Use RestTemplate or Feign clients to make synchronous HTTP calls.
Compensating Transactions:
In a createOrder saga, if payment processing fails, a compensating transaction can cancel the order.
Saga Persistence:
Event Sourcing:
Command Model:
Saga Execution and Query Consistency:
Eventual Consistency:
Commands and Queries:
Saga Rollback Mechanism:
Best Practices:
Let's delve deeper into the orchestration approach of the saga pattern and then illustrate it with an example for our e-commerce platform.
Saga Pattern - Orchestration Approach:
In the orchestration style, one service (often referred to as the 'saga orchestrator' or 'coordinator') takes charge of the saga's progress. It knows the order of steps, which service to call, and what to do in case of failures.
Pros:
Cons:
Orchestration in Action:
Considering our e-commerce platform, let's use an orchestration approach:
OrderService acts as the saga orchestrator.
A customer places an order, and the saga begins.
OrderService asks InventoryService to check and reserve stock. If the stock is unavailable, OrderService responds to the customer about the stock unavailability and the saga ends.
If the stock is reserved successfully, OrderService then asks PaymentService to charge the customer. If the payment fails, OrderService tells InventoryService to release the reserved stock. The customer is informed of the payment failure, and the saga ends.
With payment successful, OrderService instructs NotificationService to notify the customer of the order's success.
This entire process is orchestrated by the OrderService. Each step's outcome determines the next service to call.
Implementing Orchestration with Spring Boot:
Here, we will use Spring WebFlux to make asynchronous, non-blocking calls to services.
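The orchestration flow described above can be sketched without any framework as follows. This is not a WebFlux listing: it substitutes the JDK's `CompletableFuture` for WebFlux's reactive types so that it runs on its own, and the client interfaces (`InventoryClient`, `PaymentClient`, `NotificationClient`) and the status strings are hypothetical stand-ins for real HTTP clients and order states.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical async clients; in a real service these would wrap HTTP calls. */
interface InventoryClient {
    CompletableFuture<Boolean> reserve(String orderId);
    CompletableFuture<Void> release(String orderId);
}

interface PaymentClient {
    CompletableFuture<Boolean> charge(String orderId);
}

interface NotificationClient {
    CompletableFuture<Void> notifyCustomer(String orderId, String message);
}

/** OrderService acting as the saga orchestrator. */
final class OrderSagaOrchestrator {
    private final InventoryClient inventory;
    private final PaymentClient payment;
    private final NotificationClient notifications;

    OrderSagaOrchestrator(InventoryClient inventory, PaymentClient payment,
                          NotificationClient notifications) {
        this.inventory = inventory;
        this.payment = payment;
        this.notifications = notifications;
    }

    /** Completes with CONFIRMED, OUT_OF_STOCK or PAYMENT_FAILED. */
    CompletableFuture<String> placeOrder(String orderId) {
        return inventory.reserve(orderId).thenCompose(reserved -> {
            if (!reserved) {
                // Nothing has been changed yet, so there is nothing to compensate.
                return CompletableFuture.completedFuture("OUT_OF_STOCK");
            }
            return payment.charge(orderId).thenCompose(charged -> {
                if (!charged) {
                    // Compensate: release the stock reserved in the previous step.
                    return inventory.release(orderId).thenApply(v -> "PAYMENT_FAILED");
                }
                return notifications.notifyCustomer(orderId, "Order confirmed")
                        .thenApply(v -> "CONFIRMED");
            });
        });
    }
}
```

Each stage is chained with `thenCompose`, so no thread blocks while waiting for a downstream service. The sketch treats a `false` result as a business failure and leaves exceptions and timeouts unhandled.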
Note: This is a simplistic representation. In real-world scenarios, you'd handle timeouts, fallbacks, and potential failures more gracefully.
Summary:
The saga pattern, when orchestrated, can provide a clear view of the business logic. However, it comes with its own set of challenges, primarily the potential bottleneck and single point of failure. Proper design, asynchronous calls, and fallback mechanisms can help mitigate these challenges and craft a robust microservices environment.
Let's dive deeper into the Choreography approach of the saga pattern, how it differs from Orchestration, and then illustrate it with an example in the context of our e-commerce platform.
Saga Pattern - Choreography Approach:
In the choreography style, there isn't a central orchestrator that tells participants what to do. Instead, each service involved in the saga knows about its own local transaction and knows when to execute it. It also knows which event it needs to publish when it has finished its local transaction.
Pros:
Cons:
Choreography in Action:
Taking our e-commerce scenario:
A customer places an order in OrderService.
OrderService saves the order and emits an OrderPlacedEvent.
InventoryService listens for OrderPlacedEvent, and once it catches this event, it checks and reserves the stock. If stock is reserved successfully, it emits a StockReservedEvent. If the stock is unavailable, it emits a StockUnavailableEvent.
PaymentService listens for StockReservedEvent. Once it catches this event, it charges the customer. If the charge succeeds, it emits a PaymentSuccessEvent. If the charge fails, it emits a PaymentFailedEvent.
OrderService listens for PaymentSuccessEvent and PaymentFailedEvent to update the order status accordingly.
NotificationService listens to various events to notify the customer at different stages.
Implementing Choreography with Spring Boot:
For this, we'll use Spring Cloud Stream – a framework for building message-driven microservices.
For each service, you'd include the following dependency:
Using OrderService as an example:
In InventoryService:
In application.properties:
This setup allows services to listen and react to events. When an event occurs in one service, other services that are subscribed to that event will automatically pick it up and process it accordingly.
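The event flow from "Choreography in Action" can be sketched with a small in-memory event bus standing in for the message broker. This is an illustration only, not Spring Cloud Stream code: `EventBus` and `ChoreographyDemo` are hypothetical names, the order status values are assumed, and `StockReleasedEvent` is an assumed event added to show the compensation when payment fails.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Minimal in-memory stand-in for a message broker (event type -> subscribers). */
final class EventBus {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();
    final List<String> published = new ArrayList<>(); // event log, for inspection

    void subscribe(String eventType, Consumer<String> handler) {
        subscribers.computeIfAbsent(eventType, k -> new ArrayList<>()).add(handler);
    }

    void publish(String eventType, String orderId) {
        published.add(eventType);
        for (Consumer<String> handler : subscribers.getOrDefault(eventType, new ArrayList<>())) {
            handler.accept(orderId);
        }
    }
}

/** Wires the services together purely through events; there is no coordinator. */
final class ChoreographyDemo {
    static EventBus wire(boolean stockAvailable, boolean paymentSucceeds,
                         Map<String, String> orderStatus) {
        EventBus bus = new EventBus();

        // InventoryService: reacts to new orders, releases stock if payment fails.
        bus.subscribe("OrderPlacedEvent", id ->
                bus.publish(stockAvailable ? "StockReservedEvent" : "StockUnavailableEvent", id));
        bus.subscribe("PaymentFailedEvent", id -> bus.publish("StockReleasedEvent", id));

        // PaymentService: charges the customer once stock is reserved.
        bus.subscribe("StockReservedEvent", id ->
                bus.publish(paymentSucceeds ? "PaymentSuccessEvent" : "PaymentFailedEvent", id));

        // OrderService: updates the order status from the outcome events.
        bus.subscribe("PaymentSuccessEvent", id -> orderStatus.put(id, "CONFIRMED"));
        bus.subscribe("PaymentFailedEvent", id -> orderStatus.put(id, "CANCELLED"));
        bus.subscribe("StockUnavailableEvent", id -> orderStatus.put(id, "CANCELLED"));
        return bus;
    }
}
```

Publishing an `OrderPlacedEvent` on the returned bus drives the whole saga. No service calls another directly, which is the decoupling that choreography aims for, at the cost of the flow being spread across several handlers.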
Summary:
The choreographed saga pattern can seem more complex initially due to its distributed nature, but it offers more flexibility and scalability. It fits nicely in scenarios where you have a lot of services and you want to avoid making any one of them a potential bottleneck or single point of failure. The key is to carefully design your events and make sure each service knows how to handle them correctly.
We'll progress by examining the challenges with choreography, and subsequently, we'll explore advanced features and concepts associated with sagas and Spring Cloud Stream.
Challenges with Choreography in Sagas:
Event Propagation Delays: In a distributed system, events might not be instantaneous. An event from one service might be delayed due to network issues, causing another service to potentially make decisions on stale data.
Event Versioning: As your system evolves, so does the structure of your events. Managing different versions of events and ensuring backward compatibility can be challenging.
Event Tracking: Since there's no central orchestrator, monitoring which services have handled which events can become challenging.
Event Reliability: Ensuring that events are delivered at least once, and preferably only once, is a significant challenge.
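For the event versioning challenge, one common approach is to carry a version number in each event and have consumers accept every version still in circulation. The sketch below assumes two made-up payload shapes for a payment event (a decimal `amount` in version 1, an integer `amountCents` in version 2); the field names and the `PaymentChargedEventReader` class are hypothetical.

```java
import java.util.Map;

/** Hypothetical consumer-side reader that accepts two versions of one event. */
final class PaymentChargedEventReader {
    /**
     * Returns the charged amount in cents.
     * v1 (assumed shape): {"version": 1, "amount": 12.50}
     * v2 (assumed shape): {"version": 2, "amountCents": 1250}
     */
    static long amountInCents(Map<String, Object> event) {
        // Events published before versioning was introduced carry no version field.
        int version = ((Number) event.getOrDefault("version", 1)).intValue();
        switch (version) {
            case 1:
                return Math.round(((Number) event.get("amount")).doubleValue() * 100);
            case 2:
                return ((Number) event.get("amountCents")).longValue();
            default:
                // Fail loudly rather than guess at an unknown structure.
                throw new IllegalArgumentException("Unsupported event version: " + version);
        }
    }
}
```

Keeping the old branch until no producer emits version 1 lets producers and consumers be upgraded independently.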
Advanced Features with Spring Cloud Stream:
Consumer Groups:
Suppose you run two instances of PaymentService. Without consumer groups, both instances would get all the messages, leading to redundancy. With consumer groups, you can ensure each message is processed by only one instance.
Stateful Processing:
Interactive Queries:
Choreography with Multiple Brokers:
It's also possible to utilize multiple brokers for different parts of the system. For instance, you might use Kafka for long-lived, durable events, while using RabbitMQ for real-time, short-lived notifications.
Saga with Choreography: Compound Scenario:
Imagine a user trying to book a holiday package.
BookingService (Central point to start the booking):
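BookingService emits a HolidayBookingStartedEvent.
FlightService:
Listens for HolidayBookingStartedEvent.
If the flight is booked, emits a FlightBookingConfirmedEvent.
If the booking fails, emits a FlightBookingFailedEvent.
HotelService:
Listens for FlightBookingConfirmedEvent.
If the hotel is booked, emits a HotelBookingConfirmedEvent.
If the booking fails, emits a HotelBookingFailedEvent.
PaymentService:
Listens for HotelBookingConfirmedEvent.
If the payment succeeds, emits a PaymentSuccessEvent.
If the payment fails, emits a PaymentFailedEvent, causing both FlightService and HotelService to roll back.
NotificationService: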
This approach ensures that each service reacts to events and takes appropriate actions. The benefits include high decoupling and scalability. The challenge is ensuring reliability and managing failures gracefully.
Conclusion:
Choreography-based sagas, especially in Spring Boot microservices architecture, can provide scalability and flexibility benefits. However, understanding and designing the event flow is crucial. Employing Spring Cloud Stream features can further enhance and refine the way events are processed, making the system robust and efficient. It's always vital to choose between orchestration and choreography based on the nature and requirements of the application.
The saga pattern inherently requires mechanisms to handle failures since distributed transactions are not atomic. If one step fails, we can't just roll back like in traditional ACID transactions. Instead, we use compensating transactions.
Saga Rollback Mechanism:
When we speak of a "rollback" in the context of sagas, it doesn't mean the same thing as a traditional database rollback. Instead, it means performing a compensating action that effectively undoes a previous action.
Compensating Transactions:
These are the heart of the saga rollback mechanism. For every transaction that modifies the system's state, there must be a corresponding compensating transaction that can reverse that modification.
For instance:
Scenario 1: E-commerce Order Failure
Imagine an e-commerce platform that has a three-step saga for processing orders:
If the debit operation fails (maybe due to insufficient funds), we can't ship the product, but we've already reserved it in step 1. In this scenario, the compensating action for the reservation would trigger, releasing the product back to the inventory.
Scenario 2: Travel Booking System
Consider a travel booking system where a user wants to book a flight, hotel, and rent a car:
Suppose the car rental service is down. In the saga pattern, the system will then execute compensating transactions: cancel the hotel room and flight booking. Thus, even if part of the operation fails, the user isn't left with partial reservations.
Challenges and Considerations:
Idempotence: Your compensating transactions need to be idempotent. Executing them multiple times shouldn't have a different effect than executing them once.
Network Failures: What if the service responsible for the compensating transaction is down or unreachable? You might need to implement a retry mechanism or a manual intervention procedure.
Data Consistency: While the saga pattern aims to maintain data consistency across services, there's a period where data is inconsistent until the saga completes or the compensating transactions execute.
Complexity: With sagas, you are trading off simplicity for scalability and flexibility. Instead of a single ACID transaction, you now have multiple transactions, each with their own compensating transactions.
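The retry-then-escalate idea from the "Network Failures" point could look roughly like the sketch below. `CompensationRetrier` and its in-memory `manualReviewQueue` are hypothetical; a real system would more likely use a persistent queue or dead-letter topic, and the compensation passed in is assumed to be idempotent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Retries a compensating action and parks it for manual review if it keeps failing. */
final class CompensationRetrier {
    final List<String> manualReviewQueue = new ArrayList<>();
    private final int maxAttempts;
    private final long initialBackoffMillis;

    CompensationRetrier(int maxAttempts, long initialBackoffMillis) {
        this.maxAttempts = maxAttempts;
        this.initialBackoffMillis = initialBackoffMillis;
    }

    /**
     * Runs the compensation until it reports success or attempts run out.
     * Returns true on success, false if the work was escalated.
     */
    boolean compensate(String description, BooleanSupplier compensation) {
        long backoff = initialBackoffMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            boolean succeeded;
            try {
                succeeded = compensation.getAsBoolean();
            } catch (RuntimeException e) {
                succeeded = false; // e.g. service unreachable: count as a failed attempt
            }
            if (succeeded) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoff);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    break;                              // stop retrying and escalate
                }
                backoff *= 2; // exponential backoff between attempts
            }
        }
        manualReviewQueue.add(description); // hand over to a human operator
        return false;
    }
}
```

Bounding the number of attempts keeps a permanently failing compensation from retrying forever, while the review queue makes sure the inconsistency is not silently dropped.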
Implementation using Spring Boot:
Spring doesn't offer out-of-the-box support for sagas, but you can combine several of its features to implement them:
Event Sourcing: Use Spring Kafka or RabbitMQ to produce events that represent state changes. If a failure occurs, produce compensating events.
Spring Cloud Stream: This can be used to handle the communication between microservices in an event-driven manner.
Database Per Service: Maintain data consistency within each service's boundary using traditional transactions. The overall consistency is maintained through sagas.
Orchestrator vs. Choreography: You can implement sagas using an orchestrating service that guides other services through the saga steps, or you can use a choreographed approach where every service knows its next step and possible compensations.
The heart of the Saga pattern lies in its compensating transactions. Each compensating transaction is essentially the 'undo' operation for a preceding successful operation. To explore this in-depth, let's start with some simpler scenarios and then build up to more complex, intertwined use cases.
1. Basic Scenario: E-commerce Order Placement
Services: InventoryService, OrderService, PaymentService, NotificationService
Flow:
OrderService creates an order.
PaymentService charges the user.
InventoryService reduces the item count.
NotificationService sends an order confirmation to the user.
Compensating Transactions:
If the payment fails, cancel the order (OrderService).
If the inventory update fails, refund the payment (PaymentService) and cancel the order.
Deep Dive into Rollback Mechanism:
Let's focus on the InventoryService failing. What happens behind the scenes?
a. InventoryService fails:
b. PaymentService listens to this:
It receives the InventoryFailureEvent.
c. OrderService listens to PaymentRefundedEvent:
The above actions might appear straightforward, but remember, each of those steps could fail as well. What if PaymentService couldn't process the refund? You need mechanisms to retry, notify admins, or even manually intervene.
2. Compound Scenario: Interdependent Services in a Financial App
Services: AccountService, TransferService, AuditService, NotificationService
Flow:
TransferService initiates the transfer.
AccountService debits the source account and credits the target account.
AuditService logs the transfer.
NotificationService sends notifications to both sender and receiver.
Compensating Transactions:
Deep Dive into the Compound Rollback Mechanism:
a. AuditService fails:
It informs TransferService about the failure.
TransferService then sends commands to AccountService to undo both credit and debit actions.
Both parties are notified that the transfer failed (NotificationService).
b. Complex Rollbacks:
In systems where operations aren’t merely opposites of one another, compensating transactions might involve complex logic. For instance, if an e-commerce price discount applied after a certain operation gets rolled back, compensating might mean recalculating the whole cart.
c. Manual Interventions:
Sometimes, it might be more logical to manually correct a failure, especially when human judgment is needed.
Compensating Transactions in Distributed Databases
While we've mostly discussed compensating transactions in terms of business logic, the same principles can apply at the database level. Distributed databases, which might span across regions or data centers, can use sagas to ensure consistency across nodes.
In case of a node failure or network partition, compensating transactions can ensure data remains consistent across all active nodes. This is especially prevalent in databases that value availability over immediate consistency (AP systems in the CAP theorem).
Summary:
In the saga pattern, events play a significant role, not only in driving forward the business process but also in driving backward (rolling back) when a compensating action is needed. Implementing Saga requires careful design, considering all possible failure scenarios, and ensuring that the system can either recover automatically or provide means for manual recovery. Proper logging, monitoring, and alerting become more crucial than ever.
Also, while REST or RPC can be used for synchronous inter-service communication, the async nature of event-driven mechanisms (like Kafka, RabbitMQ) aligns better with the Saga pattern, especially with its compensatory actions.
Lastly, while Sagas help maintain data consistency across services, they introduce eventual consistency into the system. This is a trade-off: you get autonomy and isolated failures, but it comes at the expense of the immediate consistency that a monolithic system might offer.