Event driven HATEOAS for ALT Service Mesh
Today, with the proliferation of event-driven and reactive architectures, we are seeing true decoupling of applications.
With open-source distributed event streaming platforms like Kafka, the contractual obligations for applications are changing from dependencies on APIs to the flexibility of processing messages at the consumer end of a Kafka pipeline.
If you are already familiar with the Service Mesh pattern and the HATEOAS principle, you can skip the next section.
If I were to oversimplify Hypermedia As The Engine Of Application State (HATEOAS), I would say this: a typical RESTful GET API call returns data, while a HATEOAS GET call also returns links to forwarding actions inside that data. As a simplistic example, suppose you have an accounting app that lists all the accounts you own across different banks, along with some details of each. This is what the GET call for accounts looks like
Now let’s say you want to enhance this banking app to allow withdrawal from any bank. How could that be possible when the different banking apps could be hosted at different URLs? With the HATEOAS principle, the GET result carries a list of links, which are forwarding URLs to the actions a client can take. Enhancing the above JSON to adhere to HATEOAS, we come up with
Below the accounts section you now see a links section that has URLs to actions one can perform on the accounts. Of course, this is not a practical application of the HATEOAS principle, as it is insecure to do so. But more on that later. Let’s talk about the Service Mesh pattern next.
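The sketch below is an illustration rather than the original payload: it builds a plain accounts response and a HATEOAS-style one in Python. The bank names, account IDs, field names, and example.com URLs are all assumptions made up for this example.

```python
import json

# Hypothetical account data; names, IDs, and balances are made up for illustration.
ACCOUNTS = [
    {"accountId": "1001", "bank": "bank1", "balance": 2500.00},
    {"accountId": "2001", "bank": "bank2", "balance": 910.50},
]

# Hypothetical root URLs at which each bank's application is hosted.
BANK_ROOTS = {
    "bank1": "https://bank1.example.com/api",
    "bank2": "https://bank2.example.com/api",
}


def plain_response(accounts):
    """A conventional GET response: data only."""
    return {"accounts": accounts}


def hateoas_response(accounts):
    """The same data plus a links section naming the actions a client can take next."""
    links = [
        {
            "rel": "withdraw",
            "accountId": acct["accountId"],
            "method": "POST",
            "href": f"{BANK_ROOTS[acct['bank']]}/accounts/{acct['accountId']}/withdraw",
        }
        for acct in accounts
    ]
    return {"accounts": accounts, "links": links}


if __name__ == "__main__":
    print(json.dumps(hateoas_response(ACCOUNTS), indent=2))
```

The client never hard-codes where Bank 2 lives; it follows whatever `href` the response hands back for the account it cares about.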
The service mesh is typically implemented as a scalable set of network proxies deployed alongside application code (a pattern sometimes called a sidecar). These proxies handle the communication between the microservices and also act as a point at which the service mesh features can be introduced. The proxies comprise the service mesh’s data plane, and are controlled as a whole by its control plane.
The rise of the service mesh is tied to the rise of the “cloud native” application. In the cloud native world, an application might consist of hundreds of services; each service might have thousands of instances; and each of those instances might be in a constantly-changing state as they are dynamically scheduled by an orchestrator like Kubernetes. Not only is service-to-service communication in this world incredibly complex, it’s a fundamental part of the application’s runtime behavior. Managing it is vital to ensuring end-to-end performance, reliability, and security.
Now that we understand a little about the HATEOAS principle and the ‘sidecar’ pattern, let’s talk about Kafka.
“Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.”
It’s a scalable, fault-tolerant, publish-subscribe messaging system that enables you to build distributed applications and powers web-scale Internet companies such as LinkedIn, Twitter, AirBnB, and many others.
Our discussion about Kafka in this article pertains to Event driven architectures. What are those?
Event driven architecture
Event-driven architecture is a software architecture and model for application design. With an event-driven system, the capture, communication, processing, and persistence of events are the core structure of the solution. This differs from a traditional request-driven model.
Many modern application designs are event-driven, such as customer engagement frameworks that must utilize customer data in real time. Event-driven apps can be created in any programming language because event-driven is a programming approach, not a language. Event-driven architecture enables minimal coupling, which makes it a good option for modern, distributed application architectures.
An event-driven architecture is loosely coupled because event producers don’t know which event consumers are listening for an event, and the event doesn’t know what the consequences are of its occurrence.
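To make the loose coupling concrete, here is a minimal in-memory sketch of publish/subscribe. It is not Kafka and uses no Kafka client; the `EventBus` class, the topic name, and the event fields are assumptions invented for this illustration.

```python
from collections import defaultdict


class EventBus:
    """In-memory stand-in for a broker: routes events by topic to whoever subscribed."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # The producer calls publish() without knowing who, if anyone, is listening.
        handlers = self._subscribers[topic]
        for handler in handlers:
            handler(event)
        return len(handlers)


bus = EventBus()
audit_log, notifications = [], []

# Two independent consumers register interest in the same (hypothetical) topic.
bus.subscribe("account.withdrawal", audit_log.append)
bus.subscribe("account.withdrawal", lambda e: notifications.append(f"Withdrew {e['amount']}"))

delivered = bus.publish("account.withdrawal", {"accountId": "1001", "amount": 50})
```

Adding a third consumer means one more `subscribe` call; the producer code does not change.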
I am not going to go into too much detail about HATEOAS, Service Mesh, Kafka, or event-driven architectures. You can find tremendous resources online on these topics, and those would be far more helpful than anything I might post here about them.
Instead, I want you, the reader, to educate yourself about these concepts (or at least get a primer) so that the next bit seems intuitive and informative to you.
Let us finally get to the CRUX of this article.
Event driven HATEOAS for ALT Service Mesh
With an event-driven architecture, you can achieve true decoupling of applications. With pub/sub, applications listen for events and establish contracts based on those events. Producers produce messages in a format that consumers can understand, and consumers can be written in any language or platform that is Kafka compliant.
With Kafka and HATEOAS, we are talking about Service Mesh-like capability for applications that are still not completely cloud native but do adhere to REST.
The intent is to publish new services as links within Kafka messages. These messages could contain:
- Forwarding addresses to new endpoints that expose new capability within applications.
- A service-discovery-like interface to list services for an API and discover new ones.
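As a rough sketch of what such a message might carry, the snippet below models a service announcement and serializes it to the bytes that would travel as a message value. The `ServiceAnnouncement` class, its field names, and the URLs are assumptions for illustration; the article does not prescribe a schema, and no Kafka client is involved here.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ServiceAnnouncement:
    app_key: str    # hypothetical identifier for the publishing application
    root_url: str   # root URL at which the application is reachable
    actions: dict = field(default_factory=dict)  # action name -> path appended to root_url

    def to_message(self) -> bytes:
        """Serialize to the bytes that would be sent as the message value."""
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    @classmethod
    def from_message(cls, raw: bytes) -> "ServiceAnnouncement":
        """Rebuild the announcement on the consumer side."""
        return cls(**json.loads(raw.decode("utf-8")))

    def links(self) -> list:
        """HATEOAS-style links derived from the announcement, sorted by action name."""
        root = self.root_url.rstrip("/")
        return [{"rel": name, "href": root + path}
                for name, path in sorted(self.actions.items())]


announcement = ServiceAnnouncement(
    app_key="bank1",
    root_url="https://bank1.example.com/api/",
    actions={"withdraw": "/accounts/withdraw", "balance": "/accounts/balance"},
)
received = ServiceAnnouncement.from_message(announcement.to_message())
```

Here `links()` covers the first bullet (forwarding addresses), and the `actions` mapping covers the second (a listing of what an application offers).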
I know all this must sound fairly confusing. Don’t worry, I will explain with examples.
Let’s look at an example:
In the diagram, we see a backend service for Bank 1 and a backend service for Bank 2. Both expose:
- Account balance check
From the bank’s point of view:
In step 1, Bank 1 and Bank 2 register their service endpoints with the Orchestrator, which acts as a service directory for these disparate banking applications. The messages are pushed to the Orchestrator using a Pub/Sub model so as to completely decouple the Orchestrator from the banking apps. This also makes it easy to onboard a new banking app seamlessly.
The APIs are stored as actions on the Orchestrator side. Each app also provides its load-balanced root URL, so the Orchestrator holds both the API actions and the URL root they belong to.
The Orchestrator is an application that maintains all these services in a persistence layer. It isn’t playing the role of a load balancer or gateway; rather, it maintains a record of gateway URLs (load-balanced URL roots) and API actions (to be appended to the URL root) for the banking app backends.
This is convenient because if tomorrow Bank 2 wants to move to a new domain, it just has to send a message with the new root URL to the Spring Boot orchestrator, and the orchestrator will update the URLs against the application key.
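The bookkeeping described above can be sketched in a few lines. This is a simplified, in-memory Python illustration, not the Spring Boot orchestrator itself; the `ServiceRegistry` class, the message fields (`app_key`, `root_url`, `actions`), and the URLs are assumptions.

```python
class ServiceRegistry:
    """Keeps a root URL and a set of action paths per application key."""

    def __init__(self):
        self._apps = {}

    def handle_registration(self, message: dict) -> None:
        """Apply one registration message; fields that are absent are left unchanged."""
        app = self._apps.setdefault(message["app_key"], {"root_url": None, "actions": {}})
        if "root_url" in message:
            app["root_url"] = message["root_url"].rstrip("/")
        app["actions"].update(message.get("actions", {}))

    def actions_for(self, app_key: str) -> list:
        """List the action names currently known for an application."""
        return sorted(self._apps[app_key]["actions"])

    def resolve(self, app_key: str, action: str, **params) -> str:
        """Append the action path to the root URL, filling in path parameters."""
        app = self._apps[app_key]
        return app["root_url"] + app["actions"][action].format(**params)


registry = ServiceRegistry()

# Bank 2 registers its root URL and one action.
registry.handle_registration({
    "app_key": "bank2",
    "root_url": "https://bank2.example.com/api",
    "actions": {"withdraw": "/accounts/{account_id}/withdraw"},
})
before_move = registry.resolve("bank2", "withdraw", account_id="2001")

# Bank 2 moves to a new domain: a single message with the new root URL is enough.
registry.handle_registration({"app_key": "bank2", "root_url": "https://bank2.example.org/api"})
after_move = registry.resolve("bank2", "withdraw", account_id="2001")
```

Because actions are stored as paths relative to the root, a domain move touches one field and every resolved link follows.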
From the customer’s point of view:
Consider a customer who has accounts in both banks. The withdrawal process should be seamless for them, since they are the same customer with accounts in two banks.
In step 1 for the customer, the customer first chooses an account and then sends a request to the Orchestrator to get a list of actions pertaining to that bank for their account. These actions, once received on the client side, are shown to the customer. Introducing a new action therefore becomes trivial for any banking app, as the first request from the customer is always to ‘discover’ what services are available to them.
In step 2, the customer chooses an action and, in another call through the queue, gets a URL for the withdraw action pertaining to their chosen account and bank. The received message follows HATEOAS and provides the withdraw links in the message body itself.
Please note that the customer talks to the orchestrator through a queue to maintain the loose coupling. Again, this helps to introduce a new customer onto the platform seamlessly.
Once the customer has the forwarding address, step 3 is to make a direct call to the banking app to perform the withdraw request.
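The three customer steps can be walked through with a small simulation. The sketch below uses Python's standard `queue.Queue` as a stand-in for the request and reply queues; the request types (`discover`, `resolve`), the field names, and the registry contents are hypothetical, and the final HTTP call of step 3 is deliberately not performed.

```python
import queue

# Stand-ins for the request and reply queues between the customer and the orchestrator.
requests, replies = queue.Queue(), queue.Queue()

# Hypothetical registry contents, as the orchestrator might hold them after the banks register.
REGISTRY = {
    "bank1": {
        "root_url": "https://bank1.example.com/api",
        "actions": {
            "balance": "/accounts/{account_id}/balance",
            "withdraw": "/accounts/{account_id}/withdraw",
        },
    },
}


def orchestrator_step():
    """Consume one request from the queue and publish one reply."""
    req = requests.get_nowait()
    app = REGISTRY[req["bank"]]
    if req["type"] == "discover":
        reply = {"actions": sorted(app["actions"])}
    else:  # "resolve": answer with a HATEOAS-style link for the chosen action
        path = app["actions"][req["action"]].format(account_id=req["account_id"])
        reply = {"links": [{"rel": req["action"], "href": app["root_url"] + path}]}
    replies.put(reply)


def ask(request):
    """Customer side: send a request through the queue and read the reply."""
    requests.put(request)
    orchestrator_step()  # in a real deployment this would run in the orchestrator process
    return replies.get_nowait()


# Step 1: discover which actions are available for the chosen account's bank.
available = ask({"type": "discover", "bank": "bank1", "account_id": "1001"})

# Step 2: ask for the link that corresponds to the chosen action.
reply = ask({"type": "resolve", "bank": "bank1", "account_id": "1001", "action": "withdraw"})
withdraw_link = reply["links"][0]

# Step 3 would be a direct call to withdraw_link["href"] on the banking app itself.
```

Note that the customer code only knows the queue and the message shapes; the bank's URL arrives as data in step 2.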
With this framework we ensure the following:
- Any bank can be onboarded at any time.
- Any banking service can be added at any time.
- Banks can move to new domains and introduce new services into the system at any time.
- We can do a one-time auth on the orchestrator to authenticate the user across the disparate banking apps.
- The customer can easily check what services are available and invoke the service for the correct bank by providing auth info, account ID, action, and amount just once, thus saving time.
- Disaster recovery happens in real time, as Bank 1 simply needs to update the URLs to point to its DR sites.
This is just an example for your understanding and not necessarily a valid use case.
Make sure to maintain all stateful info on the orchestrator in a persistence layer, so that if the orchestrator fails or goes down, it picks up where it left off when it comes back.
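One way to picture this is a registry backed by a database file instead of memory. The sketch below uses SQLite from the Python standard library purely as an example persistence layer; the table layout, the `PersistentRegistry` class, and the URLs are assumptions, and a production orchestrator would likely use whatever datastore it already has.

```python
import os
import sqlite3
import tempfile


class PersistentRegistry:
    """Stores application root URLs in a SQLite file so they survive a restart."""

    def __init__(self, db_path):
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS services ("
            "app_key TEXT PRIMARY KEY, root_url TEXT NOT NULL)"
        )
        self._conn.commit()

    def upsert(self, app_key, root_url):
        # INSERT OR REPLACE overwrites the existing row for this app_key, if any.
        self._conn.execute(
            "INSERT OR REPLACE INTO services (app_key, root_url) VALUES (?, ?)",
            (app_key, root_url),
        )
        self._conn.commit()

    def root_url(self, app_key):
        row = self._conn.execute(
            "SELECT root_url FROM services WHERE app_key = ?", (app_key,)
        ).fetchone()
        return row[0] if row else None

    def close(self):
        self._conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "orchestrator.db")

first = PersistentRegistry(db_path)
first.upsert("bank1", "https://bank1.example.com/api")
first.upsert("bank1", "https://dr.bank1.example.com/api")  # Bank 1 fails over to its DR site
first.close()  # simulate the orchestrator going down

restarted = PersistentRegistry(db_path)  # a fresh instance reading the same file
recovered = restarted.root_url("bank1")
```

After the simulated restart, `recovered` holds the DR URL that was written last, because the state lived in the file rather than in the process.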
Therefore, with this architecture we achieve a true decoupling of applications and also allow applications to dynamically introduce new APIs without costly server restarts.
Of course, with a Kubernetes cluster and a service mesh implementation like Istio, this is easier. But if you are working on monolithic REST applications that aren’t being containerized anytime soon, this pattern will make it a lot easier to start carving out micro apps and, eventually, microservices.
Because it’s event driven, new events can be introduced into the system at any time with minimal changes in subscribing applications.
So in essence, with a combination of
> Rest APIs
> Kafka Queues
> An Orchestrator application
we are able to develop independently scalable, fault-tolerant, and resilient application stacks that can be moved, replaced, restarted, or enhanced at any time in the application life cycle without forcing a restart, and the resulting downtime, on the rest of the system.