Friday, December 25, 2020

Make Microservices Communicate

Read Post to understand What/Why Microservices - http://arun-architect.blogspot.com/2016/11/microservices-soa.html 

In the era of microservice architecture, applications are built as a collection of services. Each service in a microservice architecture solves a business problem in the application, or at least supports one, and each service in the collection tends to meet the following criteria - loosely coupled, maintainable and testable, and independently deployable.

Microservice architectures offer several benefits.

  • They are often easier to build and maintain
  • Services are organized around business problems
  • They increase productivity and speed
  • They encourage autonomous, independent teams
  • Supports different programming languages

Despite all these benefits, communication between microservices is the key to making the architecture successful; microservices can wreak havoc if communication is not considered ahead of time.

This post focuses on three ways services can communicate in a microservice architecture. None of them is perfect; the right choice depends on the need of the hour.

  1. HTTP Communication
  2. Message-Based Communication
  3. Event-driven Communication

HTTP Communication / Broker-Less Design: This is the fundamental one. Assume we have two services in our architecture: Service1 processes a piece of business logic and then calls over to Service2 to run another piece of business logic.

Here we make our microservices talk to each other directly. You could use HTTP for traditional request-response, or web-sockets (or HTTP2) for streaming. There are absolutely no intermediary nodes (except routers and load balancers) between two or more microservices. You can connect to any service directly, provided you know its service address and the API it exposes.

HTTP calls between services are a viable option for service-to-service communication. Though basic, we could make a case that all other communication channels derive from this one.

This communication can be synchronous or asynchronous, depending on your use case. With the asynchronous approach, we keep the services isolated from one another, and the coupling is loose.
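As a minimal sketch of this direct, broker-less style (the service names and the /process endpoint are made up for illustration), the snippet below runs a stand-in "Service2" HTTP endpoint in a background thread and has "Service1" call it directly over HTTP:

```python
# Minimal sketch of broker-less, service-to-service HTTP communication.
# "Service2" is a hypothetical downstream service exposed over plain HTTP;
# "Service1" calls it directly once it knows the address and the API.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Service2Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Run "another piece of business logic" and return the result.
        body = json.dumps({"status": "processed"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

def service1_calls_service2(port):
    # Service1 must know Service2's address and API up front (the coupling).
    with urllib.request.urlopen(f"http://localhost:{port}/process") as resp:
        return json.loads(resp.read())

server = HTTPServer(("localhost", 0), Service2Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
result = service1_calls_service2(server.server_address[1])
server.shutdown()
print(result)  # {'status': 'processed'}
```

Note that the caller blocks until the response arrives - the synchronous case; an asynchronous variant would fire the request and continue without waiting.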

Pros: Low Latency, Easy to Implement, Easy Debugging, High Throughput - Actual CPU Cycles are spent on doing work rather than routing.

Cons: 
  • Connection Nightmare & Resource Loss - May lead to a lot of idle connections: multiple microservices need to connect to each other, creating many connections, and many of those may remain fairly idle.
  • Tightly Coupled: By nature, broker-less designs are tightly coupled. Imagine you have a microservice to process online payments. Now you want another microservice to give you a real-time update of the number of payments happening per minute. This will require you to make modifications in multiple microservices, which is undesirable.
Thus, in many cases a broker-less design just doesn’t work. You often have requirements to simply publish a message once and have multiple subscribers consume it. This is where a broker design comes into the picture.

Message Broker-Based Communication / Broker Design:  Unlike HTTP communication, the services involved do not directly communicate with each other. Instead, the services push messages to a message broker that other services subscribe to. This eliminates a lot of complexity associated with HTTP communication. 

I.e. in this architecture, all communication is routed via a group of brokers. Brokers (e.g. AWS SNS, RabbitMQ, Amazon MQ, NATS, Kafka, etc.) are server programs running some advanced routing algorithms. Each microservice connects to a broker and can send and receive messages via the same connection. A service sends messages which are then published to a particular “topic”. This topic is read by consumers who have subscribed to it.

It doesn’t require services to know how to talk to one another; it removes the need for services to call each other directly. Instead, all services know of a message broker, and they push messages to that broker. Other services can choose to subscribe to the messages in the broker that they care about.

Amazon provides AWS SNS as a message broker. Now Service1 can push messages to an SNS topic that Service2 listens on. In this case, the calling party needs some identifier confirming the call was placed to the message broker, so the broker returns a MessageId to the caller.

Please note that there is still some coupling between the two services using this pattern, i.e. services must agree on what the message structure is and what it contains.
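A toy in-memory broker can illustrate the flow (this is a sketch, not a real SNS client; the topic name and message fields are made up). Note how the subscriber never talks to the publisher directly, yet both must still agree on the message structure:

```python
# Hedged sketch: an in-memory stand-in for a broker such as AWS SNS or RabbitMQ.
# Real brokers add persistence, delivery guarantees, and network transport;
# this only illustrates the publish/subscribe flow and the MessageId handshake.
import uuid
from collections import defaultdict

class Broker:
    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Fan the message out to every subscriber of the topic,
        # then hand a MessageId back to the caller as confirmation.
        for callback in self.subscribers[topic]:
            callback(message)
        return str(uuid.uuid4())

broker = Broker()
received = []

# Service2 subscribes to the topic; Service1 never calls it directly.
broker.subscribe("payments", received.append)

# The two services must still agree on the message structure (the coupling
# noted above): here, a dict with "order_id" and "amount" fields.
message_id = broker.publish("payments", {"order_id": 42, "amount": 9.99})
print(received)  # [{'order_id': 42, 'amount': 9.99}]
```

Adding a second subscriber to the same topic requires no change to the publisher - the advantage the broker-less design lacks.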

Sample Arch of Message Based Communication: https://aws.amazon.com/blogs/compute/building-loosely-coupled-scalable-c-applications-with-amazon-sqs-and-amazon-sns/


Event-Driven Communication: This is another asynchronous approach, and it looks to completely remove the coupling between services. Unlike the messaging pattern, where the services must know of a common message structure, an event-driven approach doesn’t need this.

An event-driven architecture uses events to trigger and communicate between decoupled services and is common in modern applications built with microservices.

Thus, communication between services takes place via events that individual services produce. A message broker is still needed here (e.g. SNS), since individual services write their events to it. But unlike the message approach, the consuming services don’t need to know the details of the event; they react to the occurrence of the event, not the message the event may or may not deliver.

Every service agrees to push events to the broker (SNS here), which keeps the communication loosely coupled. Services can listen to the events that they care about, and they know what logic to run in response to them. This pattern keeps services loosely coupled as no payloads are included in the event. Each service in this approach reacts to the occurrence of an event to run its business logic. Here, we are sending events via an SNS topic. Other events could be used, such as file uploads or database row updates.

Note:  An event is a change in state, or an update, like an item being placed in a shopping cart, Item placed in S3, Message pushed to SQS, Message Published to SNS etc. Events can either carry the state (the item purchased, its price, and a delivery address) or events can be identifiers (a notification that an order was shipped). 

Event-driven architectures have three key components: event producers, event routers, and event consumers. A producer publishes an event to the router, which filters and pushes the events to consumers. Producer services and consumer services are decoupled, which allows them to be scaled, updated, and deployed independently.
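The producer/router/consumer flow can be sketched in a few lines (an in-memory stand-in for a router such as SNS or EventBridge; the event name and the two consumers are illustrative):

```python
# Hedged sketch of the producer -> router -> consumer flow described above.
# Events are plain identifiers (a change in state), not rich messages, so
# consumers react to the occurrence of the event rather than to a payload.
from collections import defaultdict

class EventRouter:
    def __init__(self):
        self.rules = defaultdict(list)  # event name -> consumer callbacks

    def on(self, event_name, consumer):
        self.rules[event_name].append(consumer)

    def emit(self, event_name):
        # Filter by event name and push to every matching consumer.
        for consumer in self.rules[event_name]:
            consumer()

router = EventRouter()
log = []

# Two decoupled consumers react to the same event independently; the
# producer knows only the router, not the consumers.
router.on("ORDER_SHIPPED", lambda: log.append("email customer"))
router.on("ORDER_SHIPPED", lambda: log.append("update dashboard"))

router.emit("ORDER_SHIPPED")
print(log)  # ['email customer', 'update dashboard']
```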


Important Note: There are two main types of routers used in Event-driven Architectures: 

  1. Event Topics - At AWS, Amazon SNS is used to build event topics.
  2. Event Buses - At AWS, Amazon EventBridge is used to build event buses.

Amazon SNS - Recommended when you want to build an application that reacts to high throughput and low latency events published by other applications, microservices, or AWS services, or for applications that need very high fanout (thousands or millions of endpoints). SNS topics are agnostic to the event schema coming through.

Amazon EventBridge - Recommended when you want to build an application that reacts to events from SaaS applications, AWS services, or custom applications. EventBridge uses a predefined schema for events and allows you to create rules that are applied across the entire event body to filter before pushing to consumers. For serverless event-driven applications, EventBridge makes it easy to discover event schemas, create a registry of events, and use a robust, highly available event bus to connect data across services. 


More Real Life Example of Event-Driven Communication:

Assume there is an e-commerce application having four major functions - 

  • Order & Inventory Mgmt - Accepts Order, Lock Inventory Item, Pass Control to Billing & Finance.
  • Billing & Finance - Works on Customer Payment, Supplier Payment and Pass Control to Delivery
  • Delivery - Prepares Order for Delivery
  • Customer Mgmt - Manages Customer Data & Interactions.

Mapping this domain structure into Microservices, there are four microservices - Order Processing, Billing, Delivery and Customer.


With these microservices in the system, and assuming communication between services is through messages sent as events, below is a real event-driven communication flow.

  1. The request first hits the Order Processing Service.
  2. Once it locks the item, it fires an 'ORDER_PROCESSING_COMPLETED' event.
  3. As it's an event-driven design, there can be multiple services listening to this event. Here the Billing Service picks up the event and starts processing payment. The sub-microservice 'Inventory Service' also picks it up to update inventory.
  4. The Billing Service then completes billing and fires a 'PAYMENT_PROCESSING_COMPLETED' event.
  5. This event is in turn listened to by the 'Delivery Service', which makes the delivery and finally fires another event, 'ORDER_DISPATCHED'.
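The chain above can be sketched with a minimal in-memory event bus (a stand-in for a real broker; the handler bodies are placeholders for the real business logic, and only the event names come from the example):

```python
# Hedged sketch of the e-commerce event chain: each service registers for the
# events it cares about and fires its own event when its work is done.
from collections import defaultdict

handlers = defaultdict(list)
processed = []  # record of every event that flows through the bus

def listen(event):
    def register(fn):
        handlers[event].append(fn)
        return fn
    return register

def fire(event):
    processed.append(event)
    for handler in handlers[event]:
        handler()

@listen("ORDER_PROCESSING_COMPLETED")
def billing_service():
    # Billing processes the payment, then fires its own event.
    fire("PAYMENT_PROCESSING_COMPLETED")

@listen("ORDER_PROCESSING_COMPLETED")
def inventory_service():
    pass  # Inventory Service updates stock in response to the same event.

@listen("PAYMENT_PROCESSING_COMPLETED")
def delivery_service():
    # Delivery prepares the order and announces dispatch.
    fire("ORDER_DISPATCHED")

# Order Processing locks the item, then kicks off the chain.
fire("ORDER_PROCESSING_COMPLETED")
print(processed)
```

Running this prints the three events in the order the flow describes, with Billing and Inventory both reacting to the first event independently.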


Benefits of an event-driven architecture

  • Scale and fail independently - By decoupling your services, they are only aware of the event router, not each other. I.e. if one service has a failure, the rest will keep running. 
  • Develop with agility - You no longer need to write custom code to poll, filter, and route events; the event router will automatically filter and push events to consumers. 
  • Audit with ease - An event router acts as a centralized location to audit your application and define policies. 
  • Cut costs - Event-driven architectures are push-based, so everything happens on-demand as the event presents itself in the router. This way, you’re not paying for continuous polling to check for an event. 


Hope this helps!!

Arun Manglick


Thursday, October 29, 2020

gRPC

Introduction:

In the past (before Web Services & REST), RPC was a popular inter-process communication mechanism for building client-server applications in distributed systems. The main objective of RPC was to make the process of executing code on a remote machine as simple as calling a local function. However, most conventional RPC approaches (CORBA and a few others) have drawbacks, such as the high complexity involved, reliance on the TCP protocol, etc.

gRPC, originally developed by Google and now open source, focuses on high-performance inter-service communication.

gRPC tries to overcome most of the limitations of traditional RPC. It enables communication between services built using heterogeneous technologies. It is based on the idea of defining a service and specifying the methods that can be called remotely, with their parameters and return types.

gRPC by default uses Protocol Buffers as its IDL (Interface Definition Language) to describe both the service interface and the structure of payload messages.
gRPC uses HTTP2 as the transport protocol, a key reason for gRPC's wide adoption.
  • HTTP2 overcomes the problems in HTTP 1.1, where 
    • Each connection can handle only one request at a time. I.e. if the first request is blocked, then the next request has to wait, which requires opening multiple connections between client and server to support real use cases.
    • Many headers (User-Agent, Cookie) are repeated across multiple requests, which is a waste of bandwidth.
  • HTTP2 overcomes these problems:
    • All communication between client and server happens over a single TCP connection that carries a number of bi-directional flows - a concept named 'Stream', which carries one or more messages, bi-directionally. (See the end of this post.)
    • It avoids header repetition and introduces 'Header Compression' to optimize the use of bandwidth.
    • It also introduces a new feature of sending 'Server Push Messages' without using request-response style messaging.
Building microservices requires thinking through many things: 

  • Data Formats
  • Error Patterns
  • Think about data model (JSON/XML/Binary)
  • EndPoints
  • Load Balancing
  • Efficiency
  • Latency
  • Interoperability
  • Authentication/Authorization
  • Logging
  • Token Based
  • Scalability, etc.

What if a framework existed to take care of all such things? This is where gRPC helps.

What is gRPC:
  • It's a free and open-source framework developed by Google.
  • At the core of gRPC, the developer defines the messages (REQUEST and RESPONSE) and services using Protocol Buffers.
  • The rest of the gRPC code is auto-generated; as a developer, you only need to provide the implementation for it. 
    • I.e. at a high level, gRPC allows you to define the REQUEST and RESPONSE for an RPC and handles the rest for you.
  • One .proto file works for over 12 programming languages (server and client), as code can be generated in any language - and allows you to use a framework that scales to millions of RPCs per second.
  • gRPC globally - 
    • Modern, Fast, Efficient, 
    • Built on top of HTTP/HTTP2, 
    • Low latency, 
    • Support streaming, 
    • Language Independent, 
    • Plugged-in authentication, 
    • Load Balancing, 
    • Logging and Monitoring. 

gRPC vs REST:

Feature          | gRPC                        | REST
-----------------|-----------------------------|---------------------------------
Protocol         | HTTP/2 (fast)               | HTTP/1.1 (slow)
Payload          | Protobuf (binary & small)   | JSON (text & large)
API Contract     | Strict, required (.proto)   | Loose, optional (OpenAPI)
Code Generation  | Built-in (protoc)           | 3rd-party tools (Swagger)
Security         | TLS/SSL                     | TLS/SSL
Streaming        | Bi-directional              | Client to server (request only)
Browser Support  | Limited (requires gRPC-Web) | Yes


Where to use gRPC:
  • Microservices
    • Low latency and high throughput communication
    • Strong API contract
  • Polyglot Environments
    • Code generation out of the box for many languages
  • Point-to-Point Real-Time Communication
    • Excellent support for bi-directional streaming
  • Network-Constrained Environments
    • Lightweight message format

Types of gRPC:

There are 4 types of gRPC calls:
  • Unary: The simplest one is unary, where the client sends 1 single request message and the server replies with 1 single response. This looks somewhat similar to a normal HTTP REST API call.
  • Client-Streaming - In this scenario, the client sends a stream of multiple messages (a sequence of messages) to the server and expects the server to send back only 1 single response. 
  • Server-Streaming - Here the client sends only 1 request message, and the server replies with a stream of multiple responses. The client reads from the returned stream until there are no more messages.
  • Bidirectional (or bidi) Streaming - This one is the most complex, because client and server keep sending and receiving multiple messages in parallel and in arbitrary order. It’s very flexible and non-blocking, which means neither side needs to wait for a response before sending the next messages. I.e. the server could wait to receive all client messages before sending a response, it could alternately read a message and then write a message, or it could follow any other combination of reads and writes.
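All four call types can be expressed in a single service definition. Below is a hypothetical .proto sketch (the OrderService name, message types, and method names are illustrative, not from any real API); the `stream` keyword on the request or response side is what selects the call type:

```protobuf
syntax = "proto3";

package orders;

message OrderRequest  { string order_id = 1; }
message OrderResponse { string status   = 1; }

// Hypothetical service showing all four gRPC call types.
service OrderService {
  // Unary: one request, one response.
  rpc GetOrder (OrderRequest) returns (OrderResponse);
  // Client-streaming: many requests, one response.
  rpc UploadOrders (stream OrderRequest) returns (OrderResponse);
  // Server-streaming: one request, many responses.
  rpc WatchOrder (OrderRequest) returns (stream OrderResponse);
  // Bidirectional streaming: both sides stream in parallel.
  rpc SyncOrders (stream OrderRequest) returns (stream OrderResponse);
}
```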

Happy gRPC.. 

Arun Manglick

Protocol Buffers (a.k.a Protobuf)

Introduction:

  • Google's mature open-source mechanism/protocol to enable serialization and deserialization of structured data between different services.  
  • A method of serializing structured data, useful in developing programs that communicate with each other over a wire or for storing data. 
    • By default, gRPC uses Protocol Buffers as its IDL (Interface Definition Language) to describe both the service interface and the structure of payload messages.
    • gRPC uses HTTP2 as the transport protocol, a key reason for gRPC's wide adoption.
  • Protocol Buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data - think XML, but smaller, faster, and simpler.
    • I.e. “You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages.”
  • Google’s design goal was to 
    • Create a better method than XML for making systems communicate with each other over a wire or for storing data - 3 times smaller & 10 times faster than XML.
    • Be better than JSON in terms of size.
  • Protocol Buffers are 'Language Agnostic', i.e. code can be generated in many languages.
  • Data exchange is in 'binary format', keeping payload size lower than anything text-based (XML, JSON) and making serialization efficient.
Note: 
  • Facebook uses an equivalent protocol called Apache Thrift, and 
  • Microsoft uses the Microsoft Bond protocols; gRPC is the concrete RPC protocol stack used with Protocol Buffers for defined services.
A few more points:
  • Data is 'Fully Typed'.
  • Data is 'Compressed Automatically' (leading to less CPU usage).
  • Data can be read across 'Any Language' (e.g. C#, Java, Go, Python, JS, etc.).
  • A huge amount of code is generated automatically in any language from a simple .proto file.
  • The payload is binary, thus very efficient to send/receive on a network and serialize/de-serialize on a CPU.
  • Protocol Buffers define rules to make an API evolve without breaking existing clients, which is helpful.
Protocol Buffers messages and services are defined in a .proto text file (HelloWorld.proto).
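A minimal HelloWorld.proto along the lines of the canonical gRPC quick-start example (the message and service names follow that example) looks like this:

```protobuf
syntax = "proto3";

package helloworld;

// The request message containing the user's name.
message HelloRequest {
  string name = 1;  // field numbers identify fields on the binary wire format
}

// The response message containing the greeting.
message HelloReply {
  string message = 1;
}

// The greeting service definition.
service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply);
}
```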

Protocol Buffers Over XML & JSON:
  • With XML - 3 times smaller & 10 times faster than XML.
  • With JSON - parsing JSON is CPU-intensive (because JSON is human-readable text), while Protobuf's binary format is cheaper to parse.
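The size difference between a text and a binary encoding is easy to see. The sketch below uses Python's struct module as a rough stand-in for a binary wire format (real Protobuf encoding differs - it uses field tags and varints - but the comparison illustrates the same point; the field names and values are made up):

```python
# Hedged illustration of why a binary encoding is smaller than JSON text.
import json
import struct

user_id, score = 12345, 98.5

# JSON spells out field names and digits as human-readable text.
json_payload = json.dumps({"user_id": user_id, "score": score}).encode()

# A binary layout packs the same values into fixed-size fields:
# a 4-byte little-endian int plus an 8-byte double = 12 bytes.
binary_payload = struct.pack("<id", user_id, score)

print(len(json_payload), len(binary_payload))
```

The binary payload is a fraction of the JSON size, and decoding it is a fixed-layout copy rather than text parsing - the CPU argument made above.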


How do Protocol Buffers and JSON Compare?
Protocol Buffers and JSON messages can be used interchangeably; however, they were designed with different goals in mind.

JSON arose out of a subset of the JavaScript programming language. Its messages are exchanged in text (human-readable) format, are completely independent and supported by almost all programming languages.

Protobuf is not only a message format; it is simultaneously a set of rules and tools that define and exchange messages. It is currently restricted to some programming languages. In addition, it has more data types than JSON, such as enumerations and methods, and has other functions, including RPC.

gRPC: At the core of gRPC, messages and services are defined using Protocol Buffers.

Happy Blogging!!

Arun Manglick