Over the last few years microservices have become the preferred approach for implementing complex, cloud-based software systems. These architectures decompose functionality into small, bounded units that cooperate to achieve the overall system goals. The benefits of microservices are well documented elsewhere, so they will not be repeated here. Instead, the focus of this article is microservice coordination – how the services should communicate and collaborate. On first consideration this may seem like a fairly minor issue, but in reality it can have a significant impact on the behaviour, performance and operation of the solution.
The most obvious approach to communication between microservices is to use RPC (remote procedure calls) or the tried and trusted REST API. These operate over HTTP and are used extensively for access to services over the internet. However, although REST is great for internet-facing APIs, it is not the ideal choice inside a microservices architecture, despite its widespread use for that purpose. For me, a message bus is clearly the better choice. It provides superior performance, increased flexibility and agility, and a simpler overall architecture with fewer moving parts.
Let’s be clear what we are referring to here though. We aren’t talking about the ESBs (Enterprise Service Buses) that became so painfully bloated with functionality and made themselves the centre of service-oriented architectures in the past. We’re talking about simple, straightforward message buses that follow the modern principle of “do one thing and do it well” – message buses like RabbitMQ, Kafka, NSQ and NATS.
So why are message buses the best way to integrate microservices? Here’s the list:
1 – Simplified Service Discovery
It’s highly likely that in a microservices architecture we will have more than one instance of each microservice running at the same time; in most cases we will have many. Certainly for any critical microservice we will need at least two instances, regardless of workload, in order to provide resilience against failures. Furthermore, these instances may disappear and reappear over time due to scaling and temporary failures. So how does a client microservice know what network address to use to find an instance of the microservice it needs? We have to implement a service discovery mechanism. This isn’t straightforward, though, and you will find many (and I mean many) posts and articles online on the topic. I won’t go into the options here, because if you use a message bus you won’t need them!
With a message bus a client microservice can simply send a request, or message, to a named queue on the bus (terminology differs depending on the message bus you use, but we will stick with “queue” here). All instances of the microservice that can service the request read from the same queue as “competing consumers”. Simple. No complex service discovery, no service registries, no DNS magic, none of that complexity. Just share an environment variable with the microservices that identifies the queue name and you are good to go!
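As an illustration, here is a minimal sketch of a competing consumer in Python, assuming RabbitMQ and the pika client library (the queue name, environment variable and handler are hypothetical). Every instance of the microservice runs the same code against the same queue:

```python
import os
import pika

# Hypothetical: the queue name is shared with all instances via an environment variable
QUEUE = os.environ.get("ORDER_QUEUE", "orders")

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue=QUEUE, durable=True)

def handle_request(ch, method, properties, body):
    # Process the request, then acknowledge it so the broker can discard it
    print(f"processing {body!r}")
    ch.basic_ack(delivery_tag=method.delivery_tag)

# Every running instance consumes from the same queue as a "competing consumer"
channel.basic_consume(queue=QUEUE, on_message_callback=handle_request)
channel.start_consuming()
```

The client side is simply a publish to the same queue name – no registry lookups, no instance addresses.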
2 – No Load Balancers
As mentioned above when discussing service discovery, we will normally have multiple instances of each microservice running at any given time. We saw that this creates a challenge in knowing which network address a microservice can be found at. There is another, closely related question to answer: given that we can find the addresses of the microservice instances, which one of the many should a client use at a given moment? We don’t want to overload some instances while others sit idle. This is where load balancers come in. They may be physical hardware appliances or software virtual load balancers; they may be client side or server side. But they are always there, running an algorithm that balances requests across microservice instances. There also have to be mechanisms to update the load balancers as instances appear and disappear, plus some interaction with service discovery, since the two are so closely related.
Alternatively, we could just use a message bus. With this approach the microservices operate in a pull model, taking a message from the shared queue only when they have the capacity to process it. Load balancing amongst instances therefore happens naturally, and there is no need to run an algorithm to estimate (“guess”) which instance to push the next request to. As well as removing the need for load balancers and their associated management, the pull-based approach brings some significant performance improvements.
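In terms of the consumer sketch above, the pull behaviour comes down to one line (again assuming RabbitMQ/pika; the limit of 1 is just an example). A prefetch limit tells the broker not to hand an instance more unacknowledged messages than it can handle, so a busy instance simply stops taking work:

```python
# Deliver at most one unacknowledged message to this instance at a time;
# the broker only hands over the next message once this one is acked.
channel.basic_qos(prefetch_count=1)
```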
3 – Improved Performance
Based on how rarely it appears in comparisons and tutorials, this seems to be the least understood aspect of message buses versus REST, despite its importance. Pull-based message bus systems have a clear and mathematically demonstrable performance advantage over load-balanced approaches. It probably deserves a complete blog post in itself, but let’s just cover the basics.
Microservice instances, and indeed all servers, fundamentally behave like queueing systems. Due to variability in processing times and variability in request arrival times, they exhibit some pretty nasty nonlinear behaviour. As utilisation goes up, latency increases rapidly. At some point, well below 100% utilisation, latency grows explosively and the instances quickly become clogged up and unresponsive. The rate at which latency increases, and the point at which things get ugly, is governed by the degree of variability, which is something we have little control over.
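To make this concrete, a standard single-server result from queueing theory (Kingman’s approximation, quoted here as background rather than anything specific to message buses) shows how waiting time depends on both utilisation and variability:

```latex
% W_q    : mean time a request waits before being served
% \rho   : utilisation of the server
% c_a^2, c_s^2 : variability (squared coefficient of variation) of arrivals and service times
% \tau   : mean service time
W_q \;\approx\; \frac{\rho}{1-\rho} \cdot \frac{c_a^{2} + c_s^{2}}{2} \cdot \tau
```

The first factor blows up as utilisation approaches 1, and the second scales the whole delay by the variability we cannot control.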
The advantage of running competing consumers against a message bus is that the shared queue provides what is known as variability pooling. This reduces the effective variability and gives asymptotically optimal performance across the instances: the entire set of instances behaves as a single multi-server system fed by one shared queue. That makes it easier to keep latencies under control, and to do so at higher levels of utilisation. As a result, we can reliably meet latency and throughput targets at a higher level of utilisation, which means lower server costs. This is unlike load-balanced systems, in which each instance, and therefore the system as a whole, continues to behave in much the same way as a single instance.
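A rough way to see the pooling effect is to compare the classic M/M/c formulas for one shared queue feeding c servers against c independently load-balanced M/M/1 servers. This is a simplified textbook model used purely for illustration, not a benchmark of any particular message bus, and the example numbers are hypothetical:

```python
import math

def erlang_c(c, rho):
    # Probability that an arriving job must wait in an M/M/c system (Erlang C),
    # where rho is the offered load per server (lambda / (c * mu)).
    a = c * rho  # total offered load
    numer = (a ** c / math.factorial(c)) * (1 / (1 - rho))
    denom = sum(a ** k / math.factorial(k) for k in range(c)) + numer
    return numer / denom

def mean_wait_pooled(c, lam, mu):
    # Mean queueing delay for one shared queue feeding c servers (M/M/c).
    rho = lam / (c * mu)
    return erlang_c(c, rho) / (c * mu - lam)

def mean_wait_balanced(c, lam, mu):
    # Mean queueing delay when load is split evenly across c separate M/M/1 servers.
    lam_per = lam / c
    rho = lam_per / mu
    return rho / (mu - lam_per)

if __name__ == "__main__":
    c, mu, lam = 4, 10.0, 32.0  # 4 instances, 10 req/s each, 32 req/s total (80% utilisation)
    print(f"pooled (shared queue):   {mean_wait_pooled(c, lam, mu) * 1000:.1f} ms")
    print(f"balanced (per-instance): {mean_wait_balanced(c, lam, mu) * 1000:.1f} ms")
```

With these example numbers (four instances at 80% utilisation) the shared queue gives a mean queueing delay of roughly 75 ms against 400 ms for the load-balanced case.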
4 – Better Scaling Control
In order for a system to be as cost-effective as possible in the cloud it needs to scale dynamically. As the workload on a given microservice increases we need to add instances; as it drops we need to remove them. This is typically achieved by identifying the limiting resource (CPU, memory or network) and monitoring the average of that metric across all instances. It sounds straightforward, but in practice it can be difficult to ensure that scaling happens in a timely fashion, and to establish the metric thresholds to scale at, because of the highly nonlinear relationship between load and those metrics.
With a message bus there is an alternative approach that can improve this situation dramatically: use the length of the input message queue as the metric that drives autoscaling. In a system of competing consumers on a shared queue there is a direct, roughly linear relationship between the length of the queue and the latency of the service (this is essentially Little’s law). That makes it much easier to choose a threshold to scale at, and it means the scaling is far less sensitive to errors in the chosen value. Not all of the popular message bus implementations provide easy ways to extract the queue length metric, but if you can get access to it, it will prove to be a big bonus for your autoscaling solution.
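As a sketch of what that can look like, the snippet below polls the queue depth (again assuming RabbitMQ and pika, where a passive queue_declare reports the message count) and turns it into a desired instance count. The queue name, threshold and scaling hook are all hypothetical:

```python
import math
import os
import pika

QUEUE = os.environ.get("ORDER_QUEUE", "orders")
MESSAGES_PER_INSTANCE = 100  # hypothetical target backlog per instance

def desired_instances() -> int:
    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = connection.channel()
    # passive=True only inspects the queue (it errors if the queue does not exist)
    depth = channel.queue_declare(queue=QUEUE, passive=True).method.message_count
    connection.close()
    # Queue length scales roughly linearly with latency, so a simple ratio is enough
    return max(1, math.ceil(depth / MESSAGES_PER_INSTANCE))

# An autoscaling loop would periodically call desired_instances() and ask the
# orchestrator (e.g. Kubernetes) to scale the deployment to match.
```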
5 – Loose Coupling
Message bus systems lend themselves readily to event-based interactions because of their inherent support for pub-sub (publish-subscribe) communication. With this approach to microservice coordination we can achieve extremely loose coupling between microservices, which brings great benefits for the integration, testability, agility and resilience of the system.
In the event-based approach, rather than sending a command such as “send verification email to new user”, a microservice simply publishes a parameterised event such as “new user created: name, email address” to a defined message queue. Other microservices can then listen for the events that interest them and perform whatever actions they deem appropriate. The producing microservice is not required to know anything about the subscribing microservices! This means that workflows can be changed by adding or modifying microservices without any changes whatsoever to other components of the system. Another advantage of this approach is that a failure or delay in processing an event will not stall upstream microservices.
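A minimal publish side might look like the following, assuming RabbitMQ/pika and a topic exchange; the exchange name, routing key and event payload are hypothetical:

```python
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="events", exchange_type="topic", durable=True)

# Publish a parameterised event; the producer has no idea who, if anyone, is subscribed
event = {"type": "user.created", "name": "Ada", "email": "ada@example.com"}
channel.basic_publish(
    exchange="events",
    routing_key="user.created",
    body=json.dumps(event),
    properties=pika.BasicProperties(delivery_mode=2),  # mark the message persistent
)
```

An email microservice would declare its own queue, bind it to the exchange with a pattern such as "user.*" and react to the event; adding a new subscriber later requires no change at all to the producer.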
It is also worth noting that the streams of events flowing through the system can be persisted to storage. They can then be “replayed” when required, for testing or to re-establish correct system state after failures. This improves the resilience and testability of the system, at the cost of accepting eventual consistency.
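With a log-based bus such as Kafka (mentioned earlier), replay is particularly natural because the broker already retains the event stream. Here is a sketch using the kafka-python client, with a hypothetical topic, group and handler:

```python
from kafka import KafkaConsumer

def apply_event(payload: bytes) -> None:
    # Hypothetical handler: in a real service this would rebuild local state
    print("replaying", payload)

consumer = KafkaConsumer(
    "user-events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",  # start from the oldest retained event
    enable_auto_commit=False,      # do not advance offsets during a replay
    group_id="state-rebuild",      # a fresh group id so consumption starts at the beginning
)

for record in consumer:
    apply_event(record.value)
```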
Conclusions
We all tend to gravitate naturally towards REST or RPC based approaches for the coordination of microservices; they appear to be the obvious answer. That isn’t wrong, and many large systems use this approach successfully, but there are clear benefits to be gained from using a message bus. Not all of the benefits matter for every microservice, so you may even choose to mix the styles. But for any parts of a solution that must support high throughput with low latency and high resilience, the message bus approach should absolutely be the default choice.