What’s wrong with Kafka?

3 min read · Dec 7, 2022

Today, Kafka is widely used in software architectures at FAANG and many other companies. It is an essential tool when microservices need to talk to each other asynchronously. But while it works out of the box for most cases, it might not prove to be the ideal tool for you.

Problem: Kafka has a built-in DefaultPartitioner responsible for distributing messages across partitions. Whenever we push a payload to Kafka with a partition_key, it applies the murmur2 hash function to the key and takes the result modulo the number of partitions to calculate the partition number where the message will go.

This works for most use cases, since the hash function makes a best effort to spread messages evenly across all the partitions, but even then, it doesn’t do it perfectly.

There are cases where we want finer control over the load on each partition, and the default partitioner doesn’t give us that out of the box.

Example:

In the e-commerce space, we generate a random OrderId and use it as the partition_key, hoping that Kafka will distribute payloads equally across all the partitions. But when you check the number of messages on each partition, it is usually skewed. The per-partition counts follow a roughly normal-shaped distribution, where some partitions get a higher share of messages while others are starved or under-utilized.

The consumers reading from overloaded partitions start to show lag, while the consumers reading from comparatively free partitions sit idle.

Even if you pass the numbers 1–100 as partition_key to a topic with 100 partitions, where you would expect every partition to get exactly one message, that does NOT happen.
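The 1–100 experiment can be reproduced offline, without a broker. The snippet below is a minimal sketch: murmur2 here is a pure-Python port of the murmur2 variant used by Kafka's Java client as I understand it, and it assumes the keys are serialized as UTF-8 strings (a different serializer would produce different partitions). The helper names partition_for and partition_counts are made up for this illustration.

```python
from collections import Counter

MASK32 = 0xFFFFFFFF


def murmur2(data: bytes) -> int:
    """32-bit murmur2, ported from the variant in Kafka's Java client."""
    length = len(data)
    m = 0x5BD1E995
    h = (0x9747B28C ^ length) & MASK32  # seed XOR length
    body_end = length - length % 4
    # Mix the body four bytes at a time (little-endian words).
    for i in range(0, body_end, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * m) & MASK32
        k ^= k >> 24
        k = (k * m) & MASK32
        h = ((h * m) & MASK32) ^ k
    # Mix the remaining 1 to 3 tail bytes, if any.
    tail = data[body_end:]
    if len(tail) == 3:
        h ^= tail[2] << 16
    if len(tail) >= 2:
        h ^= tail[1] << 8
    if len(tail) >= 1:
        h ^= tail[0]
        h = (h * m) & MASK32
    # Final avalanche.
    h ^= h >> 13
    h = (h * m) & MASK32
    h ^= h >> 15
    return h


def partition_for(key: str, num_partitions: int) -> int:
    # Clear the sign bit, then take the remainder by the partition count.
    return (murmur2(key.encode("utf-8")) & 0x7FFFFFFF) % num_partitions


def partition_counts(keys, num_partitions):
    return Counter(partition_for(k, num_partitions) for k in keys)


counts = partition_counts((str(n) for n in range(1, 101)), 100)
print(f"partitions used: {len(counts)} of 100")
print(f"busiest partition holds {max(counts.values())} messages")
```

Running it shows how many of the 100 partitions receive a message and how many keys pile up on the busiest one.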

That’s because the built-in DefaultPartitioner hashes each key with murmur2 to compute the partition number the payload goes to, and a hash function gives no guarantee that distinct keys land on distinct partitions.

It is good for most cases because the key space is often skewed, and the hash function helps spread the messages across every partition. But if we want finer control, then we have to take the steering wheel into our own hands.

Solution:

There are two possibilities.

Either we write our own Partitioner class, which takes the partition_key and decides which partition the payload should go to.
Or we use a library that accepts another parameter alongside partition_key, such as partition, so that we can calculate and tell Kafka the exact partition the payload should reside in.
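The first option can be sketched in Python as a plain callable. This assumes the (key_bytes, all_partitions, available_partitions) signature that, as far as I know, kafka-python's KafkaProducer and aiokafka's AIOKafkaProducer expect for their partitioner argument. ModuloPartitioner is a hypothetical name, and the sketch assumes keys are numeric IDs serialized as UTF-8 text.

```python
from collections import Counter


class ModuloPartitioner:
    """Maps a numeric key n straight onto partition n % N (no hashing)."""

    def __call__(self, key_bytes, all_partitions, available_partitions):
        if key_bytes is None:
            raise ValueError("ModuloPartitioner needs a numeric key")
        # Assumption: the key was serialized as a UTF-8 decimal string.
        index = int(key_bytes.decode("utf-8")) % len(all_partitions)
        return all_partitions[index]


# Offline check: keys 1..100 over 100 partitions.
partitioner = ModuloPartitioner()
partitions = list(range(100))
spread = Counter(
    partitioner(str(n).encode("utf-8"), partitions, partitions)
    for n in range(1, 101)
)
print(f"messages per partition: min={min(spread.values())}, max={max(spread.values())}")

# Wiring it into a real producer would look roughly like this (not run here):
#   KafkaProducer(bootstrap_servers="localhost:9092", partitioner=ModuloPartitioner())
```

Because the mapping is a plain modulo, consecutive IDs land on consecutive partitions, and the same key still always goes to the same partition.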

As an example of the second option, aiokafka's AIOKafkaProducer provides a partition parameter in its send API to tell Kafka the exact partition number where our payload should go.

This is one way to get exactly the same number of messages on each partition, and the same load on all consumers as well.
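A minimal sketch of that approach: cycle through the partition numbers and pass each one explicitly to send. The function send_round_robin and the RecordingProducer stand-in are hypothetical names for this illustration. The stand-in only mimics the topic, value, key and partition arguments of AIOKafkaProducer.send so that the example runs without a broker.

```python
import asyncio
import itertools
from collections import Counter


async def send_round_robin(producer, topic, payloads, num_partitions):
    """Send each payload to the next partition in turn: 0, 1, ..., N-1, 0, ..."""
    partitions = itertools.cycle(range(num_partitions))
    for payload, partition in zip(payloads, partitions):
        # With a real AIOKafkaProducer this pins the message to `partition`.
        await producer.send(topic, value=payload, partition=partition)


class RecordingProducer:
    """Stand-in for AIOKafkaProducer that just records what was sent."""

    def __init__(self):
        self.sent = []

    async def send(self, topic, value=None, key=None, partition=None):
        self.sent.append((topic, value, partition))


fake = RecordingProducer()
orders = [f"order-{i}".encode("utf-8") for i in range(10)]
asyncio.run(send_round_robin(fake, "orders", orders, num_partitions=5))

per_partition = Counter(partition for _, _, partition in fake.sent)
print(dict(per_partition))
```

One trade-off to keep in mind: once the partition is chosen round-robin rather than from the key, messages with the same key are no longer guaranteed to land on the same partition, so per-key ordering is lost.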

Written by Rajat Jain