Manual message acknowledgment in Apache Kafka

Before we continue with this article, make sure to read my previous article, ‘Async REST services with messaging’, since I will only cover manual message acknowledgment and its components here.

OK, so let’s dive straight into the topic. As you know, Kafka has a way for a consumer to acknowledge a message, i.e. to tell the broker that it has been read. This can be done automatically or manually. Doing it automatically definitely lowers our workload, but a message can end up marked as read even though processing it failed. To avoid this kind of problem we can use manual message acknowledgment. Keep in mind that this process is somewhat complicated and should only be used if you’re familiar with it.

Let’s look at an example.

I have created a Kafka topic with 3 partitions, named ‘employee-topic’.

Afterwards, I created the consumer and the producer (I have used Node.js to demonstrate).

Producer.js

Consumer.js

When we autoCommit messages, we can either give Kafka a threshold, telling it to commit after a specific number of messages have been processed, or set a time interval after which the status of the messages processed so far is committed. To set a threshold we can use the ‘autoCommitThreshold’ property, and to set a time interval we can use the ‘autoCommitInterval’ property.
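As a sketch, the two tuning knobs look like this (property names from the kafkajs consumer API; the concrete values are illustrative assumptions). The options are passed to `consumer.run(...)` alongside the message handler:

```javascript
// Illustrative auto-commit tuning; the numbers are assumptions.
const autoCommitOptions = {
  autoCommit: true,
  autoCommitThreshold: 100,  // commit once 100 messages have been resolved...
  autoCommitInterval: 5000,  // ...or every 5 seconds, whichever comes first
};

// Usage (sketch):
// consumer.run({ ...autoCommitOptions, eachMessage: async ({ message }) => { /* process */ } });

module.exports = autoCommitOptions;
```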

But in this case, we’re going to set the autoCommit property to ‘false’, since we’re going to commit messages manually. With manual message commit, or manual message acknowledgment, we have the liberty of processing every single message, as shown in line 12 in consumer.js, or of processing batch-wise using the ‘eachBatch’ property. Keep in mind that the ‘eachMessage’ property does not fetch messages one at a time; it still receives a batch of messages from the broker and hands them to your handler one by one.

If you may recall, I mentioned a concept called ‘Offset’. This is simply a count of how far a consumer has read in a partition. If there are 5 messages and the 4th has been processed, its status goes to Kafka and the offset becomes 4; the same goes for the 5th message.

The above screenshot displays the initial stage of manual message acknowledgment. Since we are manually acknowledging each message, we must send the completion status to the server, as shown in lines 24–28. I have purposely slowed processing down to 1 second, which means each message is processed after a 1-second delay (line 22).
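The commit-and-delay logic described above might look like this as a sketch. This assumes kafkajs and the `consumer` object from Consumer.js; committing the message’s own offset here mirrors this first attempt, and the article goes on to show why that offset choice causes trouble:

```javascript
// Sketch of the eachMessage handler with manual commit and a 1-second delay.
// `consumer` is assumed to be the kafkajs consumer created in Consumer.js.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const eachMessage = async ({ topic, partition, message }) => {
  await sleep(1000); // purposely slow processing to 1 second per message
  console.log(`processed offset ${message.offset}`);
  // Send the completion status (the acknowledgment) to Kafka manually.
  // Note: this commits the message's own offset; the article shows next
  // why the committed offset should actually be offset + 1.
  await consumer.commitOffsets([
    { topic, partition, offset: message.offset },
  ]);
};

module.exports = { eachMessage, sleep };
```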

Output:

But some messages might have issues. To simulate that scenario, I have purposefully thrown an error at a specific offset; in this case, I have programmed it to throw at offset 25.
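The simulated failure can be sketched as a pure handler body. `processMessage` is a hypothetical stand-in for the work done inside `eachMessage`, not the author’s actual code:

```javascript
// Hypothetical processing step that fails at one specific offset.
// kafkajs delivers offsets as strings, hence the Number() conversion.
function processMessage(message) {
  if (Number(message.offset) === 25) {
    throw new Error('simulated failure at offset 25');
  }
  return message.value; // normal processing result
}
```

Until that throw stops happening, the message at offset 25 is never acknowledged.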

Output:

As programmed, it threw an error after reading offset 25. Since there’s an error, it will retry repeatedly until the problem is solved. But if you look closely, it has re-read the previously completed messages as well, which results in slow performance. What we actually need is to retry only that particular message. This happens because the committed offset is not the last message that was read, but the next message to be read. So we must tell Kafka where to resume by adding 1 to the offset we commit.
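That “add 1” step is easy to get wrong because kafkajs exposes offsets as strings. A small hypothetical helper makes it explicit:

```javascript
// Commit the NEXT offset to read, not the one just processed.
// kafkajs message offsets are strings, so convert before adding 1.
function nextOffset(offset) {
  return (Number(offset) + 1).toString();
}

// Sketch of the corrected commit inside eachMessage (kafkajs assumed):
// await consumer.commitOffsets([
//   { topic, partition, offset: nextOffset(message.offset) },
// ]);
```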

Output:

As you can see, after the error at offset 25, it retries only that specific message. We can also set the number of retries if necessary. In the meantime, the other messages are queuing up. What do you think happens when we fix the problem? As soon as we fix it, Kafka will automatically distribute the queued messages among the active consumers. Why did I say active consumers? Let’s move on to that next.

Is it practical to feed a dead fish? In the same way, Kafka can’t send any messages to a terminated consumer. It has to know whether the consumer is still active or not; if a consumer is inactive, Kafka will remove it from the cluster’s consumer group. Usually this scenario takes place when message processing takes a long time, during which Kafka expects some kind of update from the consumer. There are multiple ways to stop Kafka from removing a consumer. The most practical one is the ‘sessionTimeout’ property, which is literally a session timeout that can be set to 30 seconds or whatever suits your application. Within this window, the consumer has to tell Kafka that it’s still alive and can handle messages. To do this, we can use something called a ‘heartbeat’. The heartbeat is like a beacon that tells Kafka the consumer is still alive; sending it within the session timeout prevents the consumer from being removed.

In conclusion, manual message acknowledgment is challenging; we must think through all potential outcomes. That is why, as I previously indicated, you should only use this if you are absolutely certain of your ability to handle it correctly. However, the upside is that, at the end of the day, you get to acknowledge every single message.

Dinesh, K., 2022. Guide to Kafka — part 02 (with Demo) | manual ack, heartbeat, batch process. s.l.: YouTube.
