6

How exactly do I get an acknowledgement from Kafka once a message has been consumed or processed? It might sound stupid, but is there any way to know the start and end offset of the message for which the ack has been received?

Hild

2 Answers

1

What I have found so far is that 0.8 introduced the following way to choose the offset to start reading from:

kafka.api.OffsetRequest.EarliestTime() finds the beginning of the data in the logs and starts streaming from there, while kafka.api.OffsetRequest.LatestTime() only streams new messages.

Example code: https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
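To make the difference between the two starting points concrete, here is a minimal sketch that does not talk to Kafka at all. It assumes the two calls return the sentinel values -2 and -1, and it models a partition as a range of retained offsets. The class and method names (StartOffsetSketch, resolveStartOffset, offsetsToRead) are hypothetical, not part of the Kafka API.

```java
import java.util.ArrayList;
import java.util.List;

final class StartOffsetSketch {
    // Assumed sentinel values for EarliestTime() and LatestTime() in 0.8.
    static final long EARLIEST_TIME = -2L;
    static final long LATEST_TIME = -1L;

    // Hypothetical helper: turn a sentinel into a concrete starting offset for a
    // log whose retained messages occupy offsets [logStart, logEnd).
    static long resolveStartOffset(long whichTime, long logStart, long logEnd) {
        if (whichTime == EARLIEST_TIME) {
            return logStart; // oldest message still retained
        }
        if (whichTime == LATEST_TIME) {
            return logEnd;   // offset the next new message will receive
        }
        throw new IllegalArgumentException("unsupported time: " + whichTime);
    }

    // Offsets of the already-stored messages a consumer would read first.
    static List<Long> offsetsToRead(long whichTime, long logStart, long logEnd) {
        List<Long> offsets = new ArrayList<>();
        for (long o = resolveStartOffset(whichTime, logStart, logEnd); o < logEnd; o++) {
            offsets.add(o);
        }
        return offsets;
    }

    public static void main(String[] args) {
        System.out.println("earliest: " + offsetsToRead(EARLIEST_TIME, 5, 9));
        System.out.println("latest:   " + offsetsToRead(LATEST_TIME, 5, 9));
    }
}
```

Starting from the earliest time replays everything still retained, whereas starting from the latest time yields nothing until new messages arrive.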

I am still not sure about the acknowledgement part.

Hild
1

Kafka isn't really structured to do this. To understand why, review the design documentation here.

In order to provide an exactly-once acknowledgement, you would need to create some external tracking system for your application, where you explicitly write acknowledgements and implement locks over the transaction IDs in order to ensure things are only ever processed once. The computational cost of implementing such a system is extraordinarily high, and is one of the main reasons that large transactional systems require comparatively exotic hardware and have arguably lower scalability than systems such as Kafka.
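As a rough illustration of what such an external tracking system has to do, here is a single-process sketch in which a worker must claim a transaction ID before processing it and writes an acknowledgement afterwards. Everything in it (AckStoreSketch, tryClaim, acknowledge, processOnce) is a hypothetical name, and the in-memory map stands in for what would have to be a durable, shared store in practice.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class AckStoreSketch {
    enum State { IN_PROGRESS, DONE }

    // In-memory stand-in for a durable store shared by all workers (assumption).
    private final ConcurrentMap<String, State> states = new ConcurrentHashMap<>();

    // Atomically claim a transaction ID; only the first caller gets true.
    boolean tryClaim(String txId) {
        return states.putIfAbsent(txId, State.IN_PROGRESS) == null;
    }

    // Record the explicit acknowledgement once processing has finished.
    void acknowledge(String txId) {
        states.replace(txId, State.IN_PROGRESS, State.DONE);
    }

    boolean isAcknowledged(String txId) {
        return states.get(txId) == State.DONE;
    }

    // Run the work only if nobody has claimed this ID; returns whether it ran.
    // If the work throws, the ID stays IN_PROGRESS and needs manual recovery.
    boolean processOnce(String txId, Runnable work) {
        if (!tryClaim(txId)) {
            return false;
        }
        work.run();
        acknowledge(txId);
        return true;
    }

    public static void main(String[] args) {
        AckStoreSketch store = new AckStoreSketch();
        System.out.println(store.processOnce("tx-1", () -> System.out.println("processing tx-1")));
        System.out.println(store.processOnce("tx-1", () -> System.out.println("processing tx-1 again")));
    }
}
```

Even in this toy form, every message costs an atomic claim plus an acknowledgement write, which hints at where the overhead comes from once the store has to be durable and shared across machines.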

If you do not require strong durability semantics, you can use the groups API to keep rough track of when the last message was read. This ensures that every message is read at least once. Note that since the groups API does not give you the ability to explicitly track your application's own processing logic, your actual processing guarantees are fairly weak in this scenario. Schemes that rely on idempotent processing are common in this environment.
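A minimal sketch of the idempotent-processing idea, assuming each message can be identified by its offset within a single partition: the handler remembers which offsets it has already applied, so a redelivered message changes nothing. The class name IdempotentSinkSketch and its methods are hypothetical, and a real implementation would need to persist the set of applied offsets.

```java
import java.util.HashSet;
import java.util.Set;

final class IdempotentSinkSketch {
    // Offsets already applied; kept in memory here, durable in a real system.
    private final Set<Long> appliedOffsets = new HashSet<>();
    private long total = 0;

    // Apply the message once; a redelivery of the same offset is ignored.
    // Returns true only when the message actually changed the state.
    boolean apply(long offset, long amount) {
        if (!appliedOffsets.add(offset)) {
            return false; // duplicate delivery under at-least-once semantics
        }
        total += amount;
        return true;
    }

    long total() {
        return total;
    }

    public static void main(String[] args) {
        IdempotentSinkSketch sink = new IdempotentSinkSketch();
        sink.apply(0, 10);
        sink.apply(1, 5);
        sink.apply(0, 10); // redelivered after a restart, has no effect
        System.out.println("total = " + sink.total());
    }
}
```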

Alternatively, you may use the poorly named SimpleConsumer API (it is quite complex to use), which enables you to explicitly track offsets within your application. This is the highest level of processing guarantee that can be achieved through the native Kafka APIs, since it enables you to track your application's own processing of the data that is read from the queue.
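The bookkeeping this implies can be sketched without any Kafka dependency. The example below models one partition as an in-memory list and advances the application's own committed position only after every message in a batch has been handled, so the returned pair is the start and end offset that was effectively acknowledged. The names (ManualOffsetSketch, pollProcessCommit) are hypothetical and this is not the SimpleConsumer API itself.

```java
import java.util.List;
import java.util.function.Consumer;

final class ManualOffsetSketch {
    private final List<String> log;   // stands in for one partition
    private long committedOffset = 0; // next offset to read; stored durably in a real system

    ManualOffsetSketch(List<String> log) {
        this.log = log;
    }

    long committedOffset() {
        return committedOffset;
    }

    // Read up to maxMessages from the committed position, hand each to the
    // handler, and commit only if all of them succeed.
    // Returns {startOffset, endOffsetExclusive} of the acknowledged batch.
    long[] pollProcessCommit(int maxMessages, Consumer<String> handler) {
        long start = committedOffset;
        long end = Math.min(start + maxMessages, log.size());
        for (long offset = start; offset < end; offset++) {
            handler.accept(log.get((int) offset)); // an exception here skips the commit
        }
        committedOffset = end;
        return new long[] {start, end};
    }

    public static void main(String[] args) {
        ManualOffsetSketch consumer = new ManualOffsetSketch(List.of("a", "b", "c", "d", "e"));
        long[] acked = consumer.pollProcessCommit(2, msg -> System.out.println("handled " + msg));
        System.out.println("acked offsets [" + acked[0] + ", " + acked[1] + ")");
    }
}
```

If the handler fails partway through a batch, the committed position does not move and the whole batch is read again on the next call, which is why this approach still pairs well with idempotent handling.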

Ed Kohlwey