camel-dev mailing list archives

From Andrew Block <andy.bl...@gmail.com>
Subject Manual commit strategies of Kafka Component
Date Thu, 03 Sep 2015 03:48:19 GMT
Hello all,

I have been working recently with the Kafka component and have run into issues when the component
commits offsets back to Kafka manually instead of automatically. When the component
was created, there were two commit strategies: automatic and batch. The default behavior
of automatically committing offsets works well. However, when finer-grained control over
commits back to Kafka is desired, the component falls short.

As mentioned in CAMEL-8975, messages can in fact be lost when using the batching
strategy. The issue details a number of scenarios where this can occur,
and I have confirmed the majority of the use cases presented. A new strategy is currently
being developed where the commit of offsets can be deferred and handled later in the processing
pipeline. Each of these strategies has a fundamental flaw due to the Kafka architecture. As
soon as a message is retrieved from Kafka into the consumer and then passed into the
processing pipeline, Kafka assumes the message was delivered, and any commit of the offsets
covers every message received so far, whether or not it has finished processing. This poses
a problem, especially when failures occur.

For example, let's take the new deferred strategy being developed in a recent pull request.
Let's envision a route that consumes messages from Kafka and then executes the commit
of offsets in a processor later in the route. At time A, a message is consumed from Kafka and
begins down the processing pipeline. Just before that first message enters the processor
containing the commit logic, 5 more messages are consumed from Kafka (this would occur if
multiple consumer streams were configured). At time B, the commit of the offsets is performed
when the first message reaches the processor containing the commit action. To Kafka,
the offsets for all 6 messages are now committed. If any exception occurs afterwards, the
other 5 messages are lost: if the route is brought down and restarted, Kafka will begin
to read new messages after the 6th message.
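
To make the time A / time B scenario concrete, here is a minimal sketch in plain Java. It
does not use the Kafka client or the Camel component; SimulatedConsumer and lostOffsets are
hypothetical names made up for illustration, and the only behavior modeled is the one
described above, namely that a commit covers everything fetched so far.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetCommitSketch {

    /** Hypothetical stand-in for a consumer; NOT the real Kafka API. */
    static final class SimulatedConsumer {
        private long position = 0;   // offset of the next message to fetch
        private long committed = 0;  // offset a restarted consumer resumes from

        /** Fetches the next message and returns its offset. */
        long poll() {
            return position++;
        }

        /** Commits everything fetched so far, not just the message in hand. */
        void commit() {
            committed = position;
        }

        long committed() {
            return committed;
        }
    }

    /**
     * Fetches {@code fetched} messages, commits once, and assumes only the first
     * {@code processedBeforeFailure} messages finished processing before a failure.
     * Returns the offsets that are committed but were never fully processed.
     */
    static List<Long> lostOffsets(int fetched, int processedBeforeFailure) {
        SimulatedConsumer consumer = new SimulatedConsumer();
        List<Long> inFlight = new ArrayList<>();
        for (int i = 0; i < fetched; i++) {
            inFlight.add(consumer.poll());   // time A onward: messages enter the route
        }
        consumer.commit();                   // time B: first message reaches the commit step

        List<Long> lost = new ArrayList<>();
        for (long offset : inFlight) {
            boolean processed = offset < processedBeforeFailure;
            boolean skippedOnRestart = offset < consumer.committed();
            if (!processed && skippedOnRestart) {
                lost.add(offset);
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        // 6 messages fetched, only the first one processed before the failure.
        System.out.println("Lost offsets after restart: " + lostOffsets(6, 1));
    }
}
```

With 6 messages fetched and 1 processed, the sketch reports offsets 1 through 5 as lost,
matching the 5 messages in the scenario. If the commit happened only after all 6 had been
processed, the list would be empty.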

This is just one scenario. Careful exception handling logic could certainly mitigate message
loss, but the scenario still highlights additional functionality that needs to be added to
the Kafka consumer for handling manual commits.

If anyone is interested in helping develop solutions to improve the Camel Kafka
component, please reach out. I am very confident the Camel community can work together
to develop a solution to optimize the Camel Kafka component.

Thanks,
Andy

-- 
Andrew Block
