flume-user mailing list archives

From Tao Li <litao.bupt...@gmail.com>
Subject [HDFSEventSink] Endless loop when HDFSEventSink.process() throws exception
Date Fri, 17 Apr 2015 16:04:12 GMT
Hi all:

My use case is KafkaChannel + HDFSEventSink.

I found that SinkRunner.PollingRunner calls HDFSEventSink.process() in
a while loop. Suppose a message in Kafka contains dirty data:
HDFSEventSink.process() consumes the message from Kafka and throws an
exception because of the *dirty data*, so the *Kafka offset is not
committed*. The outer loop then calls HDFSEventSink.process() again. Because
the Kafka offset has not changed, HDFSEventSink consumes the same dirty data
*again*. The bad loop *never stops*.

*Is there a mechanism to cover this case?* For example, a maximum retry
count for a single HDFSEventSink.process() call, so that we give up once the
limit is exceeded.
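
To make the idea concrete, here is a minimal sketch of such a retry limit.
It is not Flume code: BoundedRetry, Attempt and Outcome are hypothetical
names invented for illustration, and the sketch assumes the caller decides
what "give up" means (for example, skipping the bad event so the offset can
move forward).

```java
// Hypothetical helper, not part of Flume: counts consecutive failures of
// the same unit of work and tells the caller when to stop retrying.
final class BoundedRetry {

    /** One attempt at the work, e.g. a wrapper around a process() call. */
    interface Attempt {
        void run() throws Exception;
    }

    enum Outcome { SUCCESS, RETRY, GIVE_UP }

    private final int maxRetries;
    private int consecutiveFailures = 0;

    BoundedRetry(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /**
     * Runs one attempt. A success resets the failure counter. A failure
     * returns RETRY until more than maxRetries consecutive failures have
     * been seen, then returns GIVE_UP and resets the counter.
     */
    Outcome run(Attempt attempt) {
        try {
            attempt.run();
            consecutiveFailures = 0;
            return Outcome.SUCCESS;
        } catch (Exception e) {
            consecutiveFailures++;
            if (consecutiveFailures > maxRetries) {
                consecutiveFailures = 0;
                return Outcome.GIVE_UP;
            }
            return Outcome.RETRY;
        }
    }
}
```

With maxRetries = 2, the first attempt plus two retries are allowed, and
the third consecutive failure returns GIVE_UP. The polling loop could then
treat GIVE_UP as a signal to drop or divert the dirty event instead of
rolling back and reading it again.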
