flume-user mailing list archives

From Johny Rufus <jru...@cloudera.com>
Subject Re: Question Failure Behavior of HDFS Sink
Date Tue, 08 Sep 2015 18:03:25 GMT
Your assumption is correct: duplicates can occur in a failure scenario.

Thanks,
Rufus

On Tue, Sep 8, 2015 at 4:10 AM, Aljoscha Krettek <aljoscha@apache.org>
wrote:

> Hi,
> as I understand it the HDFS sink uses the transaction system to verify
> that all the elements in a transaction are written. This is what I would
> call at-least-once semantics.
>
> My question is now what happens if the writing fails in the middle of
> writing the elements in the transaction. When the transaction is retried
> some of the elements might be written again, i.e. the output contains
> duplicates. Is this assumption correct or is there something in place that
> prevents this from happening?
>
> Thanks for your time,
> Aljoscha
>
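
A minimal sketch of the scenario discussed above, assuming a sink that retries a whole batch after a partial write. This is a plain Python simulation, not Flume code: FlakyFile and deliver are hypothetical names, and the "commit" and "rollback" comments only mirror the transaction behavior described in the thread.

```python
class FlakyFile:
    """Stand-in for an output file: data already appended stays
    in place even if the surrounding batch later fails."""

    def __init__(self, fail_after):
        self.lines = []
        self.fail_after = fail_after  # successful writes before the one failure

    def append(self, event):
        if self.fail_after == 0:
            self.fail_after = -1  # fail only once
            raise IOError("simulated write failure")
        self.lines.append(event)
        if self.fail_after > 0:
            self.fail_after -= 1


def deliver(batch, out, max_retries=3):
    """Retry the whole batch until every event is written (at-least-once)."""
    for _ in range(max_retries):
        try:
            for event in batch:
                out.append(event)
            return  # "commit": the batch would now leave the channel
        except IOError:
            continue  # "rollback": the batch stays queued and is retried
    raise RuntimeError("batch could not be delivered")


out = FlakyFile(fail_after=2)
deliver(["e1", "e2", "e3"], out)
print(out.lines)  # ['e1', 'e2', 'e1', 'e2', 'e3']
```

The first attempt writes e1 and e2 before failing on e3, and the retry writes all three again, so e1 and e2 appear twice. A consumer that needs exactly-once results would have to deduplicate downstream, for example by a unique event ID.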
