incubator-kafka-users mailing list archives

From Tom Brown <tombrow...@gmail.com>
Subject Re: too many duplicate messages
Date Tue, 27 Nov 2012 04:54:12 GMT
I don't know if this is applicable to your problem, but I've been
thinking about this issue for some time. As I see it, the "true"
position in the topic can be stored by adding an extra number: store
the offset of the current message set, and also store the index of the
"current" message within that message set.

You could end up downloading the same message set from the server
multiple times, but at least you won't replay the duplicates.
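
To make the idea concrete, here is a minimal sketch of a two-part
checkpoint. It is not Kafka client code: the dict-based `log`, the
`Position` class, and the `handle`/`checkpoint` callbacks are all
hypothetical stand-ins, and it assumes the consumer can re-fetch a
message set by its offset and iterate the messages inside it in a
stable order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Position:
    set_offset: int    # offset of the message set (the boundary checkpointed today)
    index_in_set: int  # number of messages in that set already processed


# Hypothetical stand-in for the broker: maps a message-set offset to
# (messages in that set, offset of the next message set).
Log = Dict[int, Tuple[List[str], int]]


def consume(
    log: Log,
    start: Position,
    handle: Callable[[str], None],
    checkpoint: Callable[[Position], None],
    max_messages: Optional[int] = None,
) -> Position:
    """Process messages from `start`, checkpointing after every message."""
    pos = start
    handled = 0
    while pos.set_offset in log:
        # The whole set may be downloaded again after a restart ...
        messages, next_offset = log[pos.set_offset]
        # ... but messages before index_in_set are skipped, not replayed.
        for i in range(pos.index_in_set, len(messages)):
            if max_messages is not None and handled >= max_messages:
                return pos  # simulate being interrupted mid-set
            handle(messages[i])
            handled += 1
            pos = Position(pos.set_offset, i + 1)
            checkpoint(pos)
        # Set exhausted: move to the next set boundary.
        pos = Position(next_offset, 0)
        checkpoint(pos)
    return pos
```

With a two-set log such as `{0: (["a", "b", "c"], 100), 100: (["d", "e"], 180)}`,
stopping after two messages leaves the checkpoint at `Position(0, 2)`.
Resuming from there fetches set 0 again but only handles "c" before moving
on, whereas resuming from the set boundary alone would handle "a" and "b"
a second time.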

--Tom

On Mon, Nov 26, 2012 at 11:01 AM, Mark Grabois <mark.grabois@trendrr.com> wrote:
> correct, it has an ID
>
>
> On Sun, Nov 25, 2012 at 10:48 PM, S Ahmed <sahmed1020@gmail.com> wrote:
>
>> How would you potentially discover a duplicate message? I guess your
>> message has an id/guid?
>>
>>
>> On Fri, Nov 16, 2012 at 8:50 PM, Joel Koshy <jjkoshy.w@gmail.com> wrote:
>>
>> > With compression enabled (as you have) it is possible for a consumer to
>> > see duplicates during rebalance. This is because iteration may be in the
>> > middle of a compressed message set just before a rebalance, but the
>> > checkpointed offsets are at MessageSet boundaries. However, this would
>> > only be during rebalance - i.e., in steady state, when you have no change
>> > in # consumers/# partitions you shouldn't see duplicates.
>> >
>> > Joel
>> >
>> >
>> > On Fri, Nov 16, 2012 at 10:02 AM, Mark Grabois <mark.grabois@trendrr.com> wrote:
>> >
>> > > https://gist.github.com/4089354.git
>> > >
>> > > https://gist.github.com/4089369.git
>> > >
>> > >
>> > >
>> > > On Fri, Nov 16, 2012 at 12:59 PM, Mark Grabois <mark.grabois@trendrr.com> wrote:
>> > >
>> > > > Hi all,
>> > > >
>> > > > I'm encountering a problem where I'm getting far too many duplicate
>> > > > messages being sent to my kafka setup (statically, using
>> > > > broker.list), being picked up by zk-based consumers.
>> > > >
>> > > > I've provided my test classes here and the kafka/zk versions I'm
>> > > > using to run them and my servers:
>> > > >
>> > > > *client side*:
>> > > > producer: git://gist.github.com/4089354.git
>> > > > consumer: git://gist.github.com/4089369.git
>> > > > *jars*:
>> > > > kafka-0.7.2
>> > > >
>> > > > *server side*:
>> > > > 5 kafka servers, zk servers on 3 of those
>> > > > 1 partition per test topic per server
>> > > > *server versions*:
>> > > > kafka-0.7.1
>> > > > zookeeper-3.4.3
>> > > >
>> > > > Any advice would be greatly appreciated.
>> > > >
>> > > > Thank you,
>> > > > Mark
>> > > >
>> > > >
>> > > >
>> > >
>> >
>>
