kafka-users mailing list archives

From Tom Brown <tombrow...@gmail.com>
Subject Re: How do you keep track of offset in a partition
Date Tue, 29 Jan 2013 03:09:29 GMT
Since offsets in Kafka 0.7.x are just byte counts, you cannot know the
number of messages remaining to be processed. However, you can know the
number of bytes remaining: subtract your consumer's offset from each
partition's end offset. Knowing the average message size, you can
use that to make a rough guess as to how many messages are remaining.
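
A minimal sketch of that estimate, in Python for brevity. It does not talk
to Kafka; the function name and parameters are hypothetical, and the
average message size is assumed to be the average number of bytes each
message occupies in the log (payload plus any per-message framing), which
you would have to measure yourself.

```python
def estimate_messages_remaining(consumer_offset, end_offset, avg_message_bytes):
    """Estimate unread messages in one partition from 0.7.x-style byte offsets.

    All three arguments are assumptions supplied by the caller:
    consumer_offset   -- byte offset the consumer has reached
    end_offset        -- latest byte offset reported for the partition
    avg_message_bytes -- measured average size of one message in the log
    Returns (bytes_remaining, estimated_message_count).
    """
    if avg_message_bytes <= 0:
        raise ValueError("avg_message_bytes must be positive")
    # The byte lag is exact; never report a negative lag.
    bytes_remaining = max(end_offset - consumer_offset, 0)
    # The message count is only a rough guess based on the average size.
    return bytes_remaining, bytes_remaining / avg_message_bytes


# Example: 5000 bytes behind at roughly 250 bytes per message.
lag_bytes, lag_messages = estimate_messages_remaining(1000, 6000, 250)
```

Summing the byte lag over all partitions of a topic gives the total backlog;
the message count is only as good as the average size you feed in.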

--Tom

On Mon, Jan 28, 2013 at 8:03 PM, S Ahmed <sahmed1020@gmail.com> wrote:
> Once you have an offset, is it possible to know how many messages there are
> from that point to the end? (Or at least for the particular topic partition
> that you are requesting data from?)
>
> The goal is to see how far behind the consumers are relative to the # of
> messages coming in, etc.
>
> I'm guessing the brokers don't really know how many messages they are
> currently storing?  Or is that what the index is for?
>
>
>
>
> On Mon, Jan 28, 2013 at 8:27 PM, Neha Narkhede <neha.narkhede@gmail.com> wrote:
>
>> Jamie,
>>
>> You need to use the getOffsetsBefore() API to get the earliest/latest
>> offset available on the broker for a particular partition.
>>
>> Thanks,
>> Neha
>>
>>
>> On Mon, Jan 28, 2013 at 5:05 PM, Jamie Wang <jamie.wang@actuate.com> wrote:
>>
>> > Hi,
>> >
>> > We are using version 0.7.2 of Kafka on Windows. I am wondering what the
>> > right way is to fetch data and keep track of the offset in a partition.
>> > For example, I am currently assuming the first message the producer sent
>> > to the broker is at offset 0. So far it seems to be working. Is this a
>> > correct assumption?
>> >
>> > Let's say 2 days later, the first 100 messages on the broker are discarded
>> > because they passed the retention.hours set in the config file. Now what
>> > is the offset I should use to retrieve the first message in the partition?
>> > And let's also say the offset I had for the 80th message is no longer valid.
>> > What is the right way to get the correct offset to fetch in the consumer?
>> >
>> > What is the purpose of the API for getting a list of valid offsets for
>> > all segments in a partition?
>> >
>> > Thanks in advance for your help.
>> >
>> > Jamie
>> >
>>
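
For the retention scenario in the original question, one possible way to
handle a stored offset that may have been aged out is sketched below in
Python. It is only an illustration: the function and its parameters are
hypothetical, it does not call Kafka, and the earliest and latest offsets
are assumed to have been obtained separately (for example via the
getOffsetsBefore() API mentioned in the thread). Falling back to the
earliest offset for a stale position, and to the latest for one past the
end, is a policy choice made here, not something the thread prescribes.

```python
def resolve_fetch_offset(stored_offset, earliest_offset, latest_offset):
    """Pick an offset to fetch from, given the broker's current valid range.

    stored_offset may be None when the consumer has no saved position.
    """
    if stored_offset is None or stored_offset < earliest_offset:
        # No position yet, or the data was discarded by retention:
        # restart from the oldest data still on the broker.
        return earliest_offset
    if stored_offset > latest_offset:
        # The position is past the end of the log: resume from the end.
        return latest_offset
    # The stored position is still inside the valid range.
    return stored_offset


# Example: a saved offset of 8000 is stale once the log starts at 12000.
fetch_from = resolve_fetch_offset(8000, 12000, 50000)
```

A consumer that must not miss new data after a gap would log or count the
skipped byte range before resuming.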
