storm-user mailing list archives

From Adrian Landman <adrian.land...@gmail.com>
Subject Re: KafkaSpout offsets
Date Thu, 24 Jul 2014 20:08:29 GMT
Thanks!  That helps clear things up some.  So if forceFromStart is true, the
spout is forced to start at the beginning.  If nothing is changed, it will
try to start from the last committed offset, but if there is no committed
offset, where will it start?  What if there is a saved offset, but we want
to force it to start at the end?  Or what if we want to force a particular
offset, not the last saved one?  I'm guessing that, with public boolean
useStartOffsetTimeIfOffsetOutOfRange = true, if the saved offset turns out
to be out of range, the spout will fall back to the start/beginning offset?

Essentially, I want to be able to specify the following conditions:
Start at the first (oldest) message on the topic: set forceFromStart = true
Start at the last (newest) message on the topic: ?
Start at the last saved offset : Don't change the config defaults
Start at an explicit offset: ? (I don't envision needing to use this, but
just in case)
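
To make the four cases concrete, here is a rough Java model of how I would
guess forceFromStart and startOffsetTime interact. It is a hypothetical
sketch, not the storm-kafka source: the class StartOffsetSketch, its
methods, and the logStart/logEnd parameters are made up for illustration,
and the fallback to the end of the log when nothing is committed is an
assumption that should be checked against the actual code. The only values
taken as given are the -2 (earliest) and -1 (latest) sentinels.

```java
// Hypothetical model, NOT the storm-kafka source. It only mirrors the two
// KafkaConfig fields discussed in this thread to show how they might combine.
final class StartOffsetSketch {
    // Sentinels matching kafka.api.OffsetRequest.EarliestTime() / LatestTime().
    static final long EARLIEST_TIME = -2L;
    static final long LATEST_TIME = -1L;

    /**
     * Picks the offset the spout would start reading from.
     *
     * @param forceFromStart  mirrors KafkaConfig.forceFromStart
     * @param startOffsetTime mirrors KafkaConfig.startOffsetTime
     * @param committedOffset last offset saved by the spout, or null if none
     * @param logStart        oldest offset still on the broker (illustrative)
     * @param logEnd          offset just past the newest message (illustrative)
     */
    static long resolve(boolean forceFromStart, long startOffsetTime,
                        Long committedOffset, long logStart, long logEnd) {
        if (forceFromStart) {
            // Ignore any saved offset and use the configured time instead.
            return lookup(startOffsetTime, logStart, logEnd);
        }
        if (committedOffset != null) {
            // Defaults untouched: resume where the spout left off.
            return committedOffset;
        }
        // ASSUMPTION: with nothing committed, start at the end of the log.
        return lookup(LATEST_TIME, logStart, logEnd);
    }

    // Translates a sentinel time into a concrete offset.
    static long lookup(long time, long logStart, long logEnd) {
        if (time == EARLIEST_TIME) return logStart;
        if (time == LATEST_TIME) return logEnd;
        throw new IllegalArgumentException("timestamp lookup not modelled: " + time);
    }

    public static void main(String[] args) {
        long logStart = 10L, logEnd = 100L;
        System.out.println("oldest: " + resolve(true, EARLIEST_TIME, 42L, logStart, logEnd));
        System.out.println("newest: " + resolve(true, LATEST_TIME, 42L, logStart, logEnd));
        System.out.println("saved:  " + resolve(false, EARLIEST_TIME, 42L, logStart, logEnd));
        System.out.println("none:   " + resolve(false, EARLIEST_TIME, null, logStart, logEnd));
    }
}
```

If this guess is right, "start at the newest message" would mean setting
forceFromStart = true together with startOffsetTime =
kafka.api.OffsetRequest.LatestTime(). The sketch deliberately leaves out the
explicit-offset case, since none of the fields listed seems to cover it.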



On Thu, Jul 24, 2014 at 1:40 PM, Harsha <storm@harsha.io> wrote:

>  Hi Adrian,
>            If you set forceFromStart to true, it calls KafkaApi.Offset to
> get the earliest time, which finds the beginning of the Kafka logs and
> starts streaming from there. By default this is set to false, and it
> makes a request to Kafka to find the last committed offset and
> streams from there. You can control how often the Kafka offset is
> committed with SpoutConfig.stateUpdateIntervalMs; by default it's 2000 ms.
> -Harsha
>
>
>
> On Thu, Jul 24, 2014, at 12:27 PM, Adrian Landman wrote:
>
> In the nathanmarz/storm-contrib project there was a KafkaConfig that had a
> forceOffsetTime.  In our code someone had documented that calling this with
> different values would affect the offsets in the following way:
>
> -2 Will start at the beginning (earliest message) of the topic
> -1 Will start at the end (latest message) of the topic
> -3 Will start where the spout left off
> And anything >0 will start at the specified offset.
>
> In the new project external/storm-kafka there is also a KafkaConfig and I
> see that it exposes
> public boolean forceFromStart = false;
> public long startOffsetTime = kafka.api.OffsetRequest.EarliestTime();
> public long maxOffsetBehind = 100000;
> public boolean useStartOffsetTimeIfOffsetOutOfRange = true;
>
> By default, does this mean the spout will start at the beginning of the
> topic?  What does forceFromStart do?  If we want to start from whatever
> offset the spout was last processing, is there any way to do this?
>
>
>
