samza-dev mailing list archives

From Alexander Taggart
Subject Re: Controlling read offset
Date Mon, 10 Nov 2014 18:33:33 GMT
Thanks, Chris.

It's not clear to me from the documentation whether the checkpoint tool can
be used to control the starting offset for a job that has never been run,
and if so, how the properties file would need to be written.  The
checkpoint doc page doesn't show what the properties file looks like, and
trying to dump one with the script doesn't work with hello-samza.

On Mon, Nov 10, 2014 at 11:36 AM, Chris Riccomini wrote:

> Hey Alexander,
> We have a checkpoint offset tool
> (./samza-shell/src/main/bash/, which allows you to read
> and write offsets for all input partitions. This tool will allow you to
> arbitrarily set offsets before a job starts.
> We also support the samza.offset.default and samza.reset.offset
> configurations:
> ion-table.html#streams
> These allow you to specify whether a job should read from the head or tail
> of an input stream when the job first starts.
> We don't currently support a way to change offsets once a job has already
> started. If you can get more specific about your use case,
> Cheers,
> Chris
> On 11/10/14 6:53 AM, "Alexander Taggart" wrote:
> >We're investigating using Samza, and one aspect of our usage would require
> >being able to start a job such that it begins reading from a specified
> >Kafka offset.  If I understand correctly, each job being bound to a
> >specific partition would need to be provided with a specific offset.  Is
> >there any facility for providing such values, either via config or via
> >API?  If not, what might be a good approach to implementing it (e.g., a
> >custom kafka consumer)?
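
For readers following the thread, the two stream-level settings Chris mentions might look roughly like this in a job's properties file. This is only a sketch: the system name `kafka` and the stream name `my-topic` are hypothetical placeholders, and the exact key layout should be checked against the configuration table for the Samza version in use.

```properties
# Hypothetical system ("kafka") and stream ("my-topic") names, for illustration only.

# Where to start reading when no checkpointed offset is available for the stream:
# "oldest" starts from the earliest available message, "upcoming" from new messages only.
systems.kafka.streams.my-topic.samza.offset.default=oldest

# When true, any checkpointed offset for this stream is ignored on container start,
# so the starting position falls back to samza.offset.default above.
systems.kafka.streams.my-topic.samza.reset.offset=true
```

Note that these settings only choose between the two ends of the stream. Starting from an arbitrary offset, which is the original question, would still go through the checkpoint tool described above.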
