samza-dev mailing list archives

From Jagadish Venkatraman <jagadish1...@gmail.com>
Subject Re: Per task/topic checkpoint?
Date Sat, 28 Oct 2017 15:11:46 GMT
In Samza, the logical unit of processing (and therefore of checkpointing) is
a task. A commit covers every SSP (SystemStreamPartition) assigned to that
task, so you cannot selectively checkpoint individual SSPs within a task.

However, you can configure how your SSPs are grouped into tasks by choosing a
Grouper. If you want to control checkpointing at the granularity of a single
SSP, you can choose the following grouper, which assigns each SSP to its own
task:
org.apache.samza.container.grouper.stream.GroupBySystemStreamPartitionFactory
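
To make the difference concrete, here is a rough, self-contained sketch of
how the two groupings divide the same inputs. It does not use the Samza API:
the Ssp class, the task names, and the "orders"/"payments" topics are
hypothetical stand-ins chosen for illustration. As far as I recall, the
grouper itself is selected with the job.systemstreampartition.grouper.factory
property, but please confirm that against the config reference.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GrouperSketch {
    // Hypothetical stand-in for Samza's SystemStreamPartition (not the real class).
    static final class Ssp {
        final String system;
        final String stream;
        final int partition;

        Ssp(String system, String stream, int partition) {
            this.system = system;
            this.stream = stream;
            this.partition = partition;
        }

        @Override
        public String toString() {
            return system + "." + stream + "#" + partition;
        }
    }

    // Models grouping by partition id: partition N of every input stream
    // lands in the same task, so one commit covers all of those SSPs.
    static Map<String, List<Ssp>> groupByPartition(List<Ssp> ssps) {
        Map<String, List<Ssp>> tasks = new TreeMap<>();
        for (Ssp ssp : ssps) {
            tasks.computeIfAbsent("Partition " + ssp.partition, k -> new ArrayList<>()).add(ssp);
        }
        return tasks;
    }

    // Models grouping by SSP: every SSP gets its own task, so each task's
    // checkpoint covers exactly one topic partition.
    static Map<String, List<Ssp>> groupBySsp(List<Ssp> ssps) {
        Map<String, List<Ssp>> tasks = new TreeMap<>();
        for (Ssp ssp : ssps) {
            tasks.computeIfAbsent("SSP " + ssp, k -> new ArrayList<>()).add(ssp);
        }
        return tasks;
    }

    public static void main(String[] args) {
        // Two illustrative topics with two partitions each.
        List<Ssp> ssps = new ArrayList<>();
        for (String stream : new String[] {"orders", "payments"}) {
            for (int p = 0; p < 2; p++) {
                ssps.add(new Ssp("kafka", stream, p));
            }
        }
        System.out.println(groupByPartition(ssps));
        System.out.println(groupBySsp(ssps));
    }
}
```

With two topics of two partitions each, the first grouping yields two tasks
that each own one partition of both topics, while the second yields four
tasks that each own a single SSP. The trade-off is more tasks to run, in
exchange for a checkpoint that is scoped to one topic partition.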

Config reference:
https://samza.apache.org/learn/documentation/0.10/jobs/configuration-table.html

What are you trying to do? There may be a simpler way to achieve it.



On Sat, Oct 28, 2017 at 4:09 AM, Gaurav Agarwal <gauravagarwal4@gmail.com>
wrote:

> Hi All,
>
> If I had Samza tasks that were consuming messages from multiple topics, how
> would checkpoint/commit work in that case? On calling
> taskCoordinator.commit(), would the current offsets of all topics be saved
> for the caller task (only the partitions assigned to the caller task)? Is
> there a way to control this behavior more granularly, so that I can request
> Samza to commit the offset for only a given task/topic combination?
>
> --
> thanks,
> gaurav
>



-- 
Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University
