samza-dev mailing list archives

From Jagadish Venkatraman
Subject Re: [Multiple producers for one task]
Date Mon, 16 Nov 2015 14:57:08 GMT
Hi Aram,

I assume that you want to route each message to Cassandra, Graphite, etc.
based on its fields.

A single Samza job is an implementation of the StreamTask/WindowableTask
interface. Samza will create multiple instances of your implementation and
assign them to containers.

Single Samza job vs. multiple Samza jobs: if you have multiple jobs, they
can be stopped, started, managed, and maintained independently. It's still
possible to write to all of these systems from a single job, and you can
scale out using multiple containers.

Sync vs. async producers: it depends entirely on how you implement your
producer. Do you care about ordering? I.e., within the same partition, do
you want to preserve ordering in your writes to Cassandra/Graphite? Keep in
mind that an instance of your producer is shared across all task instances
in the container.
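
To make the ordering question concrete, here is a minimal sketch of an
asynchronous writer that still preserves order within a partition. It is
plain Java and does not use the Samza API: PartitionOrderedWriter, the
Consumer-based sink, and the thread pool size are all assumptions made for
illustration. Each partition gets its own chain of futures, so writes for
one partition run one after another while different partitions proceed in
parallel. Because the writer may be shared by several task instances, the
chain map is a concurrent map.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

/** Hypothetical helper (not part of Samza): async writes, ordered per partition. */
class PartitionOrderedWriter {
    // Last pending write for each partition; new writes are chained behind it.
    private final Map<Integer, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();
    private final ExecutorService pool;
    private final Consumer<String> sink; // stand-in for a Cassandra/Graphite client

    PartitionOrderedWriter(ExecutorService pool, Consumer<String> sink) {
        this.pool = pool;
        this.sink = sink;
    }

    /** Schedules a write that runs only after earlier writes for the same partition. */
    CompletableFuture<Void> write(int partition, String message) {
        // compute() is atomic per key, so concurrent callers cannot reorder a chain.
        // Note: if one write throws, later writes on that chain do not run.
        return tails.compute(partition, (p, tail) -> {
            CompletableFuture<Void> previous =
                (tail == null) ? CompletableFuture.completedFuture(null) : tail;
            return previous.thenRunAsync(() -> sink.accept(message), pool);
        });
    }

    /** Blocks until every scheduled write has finished. */
    void flush() {
        CompletableFuture.allOf(tails.values().toArray(new CompletableFuture[0])).join();
    }
}

class PartitionOrderedWriterDemo {
    /** Writes five messages to each of two partitions and returns what the sink saw. */
    static List<String> run() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<String> written = Collections.synchronizedList(new ArrayList<>());
        PartitionOrderedWriter writer = new PartitionOrderedWriter(pool, written::add);
        for (int i = 0; i < 5; i++) {
            writer.write(0, "p0-" + i);
            writer.write(1, "p1-" + i);
        }
        writer.flush();
        pool.shutdown();
        return written;
    }

    public static void main(String[] args) {
        // Messages from the two partitions may interleave, but each
        // partition's own messages appear in the order they were written.
        System.out.println(run());
    }
}
```

A fully synchronous producer gives the same per-partition guarantee with
less code, at the cost of blocking the task on every write.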

Having multiple producers for multiple systems seems cleaner to me since
the system characteristics are different.
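
The "one producer per system" option can be sketched as a small router
that picks the output system from a message field. This is plain Java with
stand-in types, not the Samza API: OutputSystem, FieldRouter,
RecordingSystem, and the "type" routing field are hypothetical names chosen
for illustration. In an actual StreamTask, the send step would go through
the MessageCollector with an OutgoingMessageEnvelope addressed to a
SystemStream for the chosen system, which presumes a producer for that
system has been configured.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for one output system (Cassandra, Graphite, ...). */
interface OutputSystem {
    void send(Map<String, String> message);
}

/** Routes each message to the system registered for the value of one field. */
class FieldRouter {
    private final Map<String, OutputSystem> systems = new HashMap<>();
    private final String routingField;

    FieldRouter(String routingField) {
        this.routingField = routingField;
    }

    void register(String fieldValue, OutputSystem system) {
        systems.put(fieldValue, system);
    }

    /** Returns true if a system was registered for this message, false otherwise. */
    boolean route(Map<String, String> message) {
        OutputSystem target = systems.get(message.get(routingField));
        if (target == null) {
            return false; // unknown or missing field value: the caller decides what to do
        }
        target.send(message);
        return true;
    }
}

/** In-memory system used only to make the sketch observable. */
class RecordingSystem implements OutputSystem {
    final List<Map<String, String>> received = new ArrayList<>();

    @Override
    public void send(Map<String, String> message) {
        received.add(message);
    }
}

class FieldRouterDemo {
    static Map<String, String> message(String type, String body) {
        Map<String, String> m = new HashMap<>();
        m.put("type", type); // "type" as the routing field is an assumption
        m.put("body", body);
        return m;
    }

    public static void main(String[] args) {
        RecordingSystem cassandra = new RecordingSystem();
        RecordingSystem graphite = new RecordingSystem();
        FieldRouter router = new FieldRouter("type");
        router.register("event", cassandra);
        router.register("metric", graphite);

        router.route(message("event", "user signed up"));
        router.route(message("metric", "cpu.load 0.7"));
        boolean routed = router.route(message("other", "ignored"));

        System.out.println("cassandra=" + cassandra.received.size()
            + " graphite=" + graphite.received.size() + " lastRouted=" + routed);
    }
}
```

Keeping each system behind its own interface implementation lets the
Cassandra and Graphite writers use different batching, retry, and
sync/async behavior without affecting each other.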


On Mon, Nov 16, 2015 at 3:36 AM, Aram Mkrtchyan wrote:

> Hi guys,
> We're processing JSON data from Kafka using Samza, and we'd like to have a
> single Samza Job that's able to process and produce the messages to
> different systems.
> For example, consume messages from Kafka, and produce them to Cassandra,
> Graphite and other systems, so that the messages are being consumed once.
> We want this because the tasks themselves are very simple, and we don't
> want to have separate Samza jobs for them.
> We'd like someone to compare possible approaches.
>    1. Having Multiple producer systems for one task.
>    2. Having Single producer which has registry of small message handlers,
>    which process messages (synchronous/asynchronous)?
>    3. Having Multiple Jobs is the only valid way of doing it.
> Thanks.

Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University
