samza-dev mailing list archives

From Jake Maes <jma...@apache.org>
Subject Re: Samza Standalone from multiple machines
Date Wed, 18 Oct 2017 15:57:51 GMT
Hey Giridhar,

Boris would have more knowledge on this, but I'll offer a couple of pointers:

Running the "Hello Samza" application on multiple hosts should just be a
matter of changing the "job.coordinator.zk.connect" property in the config
<https://github.com/apache/samza-hello-samza/blob/master/src/main/config/wikipedia-application-local-runner.properties#L22>
to an external/shared ZK instance and starting more processors. The ZK Job
Coordinator will automatically utilize more processors as they are started.
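
As an illustration, the change might look like the fragment below. Only the
property name comes from the message above; the host names and port are
placeholders, so substitute the address of your own shared ZooKeeper ensemble.

```properties
# Hypothetical hosts: replace with your shared ZooKeeper connect string.
# Every processor, on every machine, must use the same value.
job.coordinator.zk.connect=zk1.example.com:2181,zk2.example.com:2181
```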

However, if I remember correctly, the Wikipedia stream has only a single
partition, and a single partition can't be processed by more than one
processor. You would probably want to change the input to a Kafka topic with
at least as many partitions as the number of processors you intend to run.
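
To see why the partition count matters, here is a minimal sketch that assumes
partitions are handed out round-robin and that each partition has exactly one
owner. It is a simplification for illustration, not Samza's actual grouping
logic, and the function and processor names are made up.

```python
def assign_partitions(num_partitions, processors):
    """Hand out partitions round-robin; each partition gets exactly one owner.

    Illustrative only: this is NOT Samza's real task/partition grouper.
    """
    assignment = {p: [] for p in processors}
    for partition in range(num_partitions):
        owner = processors[partition % len(processors)]
        assignment[owner].append(partition)
    return assignment


processors = ["processor-0", "processor-1", "processor-2"]

# One partition: only one processor receives work, the others stay idle.
print(assign_partitions(1, processors))

# Three partitions: every processor receives one partition.
print(assign_partitions(3, processors))
```

Under these assumptions, adding processors beyond the number of input
partitions adds no parallelism.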

More details about how standalone works can be found here:
http://samza.apache.org/startup/preview/#flexible-deployment-model

-Jake

On Wed, Oct 18, 2017 at 4:06 AM, Giridhar Addepalli <giridhar1202@gmail.com>
wrote:

> Hi,
>
> I am new to Samza.
> We are evaluating using Samza in Standalone mode.
>
> Was able to run "Hello Samza" using Zookeeper Deployment Model, on a single
> machine:
> http://samza.apache.org/learn/tutorials/latest/hello-samza-high-level-zk.html
>
> We are wondering how to run a Samza job using the Zookeeper Deployment Model
> from multiple machines.
>
> Please point to relevant documentation or suggestions.
>
> Thanks,
> Giridhar.
>
