incubator-s4-user mailing list archives

From Matthieu Morel <mmo...@apache.org>
Subject Re: S4 piper questions
Date Tue, 07 Aug 2012 15:12:14 GMT
On 8/6/12 9:40 AM, Davide Simoncelli wrote:
> Hello,
>
> I have a few questions related to the new S4 piper version.
>
> - When I create a KeyFinder, what should the get method return? A list of key names or values?

The KeyFinder defines how the key is determined from a given event. Its
get method returns a list of values extracted from the event, not key
names. In simple cases that list contains a single value.

Example:
Event E1
- field1 "key1"
- field2 "blah I'm not part of the key"

--> you'll probably implement a KeyFinder that returns "key1" from E1
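
As a minimal sketch of the example above: the E1 class and the
SimpleKeyFinder interface below are hypothetical stand-ins written only
for this illustration, not the actual S4 classes, so check the real
KeyFinder interface in the S4 sources before copying the signature.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the event E1 from the example (not an S4 class).
class E1 {
    final String field1; // the part of the event used as the key
    final String field2; // payload, not part of the key

    E1(String field1, String field2) {
        this.field1 = field1;
        this.field2 = field2;
    }
}

// Hypothetical stand-in for the key finder contract: given an event,
// return the list of values that make up its key.
interface SimpleKeyFinder<T> {
    List<String> get(T event);
}

// Returns the value of field1 as a one-element key, ignoring field2.
class Field1KeyFinder implements SimpleKeyFinder<E1> {
    @Override
    public List<String> get(E1 event) {
        return Collections.singletonList(event.field1);
    }
}
```

Calling get on an E1 built with "key1" and "blah I'm not part of the key"
yields a list holding only the value "key1", not the field name.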

>
> - What I want to do is dispatch every event on key values (even those injected by the adapter). Should I declare a key finder instance on the input stream (with method createInputStream())?
> And should I declare another one in the AdapterApp (with method setKeyFinder())?

The partitioning in S4 is specified on the sender's side.

By default, if you use an adapter and you don't specify a keyfinder,
events are sent to the remote cluster in a round-robin fashion.

If you want to partition directly from the adapter, use a keyfinder when
you create the output stream (see the corresponding overloaded method).
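
To illustrate the difference between the two sender-side behaviours,
here is a conceptual sketch. PartitionChooser is a hypothetical class
made up for this illustration, and hashing the key values modulo the
partition count is an assumption about how keyed dispatch could work,
not S4's actual implementation.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper showing how a sender could pick a target partition.
class PartitionChooser {
    private final int partitionCount;
    private final AtomicInteger next = new AtomicInteger();

    PartitionChooser(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        this.partitionCount = partitionCount;
    }

    // No keyfinder: rotate over the partitions, whatever the event contains.
    int roundRobin() {
        return Math.floorMod(next.getAndIncrement(), partitionCount);
    }

    // With a keyfinder: equal key values always map to the same partition.
    int byKey(List<String> keyValues) {
        return Math.floorMod(keyValues.hashCode(), partitionCount);
    }
}
```

With round robin, two events carrying the same key can land on different
nodes. With keyed dispatch they always reach the same partition, which
is what you need if a PE instance must see every event for its key.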


>
> - What is the meaning of singleton PE?

A singleton PE is eagerly created, and there is only one instance,
regardless of the key. It can be useful for producer applications that
don't receive events, but we may refine the usage. Thanks for pointing
this out.

Note that in some places in the examples (e.g. the Twitter example) we
specify singleton for keyless PEs, but that is actually not needed, and
we'll have to update the code.


Hope this helps,

Matthieu

