incubator-s4-user mailing list archives

From Matthieu Morel <mmo...@apache.org>
Subject Re: Adapters and S4 cluster
Date Fri, 08 Jun 2012 09:37:35 GMT
On 6/8/12 8:47 AM, Davide Simoncelli wrote:
> Hello Matthieu,
>
> unfortunately it didn't help me. Let me do an example.
>
> Suppose we have 3 nodes in the adapter cluster and 4 nodes in the S4 cluster (2 nodes
> belongs to partition 0 and others ones to partition 1). There are 3 PEs:
> - FirstPE: it receives keyless events (it is the entry point)
> - SecondPE: it receives events from FirstPE and sends new events to ThirdPE
> - ThirdPE: it receives events from SecondPE and outputs something
>
> The client injects events with the Driver, which uses a TCP/IP connection to talk with
> the client IO stub of the adapter on port 2334 (the default one is GenericJsonClientStub).
> As I understood, injected events (without key) are dispatched to all keyless PEs for each
> PN. But what about the two partitions?

Let me clarify partitions vs nodes. When you configure the cluster, you 
define the number of partitions. When nodes are started and attached to 
the cluster, they are assigned a partition. There is only 1 partition 
per node. (I don't see how you could get "2 nodes belongs to partition 0 
and others ones to partition 1").

Since the deployment is symmetrical, all PEs are deployed on all nodes: 
there are instances of FirstPE on all nodes. Then (from 
http://docs.s4.io/manual/client_adapter.html), [clients] send events to 
the S4 cluster. These events may either be keyed or keyless. In the 
latter case, the corresponding events are dispatched round-robin.
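
To make the keyed versus keyless distinction concrete, here is a minimal 
sketch in plain Java. It is not the S4 API: the class PartitionChooser and 
its methods are hypothetical names, and hashing the key modulo the number 
of partitions is only an assumption about what a typical partitioning 
scheme does.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; this is not an S4 class.
class PartitionChooser {
    private final int partitionCount;
    private final AtomicInteger next = new AtomicInteger();

    PartitionChooser(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        this.partitionCount = partitionCount;
    }

    // Keyless event: rotate through the partitions (round-robin).
    int forKeyless() {
        return Math.floorMod(next.getAndIncrement(), partitionCount);
    }

    // Keyed event: the same key always maps to the same partition.
    // Hash-modulo is an assumed scheme, chosen for illustration.
    int forKey(String key) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        PartitionChooser chooser = new PartitionChooser(2);
        System.out.println("keyless -> " + chooser.forKeyless());
        System.out.println("keyless -> " + chooser.forKeyless());
        System.out.println("keyed   -> " + chooser.forKey("someKey"));
    }
}
```

With 2 partitions, successive keyless events alternate between partitions 
0 and 1, while a keyed event lands on whichever partition its key hashes to.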
>
> An adapter sends events to S4 clusters. How are events dispatched from the client
> to adapters in the adapter cluster?

You use a Driver, as in 
https://github.com/s4/twittertopiccount/blob/master/src/main/java/org/apache/s4/example/twittertopiccount/TwitterFeedListener.java

>
> When the FirstPE sends a new event, the dispatcher first chooses the partition and then
> the PN (in both cases a hash function is used to know the target). Is it right?

Almost! The dispatcher sends to the correct partition that it gets from 
the partitioning scheme. Dispatch to the correct PE is done in the 
receiver node.
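
As a rough sketch of the receiving side, the plain Java below shows a node 
handing each incoming keyed event to a per-key PE instance, creating the 
instance on first use. The names ReceiverNode and CountingPE are 
hypothetical, and keeping one instance per key value is an assumption made 
for illustration rather than a description of the actual S4 code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this is not an S4 class.
class ReceiverNode {

    // Stand-in for a PE: it just counts the events it has received.
    static final class CountingPE {
        int seen;

        void onEvent(String payload) {
            seen++;
        }
    }

    // One PE instance per key value (assumed for this sketch).
    private final Map<String, CountingPE> instancesByKey = new HashMap<>();

    // Called once the sender has already routed the event to this node's
    // partition; the node then picks (or creates) the PE instance.
    CountingPE deliver(String key, String payload) {
        CountingPE pe = instancesByKey.computeIfAbsent(key, k -> new CountingPE());
        pe.onEvent(payload);
        return pe;
    }

    int instanceCount() {
        return instancesByKey.size();
    }

    public static void main(String[] args) {
        ReceiverNode node = new ReceiverNode();
        node.deliver("a", "first");
        node.deliver("a", "second");
        node.deliver("b", "third");
        System.out.println("PE instances on this node: " + node.instanceCount());
    }
}
```

The point of the split is that the sender only needs to know the 
partition; which PE instance handles the event is a local decision on the 
node that receives it.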

Regards,

Matthieu

>
> Thank you for your time
>
> - Davide
> ________________________________________
> From: Matthieu Morel [mm@s4.io] on behalf of Matthieu Morel [mmorel@apache.org]
> Sent: Thursday, June 07, 2012 10:37 AM
> To: s4-user@incubator.apache.org
> Subject: Re: Adapters and S4 cluster
>
> Would this page answer your questions?
> http://docs.s4.io/manual/client_adapter.html
>
> Regards,
>
> Matthieu
>
> On 6/6/12 6:13 PM, Davide Simoncelli wrote:
>> Hello,
>>
>> I'm implementing an application with S4 and I would like to know how
>> events are dispatched between adapters and nodes in the cluster when the
>> client generates streams.
>>
>> Thank you
>>
>>
>

