incubator-s4-user mailing list archives

From Davide Simoncelli <Davide.Simonce...@neclab.eu>
Subject RE: Adapters and S4 cluster
Date Fri, 08 Jun 2012 12:26:30 GMT
Thank you very much for your clarifications!

On Friday, June 08, 2012 11:37:35 AM Matthieu Morel wrote:
> On 6/8/12 8:47 AM, Davide Simoncelli wrote:
> > Hello Matthieu,
> > 
> > Unfortunately it didn't help me. Let me give an example.
> > 
> > Suppose we have 3 nodes in the adapter cluster and 4 nodes in the S4
> > cluster (2 nodes belong to partition 0 and the other two to partition 1).
> > There are 3 PEs:
> > - FirstPE: it receives keyless events (it is the entry point)
> > - SecondPE: it receives events from FirstPE and sends new events to
> >   ThirdPE
> > - ThirdPE: it receives events from SecondPE and outputs something
> > 
> > The client injects events with the Driver, which uses a TCP/IP connection
> > to talk to the client IO stub of the adapter on port 2334 (the default stub
> > is GenericJsonClientStub). As I understand it, injected events (without a
> > key) are dispatched to all keyless PEs for each PN. But what about the two
> > partitions?
> Let me clarify partitions vs nodes. When you configure the cluster, you
> define the number of partitions. When nodes are started and attached to
> the cluster, they are assigned a partition. There is only 1 partition
> per node. (I don't see how you could get "2 nodes belong to partition 0
> and the other two to partition 1").

I thought a partition was a kind of PN container. What is the point of
having a partition with just one PN?

> Since the deployment is symmetrical, all PEs are deployed on all nodes:
> there are instances of FirstPE in all nodes. Then (from
> http://docs.s4.io/manual/client_adapter.html), [clients] send events to
> the S4 cluster. These events may either be keyed or keyless. In the
> latter case, the corresponding events are dispatched round-robin.

So if the PE is keyless, just one instance exists per PN. Is that right?
 
> > An adapter sends events to S4 clusters. How are events dispatched from
> > the client to adapters in the adapter cluster?
> You use a Driver, as in
> https://github.com/s4/twittertopiccount/blob/master/src/main/java/org/apache/s4/example/twittertopiccount/TwitterFeedListener.java

Yes, I use the same driver. But if there is more than one adapter, should the
client know the address and port of the adapter to connect to?

Regards

- Davide

> > When the FirstPE sends a new event, the dispatcher first chooses the
> > partition and then the PN (in both cases a hash function is used to
> > determine the target). Is that right?
> Almost! The dispatcher sends to the correct partition that it gets from
> the partitioning scheme. Dispatch to the correct PE is done in the
> receiver node.
> 
> Regards,
> 
> Matthieu
> 
> > Thank you for your time
> > 
> > - Davide
> > ________________________________________
> > From: Matthieu Morel [mm@s4.io] on behalf of Matthieu Morel
> > [mmorel@apache.org]
> > Sent: Thursday, June 07, 2012 10:37 AM
> > To: s4-user@incubator.apache.org
> > Subject: Re: Adapters and S4 cluster
> > 
> > Would this page answer your questions?
> > http://docs.s4.io/manual/client_adapter.html
> > 
> > Regards,
> > 
> > Matthieu
> > 
> > On 6/6/12 6:13 PM, Davide Simoncelli wrote:
> >> Hello,
> >> 
> >> I'm implementing an application with S4 and I would like to know how
> >> events are dispatched between adapters and nodes in the cluster when the
> >> client generates streams.
> >> 
> >> Thank you
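
For readers following the thread, the dispatch behaviour discussed above (keyed events go to the partition chosen by the partitioning scheme, keyless events are spread round-robin) can be illustrated with a small sketch. This is not the S4 API: the class name PartitionDispatchSketch and both method names are hypothetical, and hashing the key modulo the partition count is only an assumed partitioning scheme.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical illustration only; none of these names come from S4.
 * Assumes the partitioning scheme is "hash of the key modulo partition count".
 */
public class PartitionDispatchSketch {
    private final int partitionCount;
    // Counter used to rotate keyless events across partitions.
    private final AtomicInteger nextKeyless = new AtomicInteger();

    public PartitionDispatchSketch(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        this.partitionCount = partitionCount;
    }

    /** Keyed event: the same key always maps to the same partition. */
    public int partitionForKey(String key) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    /** Keyless event: partitions are visited in round-robin order. */
    public int partitionForKeyless() {
        return Math.floorMod(nextKeyless.getAndIncrement(), partitionCount);
    }

    public static void main(String[] args) {
        PartitionDispatchSketch dispatcher = new PartitionDispatchSketch(2);
        // Keyless events cycle through partitions 0, 1, 0, 1.
        for (int i = 0; i < 4; i++) {
            System.out.println("keyless event -> partition " + dispatcher.partitionForKeyless());
        }
        // A keyed event lands on one fixed partition for a given key.
        System.out.println("keyed event -> partition " + dispatcher.partitionForKey("user-42"));
    }
}
```

In this sketch the sender only picks the partition; as Matthieu notes, selecting the PE instance happens in the receiving node.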