incubator-s4-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Matthieu Morel <mmo...@apache.org>
Subject Re: S4 Communication Layer using ZooKeeper
Date Wed, 03 Oct 2012 07:45:57 GMT
Hi,

Actually currently there is no "retry queue" (we had a prototype 
mechanism but removed it due to some implementation issues). Therefore 
if there is a node failure, you might lose some messages until a 
failover node is reassigned to the corresponding partition, and this 
assignment is notified to sender nodes.

However, even though you might lose some events during failover, you can 
recover previous state through the checkpointing mechanism.

Regards,


Matthieu


On 10/3/12 8:37 AM, Frank Zheng wrote:
> Hi Kishore,
>
> Could you point out which configuration is related to the retry size
> queue, e.g. the name?
>
> Thanks.
> Frank
>
> On Wed, Oct 3, 2012 at 2:30 PM, kishore g <g.kishore@gmail.com
> <mailto:g.kishore@gmail.com>> wrote:
>
>     Yes, you can restart the nodes automatically that previously died.
>     But keep in mind that events may be dropped for that partition until
>     the node is restarted. I think we have some configuration on the
>     retry size queue and you can configure this based on how long it
>     would take to automatically restart the node.Make sure there is
>     enough memory on all nodes to keep the events in queue until the
>     node comes back up.
>
>     Another thing to keep in mind is if multiple nodes fail and you
>     restart them it is probably not guaranteed to pick up the partition
>     that it had picked up earlier.
>
>     On Tue, Oct 2, 2012 at 11:19 PM, Frank Zheng
>     <bearzheng2011@gmail.com <mailto:bearzheng2011@gmail.com>> wrote:
>
>         Hi Kishore,
>
>         This describes very clearly. Thank you a lot!
>
>         Now I have another question.
>         When one active node dies, the standby node tries to grab the lock.
>         What if no standby nodes are allowed? Under this assumption, is
>         it possible to restart the node automatically which dies previously?
>
>         Thanks.
>         Frank
>
>         On Wed, Oct 3, 2012 at 12:51 PM, kishore g <g.kishore@gmail.com
>         <mailto:g.kishore@gmail.com>> wrote:
>
>             At a very high level, this is how cluster management works
>
>             Each s4 cluster has a name space reserved /clustername in
>             zookeeper. There is an initial setup process where one or
>             many znodes are created under /clustername/tasks. When nodes
>             join the cluster they check if some one has already claimed
>             a task by looking at /clustername/process/, if not it grabs
>             the lock by creating an ephemeral node under
>             /clustername/process/. If all tasks are taken it becomes a
>             standby node. When any active node dies, the standby node
>             gets notified and tries to grab the lock.
>
>             We can provide more details, if you can let us know which
>             aspect of cluster management mechanism you are interested in.
>
>             Thanks,
>             Kishore G
>
>             On Tue, Oct 2, 2012 at 9:17 PM, Frank Zheng
>             <bearzheng2011@gmail.com <mailto:bearzheng2011@gmail.com>>
>             wrote:
>
>                 Hi All,
>
>                 I am exploring the cluster management mechanism and
>                 fault tolerance of S4.
>                 I saw that S4 used ZooKeeper in the communication layer.
>                 But it seems not very clear in that pater, " S4:
>                 Distributed Stream Computing Platform".
>                 I tried to search the reference "[15] Communication
>                 layer using ZooKeeper, Yahoo! Inc. Tech. Rep., 2009",
>                 but it is not available.
>                 Could anyone introduce me the role of ZooKeeper in S4,
>                 and the cluster management mechanism in detail?
>
>                 Thanks.
>
>                 Sincerely,
>                 Frank
>
>
>
>
>
>
>
>         --
>         Sincerely,
>         Zheng Yu
>         Mobile:  (852) 60670059
>         Email: bearzheng2011@gmail.com <mailto:bearzheng2011@gmail.com>
>
>
>
>
>
>
>
> --
> Sincerely,
> Zheng Yu
> Mobile:  (852) 60670059
> Email: bearzheng2011@gmail.com <mailto:bearzheng2011@gmail.com>
>
>
>


Mime
View raw message