hadoop-zookeeper-dev mailing list archives

From "Henry Robinson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (ZOOKEEPER-368) Observers
Date Wed, 08 Jul 2009 09:56:15 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728593#action_12728593 ]

Henry Robinson commented on ZOOKEEPER-368:
------------------------------------------

(Reposting last bit of conversation from ZK-107, more appropriate to this jira)

_"sorry to jump in late here. rather than adding the inform, why don't we just send the PROPOSE
and COMMIT to the Observer as normal, and just make the Observer not send ACKs? That way we
change as little code as possible with minimum overhead. It also makes switching from Observer
to Follower as easy as turning on the ACKs. I also think Observers should be able to issue
proposals. One use case for observers are remote data centers that basically proxy clients
that connect to ZooKeeper. This means an Observer is just a Follower that doesn't vote (ACK)."_

That's definitely one way to do it. The other side to that argument is to keep the message
complexity down, especially if we can envisage use cases with lots of Observers. A connection
to a remote Observer might be more likely to violate the FIFO requirement of ZK connections;
having a single-message protocol makes it easier to deal with this case (not a correctness
issue of Observers, just annoying if PROPOSALs arrive after COMMITs for some reason). I think
that's a marginal issue though. My preference is for INFORM messages as this completely separates
Observer logic from Follower logic and doesn't add much complexity to the code.
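
To make the trade-off concrete, here is a minimal Java sketch of the two options. It is an
illustration only and is not taken from the patch: MsgType, Msg, TwoMessageLearner and
InformLearner are hypothetical names, and the message layout is not ZooKeeper's wire format.
The two-message path has to buffer each PROPOSAL until the matching COMMIT arrives (and has
to cope with a COMMIT that turns up first), while the INFORM path applies each message as
it arrives because the payload travels with it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical message kinds for this sketch; not ZooKeeper's actual packet types. */
enum MsgType { PROPOSAL, COMMIT, INFORM }

/** Hypothetical message: a zxid plus an optional transaction payload. */
final class Msg {
    final MsgType type;
    final long zxid;
    final String txn; // null for COMMIT, which carries no payload

    Msg(MsgType type, long zxid, String txn) {
        this.type = type;
        this.zxid = zxid;
        this.txn = txn;
    }
}

/** Two-message path: buffer PROPOSALs and apply them when the matching COMMIT arrives. */
final class TwoMessageLearner {
    private final Map<Long, String> pending = new HashMap<>();
    final List<String> applied = new ArrayList<>();

    void handle(Msg m) {
        switch (m.type) {
            case PROPOSAL: {
                pending.put(m.zxid, m.txn);
                break;
            }
            case COMMIT: {
                String txn = pending.remove(m.zxid);
                if (txn == null) {
                    // The reordering case: a COMMIT overtook its PROPOSAL.
                    throw new IllegalStateException("COMMIT before PROPOSAL, zxid " + m.zxid);
                }
                applied.add(txn);
                break;
            }
            default:
                throw new IllegalArgumentException("unexpected message type " + m.type);
        }
    }
}

/** Single-message path: INFORM carries the payload, so nothing needs buffering. */
final class InformLearner {
    final List<String> applied = new ArrayList<>();

    void handle(Msg m) {
        if (m.type != MsgType.INFORM) {
            throw new IllegalArgumentException("unexpected message type " + m.type);
        }
        applied.add(m.txn);
    }
}
```

The sketch ignores ACKs, ordering checks on zxids and persistence; it only shows where the
extra state and the extra failure case come from in the two-message variant.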

The Observer also has to take care not to participate in leader elections. I think Observers
also need to announce themselves as such to the Leader, to allow for the case where a Follower
wishes to connect as an Observer temporarily (otherwise the Leader will treat the Observer
as a Follower and count it as part of a quorum). Also, if the Leader can distinguish between
Followers and Observers then it can treat them differently (e.g. by batching multiple
INFORMs or allowing Observers to lag by prioritising Follower traffic).
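
As a rough Java sketch of why the announcement matters: if peers register a role when they
connect, the Leader can count ACKs from voting peers only. PeerRole and AckTracker are
hypothetical names for this illustration and are not part of the patch. For simplicity the
voting set here is whatever has registered as FOLLOWER (the Leader's own vote would be
registered the same way), and a quorum is a strict majority of that set.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical role a peer announces when it connects to the Leader. */
enum PeerRole { FOLLOWER, OBSERVER }

/** Tracks ACKs for one proposal, counting voting peers only. */
final class AckTracker {
    private final Set<Long> voters = new HashSet<>();
    private final Set<Long> observers = new HashSet<>();
    private final Set<Long> acks = new HashSet<>();

    /** Registers (or re-registers) a peer under the role it announced. */
    void register(long sid, PeerRole role) {
        if (role == PeerRole.FOLLOWER) {
            observers.remove(sid);
            voters.add(sid);
        } else {
            // A peer that switches to OBSERVER stops voting; drop any ACK it gave.
            voters.remove(sid);
            acks.remove(sid);
            observers.add(sid);
        }
    }

    /** Records an ACK; ACKs from non-voting peers are ignored. */
    void ack(long sid) {
        if (voters.contains(sid)) {
            acks.add(sid);
        }
    }

    /** True once a strict majority of the voting peers has ACKed. */
    boolean hasQuorum() {
        return acks.size() * 2 > voters.size();
    }
}
```

With three Followers and two Observers registered, ACKs from both Observers plus one
Follower do not form a quorum in this sketch; a second Follower ACK does.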

Keeping Observers as special-case Followers would simplify the code for the observers patch
(I've got a new version nearly ready to submit, just fixing some tests). However, it would
mean that Observers are harder to customise - for example, there's no persistence requirement
for an Observer and so some of the RequestProcessors can be optionally removed or replaced
by something that only asynchronously writes to disk. Keeping them lightweight has been a
goal. My feeling was that I was introducing too many 'if (amObserver()) {...}' branches to
an already fairly hard to follow bit of code (in particular Follower.followLeader). Breaking
the functionality into two separate classes seems to have made things cleaner.
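
A small Java sketch of the idea, under stated assumptions: TxnProcessor, RecordingStage and
Pipelines are hypothetical names and this is not ZooKeeper's RequestProcessor interface.
Each role builds its own chain, so the Observer can leave out the synchronous disk stage
without any role checks inside the stages themselves.

```java
import java.util.List;

/** Hypothetical stand-in for a processing stage; not ZooKeeper's RequestProcessor. */
interface TxnProcessor {
    void process(String txn);
}

/** A stage that records that it ran, then hands the transaction to the next stage. */
final class RecordingStage implements TxnProcessor {
    private final String name;
    private final List<String> trace;
    private final TxnProcessor next; // null marks the end of the chain

    RecordingStage(String name, List<String> trace, TxnProcessor next) {
        this.name = name;
        this.trace = trace;
        this.next = next;
    }

    @Override
    public void process(String txn) {
        trace.add(name + ":" + txn);
        if (next != null) {
            next.process(txn);
        }
    }
}

/** Each role assembles its own chain instead of branching on the role inside a stage. */
final class Pipelines {
    private Pipelines() {}

    /** Follower: sync to disk first, then apply. */
    static TxnProcessor forFollower(List<String> trace) {
        return new RecordingStage("sync", trace, new RecordingStage("apply", trace, null));
    }

    /** Observer: no persistence requirement, so the sync stage is left out. */
    static TxnProcessor forObserver(List<String> trace) {
        return new RecordingStage("apply", trace, null);
    }
}
```

An asynchronous disk writer could be slotted into the Observer chain in place of the
omitted stage; the sketch only shows the chains diverging per role.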

Regarding Observers being able to issue proposals: I don't have a problem with that, and it
should be reasonably easy to add.



> Observers
> ---------
>
>                 Key: ZOOKEEPER-368
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: quorum
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Henry Robinson
>         Attachments: ZOOKEEPER-368.patch, ZOOKEEPER-368.patch
>
>
> Currently, all servers of an ensemble participate actively in reaching agreement on the
> order of ZooKeeper transactions. That is, all followers receive proposals, acknowledge them,
> and receive commit messages from the leader. A leader issues commit messages once it receives
> acknowledgments from a quorum of followers. For cross-colo operation, it would be useful to
> have a third role: observer. Using Paxos terminology, observers are similar to learners. An
> observer does not participate actively in the agreement step of the atomic broadcast protocol.
> Instead, it only commits proposals that have been accepted by some quorum of followers.
> One simple solution to implement observers is to have the leader forward commit messages
> not only to followers but also to observers, and have observers apply transactions according
> to the order followers agreed upon. In the current implementation of the protocol, however,
> commit messages do not carry their corresponding transaction payload because all servers other
> than the leader are followers and followers receive such a payload first through a proposal
> message. Just forwarding commit messages as they currently are to an observer consequently
> is not sufficient. We have a couple of options:
> 1- Include the transaction payload in commit messages to observers;
> 2- Send proposals to observers as well.
> Number 2 is simpler to implement because it doesn't require changing the protocol implementation,
> but it increases traffic slightly. The performance impact due to such an increase might be
> insignificant, though.
> For scalability purposes, we may consider having followers also forward commit messages
> to observers. With this option, observers can connect to followers, and receive messages from
> followers. This choice is important to avoid increasing the load on the leader with the number
> of observers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

