zookeeper-dev mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: improving tolerance to network failures
Date Tue, 23 Oct 2018 18:00:51 GMT
Michael,

I wouldn't characterize the current proposal as broken so much as it talks
about connection balancing rather than server balancing. Other than that, I
think I agree with what you are saying.

So we have two folks with a feeling that server balancing from the client
side is significantly better than connection balancing. I had thought that
this would be desirable to defer in the interest of code simplicity. That
may not be the right balance.

The point about hardware upgrades is a very good one.




On Tue, Oct 23, 2018 at 10:21 AM Michael Han <hanm@apache.org> wrote:

> >> Will there be a code effect?
>
> There will be. The current rebalancing algorithm will break unless
> StaticHostProvider.updateServerList is changed to recognize that multiple
> server addresses can belong to the same server. For example, if we add a
> new server through reconfig today, the rebalance kicks in. Under the new
> proposal, if we add a new address to an existing server and
> updateServerList is unchanged, the rebalance will also kick in. It should
> not, because no new real server has been added.
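
A minimal sketch of the distinction described above, assuming each server is
identified by a stable id that maps to its list of addresses. The class and
method names are hypothetical, and this is standalone code, not the actual
StaticHostProvider implementation. It only shows that a rebalance decision
can be keyed on the set of servers rather than the set of addresses.

```java
import java.util.List;
import java.util.Map;

public class ServerSetChange {
    /**
     * Returns true only when the set of server ids differs between the two
     * views. Adding or removing an address of an existing server leaves the
     * key set unchanged, so it does not count as a change.
     * (Hypothetical helper; the server-id-to-addresses map is an assumption.)
     */
    public static boolean serversChanged(Map<String, List<String>> oldServers,
                                         Map<String, List<String>> newServers) {
        return !oldServers.keySet().equals(newServers.keySet());
    }
}
```

With this check, adding a second address to "server.1" would not trigger a
rebalance, while adding "server.2" would.
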
>
> >> My own experience is that production settings typically involve
> >> Zookeeper servers with very consistent hardware where this would not be
> >> an issue.
>
> I think this is generally true, but we should consider the case where a
> user is upgrading hardware. That can take a while, and during that time it
> would be ideal if ZK could balance client connections across an ensemble
> with heterogeneous hardware. As a user myself, I'd like to have this
> feature, especially since it does not seem hard to implement. What Alex
> proposed should work. Another approach might be to assign weights to each
> address (a single server has weight one), which reduces this to a
> weighted random selection problem.
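
The weighted approach can be sketched as follows. This is one reading of the
suggestion, not a settled design: it assumes each server's total weight of
one is split evenly across its addresses, so a server with three interfaces
is no more likely to be chosen than a server with one. All names here are
hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class WeightedAddressPicker {
    /** One candidate address and its selection weight (hypothetical type). */
    public static final class Entry {
        final String address;
        final double weight;

        Entry(String address, double weight) {
            this.address = address;
            this.weight = weight;
        }
    }

    /** Gives every server a total weight of 1.0, split evenly over its addresses. */
    public static List<Entry> perServerWeights(List<List<String>> servers) {
        List<Entry> entries = new ArrayList<>();
        for (List<String> addresses : servers) {
            for (String address : addresses) {
                entries.add(new Entry(address, 1.0 / addresses.size()));
            }
        }
        return entries;
    }

    /** Picks one address with probability proportional to its weight. */
    public static String pick(List<Entry> entries, Random rng) {
        if (entries.isEmpty()) {
            throw new IllegalArgumentException("no addresses to choose from");
        }
        double total = 0.0;
        for (Entry e : entries) {
            total += e.weight;
        }
        double r = rng.nextDouble() * total;
        for (Entry e : entries) {
            r -= e.weight;
            if (r < 0) {
                return e.address;
            }
        }
        // Floating-point rounding can leave r at or just above zero; use the last entry.
        return entries.get(entries.size() - 1).address;
    }
}
```

Unequal per-server weights (for example, during a hardware upgrade) would fit
the same pick routine by changing only how the weights are assigned.
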
>
> Overall, I think this proposal has little impact on the server side; most
> of the impact is on the client side.
>
>
> On Tue, Oct 23, 2018 at 9:34 AM Ted Dunning <tdunning@apache.org> wrote:
>
> > There have been several comments on the document. I will be porting
> > discussions from the document back to the mailing list each day.
> >
> > Alex Shraer makes a good point that with the design as stated, there is
> > no provision for dealing with the rebalancing of client connections
> > during dynamic reconfiguration. I am very curious whether this needs to
> > be addressed in the design since it seems that if connections are
> > redirected, the same connection logic should apply. I suppose the text
> > needs an update, regardless, even if there is no effect. But is there
> > something I missed here? Will there be a code effect?
> >
> > Another comment points out that if you don't have symmetrical hardware
> > for the servers (i.e. more network interfaces on some), then client
> > connections are likely to be more numerous on servers with more network
> > connections. This is undoubtedly true.
> >
> > I have a question, however, about this. Is this situation actually
> > important enough to make the first version of this change? My own
> > experience is that production settings typically involve Zookeeper
> > servers with very consistent hardware where this would not be an issue.
> >
> > What experience do others have, particularly in production situations?
> >
> > On 2018/10/23 02:02:12, Ted Dunning <ted.dunning@gmail.com> wrote:
> > > ...
> > > I have started a collaborative document to work on the design approach.
> > > Once that is judged by the community to be sufficiently mature, I will
> > > move it to a JIRA.
> > >
> > > That document is at
> > >
> > > https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing
> > >
> > > The design document is currently open to the world for commenting so
> > > that anybody can suggest changes or ask questions. I will act as a bit
> > > of a moderator so that the document can remain completely open.
> > >
> >
>
