hbase-dev mailing list archives

From n keywal <nkey...@gmail.com>
Subject Re: Hbase Assignments in trunk.
Date Tue, 11 Sep 2012 07:52:53 GMT
Region assignment in ZK could be interesting, as could having the regionserver
state available. This would require some work in ZK, I fear (ZOOKEEPER-1147).

However, persisting data in ZK is dangerous: the cluster state ends up split
across two sources of truth, which makes the whole thing complicated to
manage (I'm thinking of snapshots, for example). It should be possible to
restart the cluster with an empty ZK, with HBase/HDFS as the single
persistent store.

And making ZK 3.4+ mandatory for 0.98 seems like a good thing to do as well :-).
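
To illustrate the transactional point made in the quoted message below, here is a toy sketch in plain Java. It is an in-memory model, not the ZooKeeper client: `ToyZnodeStore`, `create` and `multiCreate` are hypothetical names, and the request counter only stands in for client/server round trips. The async style is not modeled. With the real 3.4 client, the batched call would be `ZooKeeper.multi(Iterable<Op>)` built from `Op.create(...)`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy in-memory stand-in for a znode tree; NOT the ZooKeeper client API. */
public class ToyZnodeStore {
    private final Set<String> znodes = new TreeSet<>();
    private int requests = 0; // one per simulated client/server round trip

    /** "for loop sync" style: one request per znode; fails if it already exists. */
    public void create(String path) {
        requests++;
        if (!znodes.add(path)) {
            throw new IllegalStateException("node exists: " + path);
        }
    }

    /** "multi" style: one request for the whole batch, applied all-or-nothing. */
    public void multiCreate(List<String> paths) {
        requests++;
        // Validate against a staged copy so a failure leaves znodes untouched.
        Set<String> staged = new TreeSet<>(znodes);
        for (String p : paths) {
            if (!staged.add(p)) {
                throw new IllegalStateException("node exists: " + p);
            }
        }
        znodes.addAll(paths);
    }

    public int requests() { return requests; }
    public int size() { return znodes.size(); }
    public boolean exists(String path) { return znodes.contains(path); }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("/region/a", "/region/b", "/region/c");

        // Loop of single creates, with "/region/b" already present.
        ToyZnodeStore loopStore = new ToyZnodeStore();
        loopStore.create("/region/b");
        try {
            for (String p : batch) {
                loopStore.create(p);
            }
        } catch (IllegalStateException e) {
            System.out.println("loop failed: " + e.getMessage());
        }
        System.out.println("loop left /region/a behind: " + loopStore.exists("/region/a"));

        // Same starting state, but the batch goes through multiCreate.
        ToyZnodeStore multiStore = new ToyZnodeStore();
        multiStore.create("/region/b");
        try {
            multiStore.multiCreate(batch);
        } catch (IllegalStateException e) {
            System.out.println("multi failed: " + e.getMessage());
        }
        System.out.println("multi left /region/a behind: " + multiStore.exists("/region/a"));

        // On an empty store, the batch costs one request instead of three.
        ToyZnodeStore fresh = new ToyZnodeStore();
        fresh.multiCreate(batch);
        System.out.println("requests for batch of " + fresh.size() + ": " + fresh.requests());
    }
}
```

In this model a failing loop leaves the first znode behind and the caller has to clean up, while a failing batch leaves the store untouched. A successful batch also costs one request rather than one per znode.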

On Tue, Sep 11, 2012 at 4:45 AM, Enis Söztutar <enis@hortonworks.com> wrote:

> +1 on rethinking the assignment + splitting code paths, and using zk as a
> transactional database. Just my 2 cents w/o spending a lot of time on the
> details, but maybe we should stop keeping master and RS in memory metadata,
> but keep region-assignments in zk, and HM and RS just keep a consistent
> in-memory cache.
>
> Enis
>
> On Mon, Sep 10, 2012 at 3:29 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:
>
> > I said a while ago that we should require ZK 3.4.x for 0.96+.
> >
> > Distributed consensus without a "transaction" option always rang a bit
> > weird to me.
> >
> > Maybe switch in 0.98+?
> >
> > -- Lars
> >
> >
> > ----- Original Message -----
> > From: n keywal <nkeywal@gmail.com>
> > To: dev@hbase.apache.org
> > Cc:
> > Sent: Thursday, September 6, 2012 12:53 AM
> > Subject: Re: Hbase Assignments in trunk.
> >
> > On async vs. sync: there are three different ways to write multiple znodes
> > in ZK, with huge performance differences between them:
> >
> > 1) for loop sync
> > 2) for loop async
> > 3) multi
> >
> > Async will be 20 to 100 times faster than sync. Multi will be 2 to 4 times
> > faster than async (that is, 80 to 400 times faster than sync).
> >
> > Multi was not available before ZK 3.4. It has several obvious advantages
> > over async imho: it's faster, it's synchronous and it's a transaction.
> > That usually simplifies the user code.
> >
> > It has other advantages:
> > - async and sync will typically send one or more packets per znode (Nagle
> > is not activated iirc), while there will be only a few packets for all the
> > znodes with multi
> > - you can expect async to write to disk multiple times, while multi
> > should write only once. This is usually better for I/O performance.
> >
> > In a serious recovery situation, with regions moving all over the place,
> > saving disk and network I/O for ZooKeeper is important.
> >
> > Disadvantage: it's new.
> >
> > On Thu, Sep 6, 2012 at 7:49 AM, Stack <stack@duboce.net> wrote:
> >
> > > On Wed, Sep 5, 2012 at 5:17 PM, Jonathan Hsieh <jon@cloudera.com> wrote:
> > > > Here's a link to the pdf/picture.
> > > >
> > > > http://people.apache.org/~jmhsieh/hbase/120905-hbase-assignment.pdf
> > > >
> > >
> > > Pretty picture.  Not a pretty story.
> > >
> > > What you thinking?
> > >
> > > St.Ack
> > >
> >
> >
>
