hbase-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Questions about HBase Cluster Replication
Date Thu, 03 Mar 2011 22:04:14 GMT
Here it is: https://issues.apache.org/jira/browse/HBASE-3597

I think we'll have the opportunity to test out cluster replication and provide 
feedback soon.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Jean-Daniel Cryans <jdcryans@apache.org>
> To: user@hbase.apache.org
> Sent: Thu, March 3, 2011 3:41:04 PM
> Subject: Re: Questions about HBase Cluster Replication
> 
> Yep, it just occurred to me while answering you :) I'm the only dev
> who worked on the replication stuff, any contribution or just testing
> out the software is really appreciated.
> 
> J-D
> 
> On Thu, Mar 3, 2011 at 12:10 PM, Otis Gospodnetic
> <otis_gospodnetic@yahoo.com> wrote:
> > Aha, so the fact that the age doesn't change when replication keeps retrying
> > is really a bug?
> >
> >  Otis
> >
> >
> >
> >
> > ----- Original Message ----
> >> From: Jean-Daniel Cryans <jdcryans@apache.org>
> >> To: user@hbase.apache.org
> >> Sent: Thu, March 3, 2011 2:17:08 PM
> >> Subject: Re: Questions about HBase Cluster Replication
> >>
> >> No it's the age in ms:
> >>
> >> ageOfLastAppliedOp.set(System.currentTimeMillis() - timestamp);
> >>
> >> And the timestamp is the one given to the HLogEdit, not the timestamp
> >> of the cell.
> >>
> >>  J-D
> >>
> >> On Thu, Mar 3, 2011 at 11:13 AM, Otis Gospodnetic
> >> <otis_gospodnetic@yahoo.com> wrote:
> >> > Is that really the *age*, or really the *timestamp* of last successful log
> >> > shipment?
> >> > If so, one could calculate the real age with age = now() -
> >> > ageOfLastShippedOnWhichIsReallyTimestamp. And that would be useful to
> >> > have.
> >> >
> >> > Otis
> >> >  ----
> >> > Sematext :: http://sematext.com/ :: Solr -  Lucene - Nutch
> >> > Lucene  ecosystem search :: http://search-lucene.com/
> >> >
> >> >
> >>  >
> >> > ----- Original  Message ----
> >> >> From:  Jean-Daniel Cryans <jdcryans@apache.org>
> >>  >> To: user@hbase.apache.org
> >>  >> Sent:  Thu, March 3, 2011 12:21:09 PM
> >> >> Subject: Re:  Questions about HBase  Cluster Replication
> >> >>
> >> >> It's a work in progress, that information is currently published by
> >> >> every region server in the master cluster (since it's push
> >> >> replication, not pull) through JMX under the name
> >> >> "ageOfLastShippedOp". It's really not perfect though, since if it
> >> >> fails to replicate and starts retrying then the age won't change but
> >> >> the actual lag will go up. Also it will have to be revisited when we
> >> >> add multiple slaves since you don't really want to publish the same
> >> >> metric for multiple slaves... it really wouldn't work.
> >>  >>
> >> >> J-D
> >> >>
> >> >> On Thu, Mar 3, 2011 at 9:10 AM, Bill Graham <billgraham@gmail.com> wrote:
> >> >> > Actually, how far behind replication is w.r.t. edit logs is different
> >> >> > than how out of sync they are, but you get the idea.
> >> >> >
> >> >> > On Thu, Mar 3, 2011 at 9:07 AM, Bill Graham <billgraham@gmail.com> wrote:
> >> >> >> One more question for the FAQ:
> >> >> >>
> >> >> >> 6. Is it possible for an admin to tell just how out of sync the two
> >> >> >> clusters are? Something like Seconds_Behind_Master in MySQL's SHOW
> >> >> >> SLAVE STATUS?
> >> >> >>
> >> >>  >>
> >> >> >> On Wed, Mar 2, 2011 at 9:32 PM, Jean-Daniel Cryans
> >> >> >> <jdcryans@apache.org> wrote:
> >> >> >>> Although, I would add that this feature is still experimental so who
> >> >> >>> knows :)
> >> >> >>>
> >> >> >>> I think the worst that happened to us was that replication was broken
> >> >> >>> (see the jira where if the master loses its zk session with the slave
> >> >> >>> zk ensemble, it requires a HBase restart on the master side) for a few
> >> >> >>> days because of maintenance of the link between the two datacenters
> >> >> >>> which took more than a minute. When we finally did restart the master
> >> >> >>> cluster, it had to process about 2TBs of HLogs... those ICVs can
> >> >> >>> really generate a lot of data!
> >> >>  >>>
> >> >>  >>> J-D
> >> >> >>>
> >> >> >>> On Wed, Mar 2, 2011 at 9:25 PM, Jean-Daniel Cryans
> >> >> >>> <jdcryans@apache.org> wrote:
> >> >> >>>>> 5. If one is adding replication on the *production* Master cluster,
> >> >> >>>>> what's the worst thing that can happen to this Master cluster?
> >> >> >>>>> Nothing scary other than changing configs + interruption during a
> >> >> >>>>> restart? (which is currently still bad because of region
> >> >> >>>>> assignments?)
> >> >> >>>>>
> >> >>   >>>>
> >> >> >>>> The replication code is pretty much encapsulated from the rest of the
> >> >> >>>> region server code, it won't mess with your Puts or change your
> >> >> >>>> birthday date.
> >> >> >>>>
> >> >> >>>> With 0.90 the regions are reassigned where they were before, so it's
> >> >> >>>> really just the block cache that gets screwed.
> >>  >> >>>>
> >> >> >>>>    J-D
> >> >> >>>>
> >> >>  >>>
> >> >>  >>
> >> >>  >
> >> >>
> >> >
> >>
> >
> 
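
To make the stale-age problem discussed in the thread concrete, here is a minimal sketch. It is not HBase source code. The class AgeMetricSketch and its methods are hypothetical names made up for illustration; only the formula "current time minus the edit's timestamp" and the metric name come from the thread. The sketch assumes the published age is set only when a shipment succeeds, and contrasts it with a lag computed at read time from the oldest edit still waiting to ship.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration of the metric behaviour described in the thread. */
final class AgeMetricSketch {
    // Published age in ms. Updated only when a shipment succeeds (assumption).
    private final AtomicLong ageOfLastShippedOp = new AtomicLong();
    // Timestamp (ms) of the oldest edit still waiting to ship; 0 means none.
    private final AtomicLong oldestPendingEditTs = new AtomicLong();

    /** Records an edit entering the queue; keeps the oldest pending timestamp. */
    void editQueued(long editTimestampMs) {
        oldestPendingEditTs.compareAndSet(0L, editTimestampMs);
    }

    /** Called on success: age = now - timestamp given to the log edit. */
    void shipSucceeded(long editTimestampMs, long nowMs) {
        ageOfLastShippedOp.set(nowMs - editTimestampMs);
        oldestPendingEditTs.set(0L);
    }

    /** The value a monitoring client would read; frozen while retries fail. */
    long publishedAge() {
        return ageOfLastShippedOp.get();
    }

    /** Lag computed at read time, so it keeps growing during retries. */
    long currentLag(long nowMs) {
        long pending = oldestPendingEditTs.get();
        return pending == 0L ? 0L : nowMs - pending;
    }

    public static void main(String[] args) {
        AgeMetricSketch m = new AgeMetricSketch();
        m.editQueued(1_000L);
        m.shipSucceeded(1_000L, 1_050L);   // shipped 50 ms after the edit
        m.editQueued(2_000L);              // next edit: every retry fails
        long now = 62_000L;                // one minute later, still unshipped
        System.out.println("published age (ms): " + m.publishedAge());
        System.out.println("current lag (ms):   " + m.currentLag(now));
    }
}
```

The clock is passed in explicitly so the behaviour is deterministic. A real region server would read the system clock and would have to track pending edits per log queue, which this sketch ignores.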
