hbase-user mailing list archives

From Neil Yalowitz <neilyalow...@gmail.com>
Subject Re: replication - how do I know the status?
Date Thu, 13 Sep 2012 21:28:46 GMT
This is a great answer; I can see that particular Ganglia metric increased
sharply when the issue began.  Thanks much.

One followup question:

Can a distressed slave cluster cause performance issues on the master
cluster?  It appears our performance problem was occurring on the slave
peer, but the master cluster almost crashed as well.  I'm trying to
determine if that was a coincidence or something more...


On Thu, Sep 13, 2012 at 5:18 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:

> The best metric at the moment is hbase.replication.sizeOfLogQueue
> published through JMX. If you have Ganglia, OpenTSDB or Cacti you can
> graph how many logs per server need to be replicated and then you'll
> have a good idea of how much data needs to be replicated.
> If it goes up to more than 2 per server for a few minutes, you know
> replication is either slowing down or someone is inserting a lot of data.
> J-D
> On Thu, Sep 13, 2012 at 1:18 PM, Neil Yalowitz <neilyalowitz@gmail.com> wrote:
> > Hi all,
> >
> > I'm using HBase replication between two clusters running CDH3u3 and I
> > recently noticed that a replicated column family was "lagging" by more
> > than a day... that is, it required more than 24 hours for a Put to
> > replicate from master to slave.  The root cause of the lag appears to be
> > swapping and other bad behavior.
> >
> > The real question I have is this: how do I know the state of replication
> > at any given time?  Does a large amount of data in /hbase/.logs indicate
> > that replication is falling behind?  What about /hbase/.oldlogs which
> > seems to grow forever?  What red flags should I look for to tell me that
> > there is a problem with replication?
> >
> >
> > Neil Yalowitz
> > neilyalowitz@gmail.com
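
For anyone who wants to turn J-D's rule of thumb (sizeOfLogQueue above 2 per
server for a few minutes) into an alert, here is a rough sketch. It assumes
you have already collected the metric per region server as (timestamp, value)
samples from whatever graphs it for you. The function name, the sample format
and the 300-second reading of "a few minutes" are my own assumptions for
illustration, not anything HBase ships.

```python
from typing import Dict, List, Tuple

# Hypothetical sample format: (unix timestamp in seconds, sizeOfLogQueue value)
Sample = Tuple[float, int]


def lagging_servers(
    samples_by_server: Dict[str, List[Sample]],
    threshold: int = 2,             # "more than 2 per server" from the thread
    min_duration_s: float = 300.0,  # assumed reading of "a few minutes"
) -> List[str]:
    """Return servers whose queue stayed above `threshold` for at least
    `min_duration_s` consecutive seconds."""
    flagged = []
    for server, samples in samples_by_server.items():
        run_start = None  # timestamp where the current above-threshold run began
        for ts, queue_size in sorted(samples):
            if queue_size > threshold:
                if run_start is None:
                    run_start = ts
                if ts - run_start >= min_duration_s:
                    flagged.append(server)
                    break
            else:
                # Queue dropped back to or below the threshold; reset the run.
                run_start = None
    return sorted(flagged)


if __name__ == "__main__":
    demo = {
        "rs1": [(0, 1), (60, 3), (120, 4), (180, 5), (240, 6), (300, 7), (360, 8)],
        "rs2": [(0, 3), (60, 1), (120, 3), (180, 0)],
    }
    print(lagging_servers(demo))
```

A short spike that drains again (rs2 in the demo data) is ignored; only a
sustained backlog is reported, which matches the "for a few minutes" part of
the advice. Tune both parameters to your own write patterns.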
