hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: HBase Replication problems
Date Mon, 13 Dec 2010 19:28:32 GMT
Hi Nathaniel,

Thanks for trying out replication, let's make it work for you.

So on the master side there are two log lines that matter for confirming
that replication works. First it has to say:

Replicating x

Where x is the number of edits it's going to ship, and then

Replicated in total: y

Where y is the total number of edits it has replicated. Seeing the second
line means that replication was successful, at least from the master's
point of view.

On the slave, one node should have:

Total replicated: z

Here z is the number of edits that region server applied on its
cluster. It could be on any region server, since the sink for
replication is chosen at random.

Do you see those? Any exceptions around those logs apart from EOFs?
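
If grepping by hand gets tedious, here is a minimal Python sketch for checking the region server logs for those three messages. It assumes only the message text quoted above; the rest of each log line (timestamps, class names) is ignored, and the names scan_log and PATTERNS are made up for illustration, not part of HBase.

```python
import re

# Message patterns as quoted in this email. Anything else on the log
# line (timestamp, level, class name) is assumed and simply ignored.
PATTERNS = {
    "replicating": re.compile(r"Replicating (\d+)"),
    "replicated_total": re.compile(r"Replicated in total: (\d+)"),
    "sink_total": re.compile(r"Total replicated: (\d+)"),
}


def scan_log(lines):
    """Return the last count seen for each replication message, or None."""
    found = {name: None for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                found[name] = int(match.group(1))
    return found
```

Calling scan_log(open(path)) on each region server log would be one way to use it: on the master cluster you would hope to see non-None values for "replicating" and "replicated_total", and on one slave region server a value for "sink_total".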

Thx,

J-D

On Mon, Dec 13, 2010 at 10:52 AM, Nathaniel Cook
<nathanielc@qualtrics.com> wrote:
> Hi,
>
> I am trying to set up replication for my HBase clusters. I have two
> small clusters for testing, each with 4 machines. The setup for the two
> clusters is identical. Each machine runs a DataNode and an
> HRegionServer. Three of the machines run a ZK peer and one machine
> runs the HMaster and NameNode. The master cluster machines have
> hostnames (ds1, ds2 ...) and the slave cluster is (bk1, bk2 ...). I set
> the replication scope to 1 for my test table column families and set
> the hbase.replication property to true for both clusters. Next I ran
> the add_peer.rb script with the following command on the ds1 machine:
>
> hbase org.jruby.Main /usr/lib/hbase/bin/replication/add_peer.rb
> ds1:2181:/hbase bk1:2181:/hbase
>
> After the script finishes ZK for the master cluster has the
> replication znode and children of peers, master, and state. The slave
> ZK didn't have a replication znode. I fixed that problem by rerunning
> the script on the bk1 machine and commenting out the code to write to
> the master ZK. Now the slave ZK has the /hbase/replication/master
> znode with data (ds1:2181:/hbase). Everything looked to be configured
> correctly. I restarted the clusters. The logs of the master
> regionservers stated:
>
> This cluster (ds1:2181:/hbase) is a master for replication, compared
> with (ds1:2181:/hbase)
>
> The logs on the slave cluster stated:
>
> This cluster (bk1:2181:/hbase) is a slave for replication, compared
> with (ds1:2181:/hbase)
>
> Using the hbase shell I put a row into the test table.
>
> The regionserver for that table had a log statement like:
>
> Going to report log #192.168.1.166%3A60020.1291757445179 for position
> 15828 in hdfs://ds1:9000/hbase/.logs/ds1.internal,60020,1291757445059/192.168.1.166%3A60020.1291757445179
>
> (192.168.1.166 is ds1)
>
> I waited, and even after several minutes the row still did not appear
> in the slave cluster table.
>
> Any help with what the problem might be is greatly appreciated.
>
> Both clusters are using CDH3b3. The HBase version is exactly
> 0.89.20100924+28.
>
> -Nathaniel Cook
>
