hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: HBase Cyclic Replication Issue: some data are missing in the replication for intensive write
Date Fri, 20 Apr 2012 22:43:18 GMT
Jerry:
Currently TestReplicationPeer and TestReplication don't cover any load
balancing scenario.
If you can write a test in which the load balancer reassigns some regions, that
would help us pinpoint the problem.

Cheers

On Fri, Apr 20, 2012 at 12:34 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:

> Hi Jerry,
>
> which version of HBase are you using?
>
> You are not using cyclic backup, that needs >2 clusters. I assume you're
> just replicating from one cluster to another, right?
>
> There is never data loss in Cluster A?
>
> -- Lars
>
>
> ----- Original Message -----
> From: Jerry Lam <chilinglam@gmail.com>
> To: user@hbase.apache.org
> Cc:
> Sent: Friday, April 20, 2012 5:38 AM
> Subject: HBase Cyclic Replication Issue: some data are missing in the
> replication for intensive write
>
> Hi HBase community:
>
> We have been testing cyclic replication for 1 week. The basic
> functionality seems to work as described in the documentation; however, when we
> started to increase the write workload, replication started to miss data
> (i.e. some data are not replicated to the other cluster). We have narrowed
> it down to a scenario in which we can reproduce the problem quite consistently.
> Here it is:
>
> -----------------------------
> Setup:
> - We have set up 2 clusters (cluster A and cluster B) of identical size in
> terms of number of nodes and configuration, with 3 regionservers sitting on top of 3
> datanodes.
> - Cyclic replication is enabled.
>
> - We use YCSB to generate load on HBase. The workload is very similar to
> workloada:
>
> recordcount=200000
> operationcount=200000
> workload=com.yahoo.ycsb.workloads.CoreWorkload
> fieldcount=1
> fieldlength=25000
>
> readallfields=true
> writeallfields=true
>
> readproportion=0
> updateproportion=1
> scanproportion=0
> insertproportion=0
>
> requestdistribution=uniform
>
> - Records are inserted into Cluster A. After the benchmark is done, we
> wait until all data are replicated to Cluster B and then use the verifyrep
> mapreduce job for validation.
> - Data are deleted from both tables (truncate 'tablename') before a new
> experiment is started.
>
> Scenario:
> When we increase the number of threads until it maxes out the throughput of
> the cluster, we see that some data are missing in Cluster B (total count !=
> 200000) although cluster A clearly has them all. This happens even when
> region splitting is disabled in both clusters (it happens more often when
> region splits occur). To have more control over what is happening,
> we then disabled the load balancer so that the region (which is
> responsible for replicating the data) would not relocate to another
> regionserver during the benchmark. The situation improved a lot: we did not
> see any missing data in 5 consecutive runs. Finally, we moved the
> region from one regionserver to another during the
> benchmark to see whether the problem would reappear, and it did.
>
> We believe that the issue could be related to region splitting and load
> balancing during intensive writes; the HBase replication strategy does not yet
> cover those corner cases.
>
> Can someone take a look at it and suggest some ways to work around this?
>
> Thanks~
>
> Jerry
>
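For a sense of scale, the YCSB settings quoted above imply roughly the following write volume per run. This is only a back-of-the-envelope sketch: it assumes every operation is an update that rewrites the single 25000-byte field, and it ignores row keys, HBase cell overhead, and WAL overhead.

```python
# Values copied from the quoted YCSB workload.
operationcount = 200000
fieldcount = 1
fieldlength = 25000        # bytes per field
updateproportion = 1.0     # every operation is an update

# Assumption: with writeallfields=true each update rewrites all fields.
updates = int(operationcount * updateproportion)
payload_bytes = updates * fieldcount * fieldlength

print(f"updates: {updates}")
print(f"payload: {payload_bytes} bytes (~{payload_bytes / 1e9:.1f} GB)")
```

So each run pushes about 5 GB of field data through replication, before any per-cell overhead is counted.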

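When the total count on Cluster B comes up short, it can help to know which rows are absent rather than only how many. The sketch below shows the idea on toy in-memory data. The `diff_rows` helper and the sample keys are hypothetical; this is not how verifyrep is implemented, and fetching the rows from each cluster is left out.

```python
def diff_rows(source, peer):
    """Compare two {row_key: value} mappings.

    Returns (missing, mismatched): keys in source that are absent from
    peer, and keys present in both whose values differ.
    """
    missing = sorted(k for k in source if k not in peer)
    mismatched = sorted(
        k for k in source if k in peer and source[k] != peer[k]
    )
    return missing, mismatched


# Toy data standing in for rows scanned from cluster A and cluster B.
cluster_a = {"user1": b"v1", "user2": b"v2", "user3": b"v3"}
cluster_b = {"user1": b"v1", "user3": b"stale"}

missing, mismatched = diff_rows(cluster_a, cluster_b)
print("missing on B:", missing)
print("mismatched on B:", mismatched)
```

Knowing which keys are missing would make it possible to check whether they all belong to a region that was moved during the run.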