hbase-user mailing list archives

From Damien HARDY <dha...@figarocms.fr>
Subject Hbase CopyTable timeout on scanner
Date Mon, 07 May 2012 12:22:55 GMT
Hello,

I am trying to copy a table from one cluster to another.

The source is a 2-node cluster, 16 CPU / 32 GB RAM (hadoop001, hadoop002).
The destination is a 3-node cluster, 16 CPU / 64 GB RAM (hbase01, hbase02, hbase04).
All nodes run the datanode, regionserver, master server and
zookeeper of CDH3u3.
Region size is 2G on the source and 4G on the destination.
Max heap is 8G on the region servers.
There are 421 regions on the source (80% are empty because of TTL and
time-based rowkeys),
holding about 400k rows of syslog entries (1 or 2 KB each).

My problem occurs when I run an MR job with
hbase org.apache.hadoop.hbase.mapreduce.CopyTable
--peer.adr=hbase01,hbase02,hbase04:2181:/hbase <table>
on the source cluster. The nodes do not seem to be doing much (load
average is low), and the load on the regionservers is uneven (going from
0 to 60k requests/sec in bursts).

12/05/07 12:08:17 INFO mapred.JobClient: Task Id : attempt_201204241409_0005_m_000343_0, Status : FAILED
org.apache.hadoop.hbase.client.ScannerTimeoutException: 184008ms passed since the last invocation, timeout is currently set to 120000
    at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1179)
    at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:133)
    at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:142)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:456)
    at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.j

After some retries the tasks seem to succeed, but the job takes a very long time.

Parameters that I tried changing when submitting the job:
hbase.client.scanner.caching is 2000
hbase.regionserver.lease.period is 120000
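
As a rough sanity check of these two settings against the log above, here is a small Python sketch. It rests on an assumption the log does not confirm: that the 184008 ms gap is the time the map task spent consuming one full batch of 2000 cached rows between two scanner calls. If that holds, it gives an approximate per-row cost and the largest caching value that would still fit inside the lease period.

```python
# Values taken from the log and the settings quoted above.
elapsed_ms = 184008   # "184008ms passed since the last invocation"
lease_ms = 120000     # hbase.regionserver.lease.period
caching = 2000        # hbase.client.scanner.caching

# ASSUMPTION: the whole gap was spent on one full batch of `caching` rows.
ms_per_row = elapsed_ms / caching

# Largest batch that would be consumed before the scanner lease expires,
# at the same per-row cost.
max_rows_within_lease = int(lease_ms / ms_per_row)

print(f"approx. {ms_per_row:.1f} ms per row")
print(f"approx. {max_rows_within_lease} rows per batch fit in the lease")
```

Under that assumption, a smaller caching value or a longer lease period would be needed for a batch to finish before the lease expires; it does not explain why each row would take that long.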

What other parameters could help maximise the use of the
cluster resources for this simple copy?
I can't see where the bottleneck is.

Thank you.

-- 
Damien

