hbase-user mailing list archives

From Zaharije Pasalic <pasalic.zahar...@gmail.com>
Subject Configuration limits for hbase and hadoop ...
Date Mon, 18 Jan 2010 16:47:08 GMT
Hi

We are using a 7-node HBase configuration plus 1 master. Each machine
(nodes and master) has 8GB of memory and a 4-core CPU. The master
machine serves as both the Hadoop master and the HBase master, and the
7 nodes are shared between Hadoop and HBase. In the configuration files
we set 2GB of memory for HBase and an additional 2GB for Hadoop. HDFS
has 1.6TB of free space.

Now we are trying to import 50 million rows of data. Each row has
100 columns (in reality we will have a sparsely populated table, but
for now we are testing the worst-case scenario). The 50 million records
are encoded in about 100 CSV files stored in HDFS.

The import process is simple: a small MapReduce program reads the CSV
files, splits the lines and inserts them into the table (Map only, no
Reduce phase). We are using the default Hadoop configuration (on 7
nodes we can run 14 maps). We also use 32MB for writeBufferSize on
HBase, and we set setWriteToWAL to false.
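
To make the import logic concrete, here is a rough sketch of what each map task does, written in plain Python rather than as the actual Java job. It does not use the HBase API: `BufferedTable` is a hypothetical stand-in for a client-side write buffer, the choice of the first CSV field as row key and the column names `c0`, `c1`, ... are assumptions, and sizes are counted in characters as an approximation of bytes.

```python
import csv

WRITE_BUFFER_BYTES = 32 * 1024 * 1024  # 32MB, the writeBufferSize from the message


class BufferedTable:
    """Hypothetical stand-in for a buffered table client (NOT the HBase API)."""

    def __init__(self, limit_bytes=WRITE_BUFFER_BYTES):
        self.limit_bytes = limit_bytes
        self.pending = []        # rows waiting to be sent
        self.pending_bytes = 0   # approximate size of the pending rows
        self.flushes = 0
        self.rows_written = 0

    def put(self, row_key, columns):
        # Approximate the row size by the length of its key, names and values.
        size = len(row_key) + sum(len(k) + len(v) for k, v in columns.items())
        self.pending.append((row_key, columns))
        self.pending_bytes += size
        if self.pending_bytes >= self.limit_bytes:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        # A real client would send the batch to the region servers here.
        self.rows_written += len(self.pending)
        self.flushes += 1
        self.pending = []
        self.pending_bytes = 0


def map_csv(lines, table):
    """Map-only step: split each CSV line and insert it as one row."""
    for fields in csv.reader(lines):
        if not fields:
            continue  # skip blank lines
        row_key, values = fields[0], fields[1:]  # assumption: first field is the key
        columns = {"c%d" % i: v for i, v in enumerate(values)}
        table.put(row_key, columns)
    table.flush()  # send whatever is still buffered at the end of the split
```

With a large buffer, rows reach the servers in big batches rather than one at a time, so the load on the cluster arrives in bursts.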

At the beginning everything looks fine, but after ~33 million
records we encounter strange behavior from HBase.

First, one of the nodes, the one where the META table resides, has a
high load. The status web page shows ~1700 requests on that node even
when we are not running any MapReduce job (0 requests on the other
nodes). Also, I do not see any activity in the log files on that node.
Here are the last few lines from the log on that node:

2010-01-18 14:46:26,666 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
profiles2,1a2e1b7a-a43e-4e4f-9f84-40b4662cc4e0,1263825424277
2010-01-18 14:46:26,667 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
profiles2,1a2e1b7a-a43e-4e4f-9f84-40b4662cc4e0,1263825424277
2010-01-18 14:46:27,441 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Closed
profiles2,1a2e1b7a-a43e-4e4f-9f84-40b4662cc4e0,1263825424277
2010-01-18 14:47:38,773 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,0d9deec6-b6df-43a3-ab94-685dade5af61,1263825533141 in
2mins, 18sec
2010-01-18 14:47:38,773 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,0dbd64eb-3e59-4b35-af4a-92a83a1e1858,1263825533141
2010-01-18 14:49:01,881 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,0dbd64eb-3e59-4b35-af4a-92a83a1e1858,1263825533141 in
1mins, 23sec
2010-01-18 14:49:01,883 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,3f726ebf-2ec8-43a0-bd50-d40bec1776d4,1263825595669
2010-01-18 14:49:52,186 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,3f726ebf-2ec8-43a0-bd50-d40bec1776d4,1263825595669 in
50sec
2010-01-18 14:49:52,186 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,3f5303ee-4729-4ab9-bfd6-3c319d429c4f,1263825595669
2010-01-18 14:50:57,328 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,3f5303ee-4729-4ab9-bfd6-3c319d429c4f,1263825595669 in
1mins, 5sec
2010-01-18 14:50:57,328 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,8f50a54d-e8d5-4dec-84a4-05a468fbf8e1,1263825624515
2010-01-18 14:51:24,508 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,8f50a54d-e8d5-4dec-84a4-05a468fbf8e1,1263825624515 in
27sec
2010-01-18 14:51:24,508 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,8f309cdb-eb70-49e0-90d4-d2510e38ae51,1263825624515
2010-01-18 14:52:19,736 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,8f309cdb-eb70-49e0-90d4-d2510e38ae51,1263825624515 in
55sec
2010-01-18 14:52:19,736 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,84bd729a-c64b-4d75-8189-e828dbf06797,1263825639973
2010-01-18 14:53:44,053 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,84bd729a-c64b-4d75-8189-e828dbf06797,1263825639973 in
1mins, 24sec
2010-01-18 14:53:44,053 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,84dcfe35-e488-4eec-99d8-83be178f1b22,1263825639973
2010-01-18 14:55:09,999 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,84dcfe35-e488-4eec-99d8-83be178f1b22,1263825639973 in
1mins, 25sec
2010-01-18 14:55:09,999 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,a6252b0c-b2b1-4bd2-acf4-522065a2a3be,1263825653683
2010-01-18 14:56:22,364 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,a6252b0c-b2b1-4bd2-acf4-522065a2a3be,1263825653683 in
1mins, 12sec
2010-01-18 14:56:22,364 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,a644b61a-f2c0-4855-ad99-1e6ab2d82e61,1263825653683
2010-01-18 14:57:41,518 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,a644b61a-f2c0-4855-ad99-1e6ab2d82e61,1263825653683 in
1mins, 19sec
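
To see how much time the region server spends compacting, the durations can be pulled out of a log excerpt like the one above. This is a minimal sketch that assumes the message format shown here ("compaction completed on region ... in Nmins, Msec" or "in Msec", possibly wrapped across lines); the function name is hypothetical.

```python
import re

# Matches "compaction completed on region <name> in [Nmins, ]Msec",
# allowing the line wrapping seen in the excerpt.
COMPACTION_RE = re.compile(
    r"compaction completed on\s+region\s+(\S+)\s+in\s+(?:(\d+)mins,\s*)?(\d+)sec"
)


def compaction_seconds(log_text):
    """Return a list of (region_name, duration_in_seconds) pairs."""
    results = []
    for match in COMPACTION_RE.finditer(log_text):
        region, mins, secs = match.groups()
        results.append((region, int(mins or 0) * 60 + int(secs)))
    return results
```

Adding up the ten durations in the excerpt by hand gives 738 seconds, which is nearly the whole span between 14:45 and 14:57, so this server appears to be running one compaction after another.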

The second manifestation is that I can create a new empty table and
start importing data normally, but if I try to import more data into
the same table (which now has ~33 million rows) I get really bad
performance and the HBase status page does not work at all (it will not
load in the browser).

Currently the ~33 million records use 800GB of disk and I have
1.1TB of free HDFS storage.
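
As a back-of-the-envelope check on these numbers, the sketch below estimates the disk used per row and the extra space the remaining rows would need. It assumes usage grows linearly with row count, that 1GB is 10^9 bytes, and that the 800GB figure is whatever HDFS reports as used (I do not know whether it includes replication); the names are made up for illustration.

```python
ROWS_LOADED = 33_000_000   # rows imported so far
ROWS_TARGET = 50_000_000   # rows we want in the table
USED_GB = 800              # disk used by the loaded rows
FREE_GB = 1100             # free HDFS space (1.1TB)
COLUMNS_PER_ROW = 100


def projected_extra_gb(rows_loaded, rows_target, used_gb):
    """Extra disk needed for the remaining rows, assuming linear growth."""
    gb_per_row = used_gb / rows_loaded
    return (rows_target - rows_loaded) * gb_per_row


# About 24KB of disk per row, i.e. a few hundred bytes per cell.
bytes_per_row = USED_GB * 1e9 / ROWS_LOADED
bytes_per_cell = bytes_per_row / COLUMNS_PER_ROW

# Roughly 412GB more for the remaining 17 million rows.
extra_gb = projected_extra_gb(ROWS_LOADED, ROWS_TARGET, USED_GB)
fits = extra_gb < FREE_GB
```

Under these assumptions the remaining rows fit in the free space, so raw disk capacity alone does not look like the limit.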

So my questions are: what am I doing wrong? Is the current cluster good
enough to support 50 million records, or is my current 33 million the
limit for this configuration? Also, I'm getting about 800 inserts per
second; is this slow? Any hint is appreciated.
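
To put the insert rate in perspective, here is the arithmetic as a small sketch. It assumes the 800 inserts per second is the cluster-wide rate across all 14 maps and that it stays constant for the whole import; the function names are hypothetical.

```python
ROWS_TARGET = 50_000_000
ROWS_LOADED = 33_000_000
ROWS_PER_SECOND = 800      # observed insert rate (assumed cluster-wide)
MAP_SLOTS = 14             # concurrent maps on 7 nodes
COLUMNS_PER_ROW = 100


def hours_to_load(rows, rows_per_second):
    """Wall-clock hours needed to insert `rows` at a constant rate."""
    return rows / rows_per_second / 3600


def rows_per_second_per_map(rows_per_second, map_slots):
    """Average insert rate of a single map task."""
    return rows_per_second / map_slots


full_import_hours = hours_to_load(ROWS_TARGET, ROWS_PER_SECOND)               # about 17.4
remaining_hours = hours_to_load(ROWS_TARGET - ROWS_LOADED, ROWS_PER_SECOND)   # about 5.9
per_map_rate = rows_per_second_per_map(ROWS_PER_SECOND, MAP_SLOTS)            # about 57
cells_per_second = ROWS_PER_SECOND * COLUMNS_PER_ROW                          # 80,000
```

So at this rate the full 50 million rows take about 17 hours, with each map inserting only around 57 rows per second.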

Best
Zaharije
