hbase-user mailing list archives

From Daniel Einspanjer <deinspan...@mozilla.com>
Subject Suggested config changes to be made
Date Tue, 08 Jun 2010 00:52:01 GMT
For Socorro, we currently have a 15-node HBase 0.20.3 cluster.
Each node has dual hyperthreaded quad-core CPUs and 24GB of RAM (the
RegionServer JVM is allocated 8GB).
HDFS Health reports that we are currently using 20TB out of 60TB. 
(Storage is only HBase related at the moment.)
hadoop dfs -dus /hbase reports about 7TB of usage.

In production at the moment, we have a single HBase table, 
crash_reports.  The table has a poorly chosen rowkey format that starts 
with the current date, so all inserts currently go into a single 
region.  In our next release, the rowkey will be salted to prevent this.
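
A minimal sketch of what rowkey salting could look like, assuming a
one-character hex prefix derived from a hash of the original key. The
actual format planned for the next release is not described in this
message; `salted_rowkey`, the bucket count, and the sample key are all
hypothetical.

```python
import hashlib


def salted_rowkey(original_key: str, buckets: int = 16) -> str:
    """Prefix a key with a deterministic one-hex-character salt.

    Hypothetical scheme: keys that share a date prefix are spread over
    `buckets` distinct key ranges instead of one. The salt is derived
    from the key itself, so a reader can recompute it for a point lookup.
    """
    if not 1 <= buckets <= 16:
        raise ValueError("buckets must fit in one hex character (1..16)")
    digest = hashlib.md5(original_key.encode("utf-8")).digest()
    salt = digest[0] % buckets
    return f"{salt:x}{original_key}"


# Illustrative key: a date followed by an identifier (made up).
print(salted_rowkey("100607" + "0a1b2c3d"))
```

The trade-off is that a scan over a date range then has to be issued
once per salt prefix and the results merged.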

We are currently inserting 10 to 20 new records per second.  In our next 
Socorro release, that number will be multiplied by 5 due to inserts into 
different index tables.

At the moment, we have 40k regions on our 15 servers.  There were some 
questions raised on the #hbase IRC channel about different settings.  
I'm posting this e-mail to collect the suggestions for changes we should 
make during our scheduled upgrade to 0.20.5 in less than two weeks.

Currently, our maximum region size (hbase.hregion.max.filesize) is the
default 256MB.  It was suggested that this should be at least 1GB.  What
are the steps to ensure that we have the right size for the new tables
we'll create during our upgrade, and what should we do about our
existing table?
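
As a rough back-of-the-envelope check, the sketch below estimates the
minimum number of regions needed to hold the 7TB reported under /hbase
at each split threshold. It assumes every region is filled exactly to
the threshold, which real regions are not (a region that has just split
is about half that size), so actual counts will be higher.
`min_regions` is a hypothetical helper.

```python
MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4


def min_regions(data_bytes: int, max_region_bytes: int) -> int:
    """Lower bound on region count if each region were completely full."""
    return -(-data_bytes // max_region_bytes)  # ceiling division


data = 7 * TB  # figure reported by `hadoop dfs -dus /hbase`
servers = 15

for label, limit in (("256MB", 256 * MB), ("1GB", 1 * GB)):
    regions = min_regions(data, limit)
    print(f"{label}: at least {regions} regions, "
          f"about {regions / servers:.0f} per server")
```

Under these assumptions the lower bound drops from 28672 regions at
256MB to 7168 at 1GB.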

The shell output below indicates that the block cache is disabled on
-ROOT-.  It sounds like enabling it was recommended.  Is it just an
alter table, or is there anything else that needs to be done?

$ hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Version: 0.20.3, rUnknown, Tue Feb  2 08:32:37 PST 2010
hbase(main):001:0> scan '-ROOT-'
ROW                          COLUMN+CELL
  .META.,,1                   column=info:regioninfo, 
timestamp=1259618213386, value=REGION => {NAME => '.META.
                              ,,1', STARTKEY => '', ENDKEY => '', 
ENCODED => 1028785192, TABLE => {{NAME => '.M
                              ETA.', IS_META => 'true', 
                              istorian', VERSIONS => '2147483647', 
COMPRESSION => 'NONE', TTL => '604800', BLOC
                              KSIZE => '8192', IN_MEMORY => 'false', 
BLOCKCACHE => 'false'}, {NAME => 'info', V
                              ERSIONS => '10', COMPRESSION => 'NONE', 
TTL => '2147483647', BLOCKSIZE => '8192',
                               IN_MEMORY => 'false', BLOCKCACHE => 
  .META.,,1                   column=info:server, 
timestamp=1275952945634, value=
  .META.,,1                   column=info:serverstartcode, 
timestamp=1275952945634, value=1275952942699
1 row(s) in 0.0780 seconds

For reference, the descriptor of the crash_reports table:

{NAME => 'crash_reports', FAMILIES => [{NAME => 'meta_data', COMPRESSION 
=> 'LZO', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', 
IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'processed_data', 
VERSIONS => '3', COMPRESSION => 'LZO', TTL => '2147483647', BLOCKSIZE => 
'65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 
'raw_data', COMPRESSION => 'LZO', VERSIONS => '3', TTL => '2147483647', 
BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}

Is there any other information I should provide that could lead to other 
important config changes we should make on this upgrade?

Daniel Einspanjer
Mozilla Corporation
