hbase-user mailing list archives

From nick maillard <nicolas.maill...@fifty-five.com>
Subject Re: Hadoop/Hbase 0.94.2 performance what to expect
Date Sun, 28 Oct 2012 15:18:48 GMT
Hello Kevin

In hbase-env I have only upped the heap to 3 GB.
But I'll gladly share my full file.

My rowkey setup is:
A rowkey: eventid
A single family: events
Around 20 string columns,
so in the table:
                  columnA: value
                  columnB: value...

The import file is a CSV like valueA,ROWKEY,valueB,valueC....
It contains around 14 million entries.
The rowkeys are not incremental from line to line; they are randomly dispersed.
I see the different servers being written to, but I will check more thoroughly.
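
To make the file layout concrete, here is a rough sketch of how one CSV line maps onto the table, and of the column spec that ImportTsv's importtsv.columns option expects for it. The field names (columnA, columnB, columnC) and the sample rowkey are placeholders, not my real schema, and the naive split assumes no quoted commas in the data.

```python
# Sketch only: field names and sample values are placeholders.
FAMILY = "events"


def importtsv_columns(csv_fields, rowkey_field="ROWKEY", family=FAMILY):
    """Build the value for -Dimporttsv.columns from the CSV field order.

    The rowkey position is marked with ImportTsv's HBASE_ROW_KEY token;
    every other field becomes family:qualifier.
    """
    specs = []
    for field in csv_fields:
        if field == rowkey_field:
            specs.append("HBASE_ROW_KEY")
        else:
            specs.append(f"{family}:{field}")
    return ",".join(specs)


def to_put(line, csv_fields, rowkey_field="ROWKEY", family=FAMILY):
    """Split one CSV line into (rowkey, {family:qualifier: value})."""
    values = line.rstrip("\n").split(",")  # naive split, no quoting support
    if len(values) != len(csv_fields):
        raise ValueError("line does not match the expected field count")
    row = dict(zip(csv_fields, values))
    rowkey = row.pop(rowkey_field)
    cells = {f"{family}:{name}": value for name, value in row.items()}
    return rowkey, cells


fields = ["columnA", "ROWKEY", "columnB", "columnC"]
print(importtsv_columns(fields))
print(to_put("valueA,event-42,valueB,valueC", fields))
```

The first printed string is what would go after -Dimporttsv.columns=, together with -Dimporttsv.separator=, since the file is comma separated rather than tab separated.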

If you are kind enough to check and want further info I have opened my cluster:
you can see the hbase at :
The table would be confblog_events
this should show all my parameters

If you want to see the ImportTsv job you can look:
in the retired jobs there is the ImportTsv job that I ran.

I'm also trying to get a feel for read and write performance. With buffered
import it goes down to 20 minutes, which is acceptable.

On the same jobtracker page you will see in the retired jobs the select from Hive
that is applied on the same table. The process takes around 4 minutes; of
course, it is not filtering on the rowkey. I'm trying to understand whether this
is a decent duration or whether I am off.

Thanks a lot for your time and help.
I'm eager to understand either the error of my ways or whether this is a normal setup.
