hbase-user mailing list archives

From nick maillard <nicolas.maill...@fifty-five.com>
Subject Re: Hadoop/Hbase 0.94.2 performance what to expect
Date Sun, 28 Oct 2012 15:18:48 GMT
Hello Kevin,

In hbase-env I have only raised the heap to 3 GB.
But I'll gladly share my full file.

My table setup is:
A rowkey: eventid
A single family: events
Around 20 string columns
So the table looks like:
confblog_events{
     event_id:{
            event:{
                  columnA: value
                  columnB: value...
            }
     }
}

The import file is a CSV with lines like valueA,ROWKEY,valueB,valueC....
It contains around 14 million entries.
The rowkeys are not incremental from line to line; they are randomly dispersed.
I see the different servers being written to, but I will check more thoroughly
tomorrow.
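
For reference, here is a rough sketch of how a CSV with the rowkey in the second field could be mapped for ImportTsv. The field names (columnA, event_id, ...), the family name "events", and the input path are placeholders for illustration, not the actual values used in the job. ImportTsv expects exactly one HBASE_ROW_KEY entry in importtsv.columns, and importtsv.separator switches the delimiter from tab to comma.

```python
def importtsv_columns(fields, rowkey_field, family):
    """Build the value for -Dimporttsv.columns from the CSV field order."""
    specs = []
    for name in fields:
        if name == rowkey_field:
            # The rowkey position is marked with the special HBASE_ROW_KEY token.
            specs.append("HBASE_ROW_KEY")
        else:
            specs.append(f"{family}:{name}")
    if specs.count("HBASE_ROW_KEY") != 1:
        raise ValueError("exactly one field must be the rowkey")
    return ",".join(specs)


# Hypothetical field order matching "valueA,ROWKEY,valueB,valueC".
fields = ["columnA", "event_id", "columnB", "columnC"]
columns = importtsv_columns(fields, "event_id", "events")

command = [
    "hbase", "org.apache.hadoop.hbase.mapreduce.ImportTsv",
    "-Dimporttsv.separator=,",
    f"-Dimporttsv.columns={columns}",
    "confblog_events",
    "/path/to/events.csv",  # hypothetical HDFS input path
]
print(" ".join(command))
```

The script only prints the command line; it does not run the import itself.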

If you are kind enough to check and want further info, I have opened my cluster.
You can see the HBase status at:
http://91.121.69.14:60030/rs-status
The table is confblog_events.
This should show all my parameters.

If you want to see the ImportTsv job you can look at:
http://91.121.69.14:50030/jobtracker.jsp
In the retired jobs there is the ImportTsv job that I ran.

I'm also trying to get a feel for read and write performance. With buffered
import, the load goes down to 20 minutes, which is acceptable.

On the same jobtracker page you will see in the retired jobs the select from
Hive, which is applied to the same table. The process takes around 4 minutes; of
course, it does not filter on the rowkey. I'm trying to understand if this is a
decent duration or if I am off.
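
To put the two timings side by side, here is a back-of-the-envelope calculation. It assumes exactly 14 million rows and that the Hive select reads every row of the table (it does not filter on the rowkey); both are approximations, so treat the results as orders of magnitude only.

```python
ROWS = 14_000_000      # "around 14 million entries"
IMPORT_MINUTES = 20    # buffered import
HIVE_MINUTES = 4       # Hive select over the same table


def rows_per_second(rows, minutes):
    """Average throughput over the whole job, ignoring startup overhead."""
    return rows / (minutes * 60)


import_rate = rows_per_second(ROWS, IMPORT_MINUTES)
scan_rate = rows_per_second(ROWS, HIVE_MINUTES)

print(f"import:    {import_rate:,.0f} rows/s")  # about 11,667 rows/s
print(f"hive scan: {scan_rate:,.0f} rows/s")    # about 58,333 rows/s
```

These are cluster-wide averages, so the per-region-server rate depends on how evenly the rows are spread across the servers.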

Thanks a lot for your time and help.
I'm eager to understand either the error of my ways or whether this is a normal setup.







