hbase-user mailing list archives

From Jonathan Gray <jl...@streamy.com>
Subject Re: HBase-0.20.0 randomRead
Date Wed, 19 Aug 2009 16:50:18 GMT
Murali,

Which version of HBase are you running?

There was a fix committed just a few days ago for a bug that manifested 
as a null/empty HRI in .META.

It has been fixed in RC2, so I recommend upgrading to that and trying 
your upload again.

JG

Murali Krishna. P wrote:
> Thanks for the clarification. I changed the ROW_LENGTH as you suggested and used the sequentialWrite + randomRead combination to benchmark. The initial result was impressive, even though I would like to have the last column improved.
> 
> randomRead
> ==========
> totalrows \ nclients (--rows)    5 (10000)    50 (10000)    100 (10000)    1000 (10000)
> 800k                             0.4ms        3.5ms         6.5ms          55ms
> 2.3m                             0.45ms       3.5ms         6.6ms          56ms
> 
>  The only change in the config was that the handler count was increased to 1000. I think there are some parameters that can be tweaked to improve this further?
> 
> My goal is to test it with 10 million rows on this box. For some reason the sequentialWrite job with 5,000,000 rows + 2 clients failed with the following exception:
> 
> 09/08/19 00:34:07 INFO mapred.LocalJobRunner: 2000000/2050000/2500000
> 09/08/19 00:50:38 WARN mapred.LocalJobRunner: job_local_0001
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server for region , row '0002076131', but failed after 11 attempts.
> Exceptions:
> java.io.IOException: HRegionInfo was null or empty in .META.
> java.io.IOException: HRegionInfo was null or empty in .META.
> java.io.IOException: HRegionInfo was null or empty in .META.
> java.io.IOException: HRegionInfo was null or empty in .META.
> java.io.IOException: HRegionInfo was null or empty in .META.
> java.io.IOException: HRegionInfo was null or empty in .META.
> java.io.IOException: HRegionInfo was null or empty in .META.
> java.io.IOException: HRegionInfo was null or empty in .META.
> java.io.IOException: HRegionInfo was null or empty in .META.
> java.io.IOException: HRegionInfo was null or empty in .META.
> 
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocationForRowWithRetries(HConnectionManager.java:995)
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1064)
>         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:584)
>         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:450)
>         at org.evaluation.hbase.PerformanceEvaluation$SequentialWriteTest.testRow(PerformanceEvaluation.java:736)
>         at org.evaluation.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:571)
>         at org.evaluation.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:804)
>         at org.evaluation.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:350)
>         at org.evaluation.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:326)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:518)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)
> 
> 
>  
> 
> From region server log:-
> 2009-08-19 00:47:22,740 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner@1e458ae5
> 2009-08-19 00:48:22,741 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner@4b28029c
> 2009-08-19 00:49:22,743 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner@33fb11e0
> 2009-08-19 00:50:22,745 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner@6ceccc3b
> 2009-08-19 00:51:22,746 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner@44a3b5b4
> 
> Thanks,
> Murali Krishna
> 
> 
> 
> 
> ________________________________
> From: Jonathan Gray <jlist@streamy.com>
> To: hbase-user@hadoop.apache.org
> Sent: Wednesday, 19 August, 2009 12:26:55 AM
> Subject: Re: HBase-0.20.0 randomRead
> 
> With all that memory, you're likely seeing such good performance because 
> of filesystem caching.  As you say, 2ms is extraordinarily fast for a 
> disk read, but since your rows are relatively small, you are loading up 
> all that data into memory (not only the fs cache, but also hbase's block 
> cache which makes it even faster).
> 
> JG
> 
> Jean-Daniel Cryans wrote:
>> Well it seems there's something wrong with the way you modified PE. It
>> is not really testing your table unless the row keys are built the
>> same way as TestTable's; to me it seems that you are testing on only
>> 20000 rows, so caching is easy. A better test would just be to use PE
>> the way it currently is, but with ROW_LENGTH = 4k.
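>>
>> For reference, a rough sketch of the per-row work PE's randomRead does,
>> using the 0.20 client API (the table name and key format here just
>> follow PE's defaults, so treat them as illustrative):
>>
>>   import java.util.Random;
>>   import org.apache.hadoop.hbase.HBaseConfiguration;
>>   import org.apache.hadoop.hbase.client.Get;
>>   import org.apache.hadoop.hbase.client.HTable;
>>   import org.apache.hadoop.hbase.client.Result;
>>   import org.apache.hadoop.hbase.util.Bytes;
>>
>>   public class RandomReadSketch {
>>     public static void main(String[] args) throws Exception {
>>       HBaseConfiguration conf = new HBaseConfiguration(); // picks up hbase-site.xml
>>       HTable table = new HTable(conf, "TestTable");       // PE's default table name
>>       Random rand = new Random();
>>       int totalRows = 10000;                               // rows per client, as in PE
>>       long start = System.currentTimeMillis();
>>       for (int i = 0; i < totalRows; i++) {
>>         // PE builds zero-padded 10-digit row keys, e.g. '0002076131'
>>         byte[] row = Bytes.toBytes(String.format("%010d", rand.nextInt(totalRows)));
>>         Result r = table.get(new Get(row));                // one random read
>>       }
>>       long elapsed = System.currentTimeMillis() - start;
>>       System.out.println("Finished randomRead in " + elapsed + "ms for " + totalRows
>>           + " rows (" + (elapsed / (double) totalRows) + " ms/row)");
>>     }
>>   }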
>>
>> WRT Jetty, make sure you optimized it with
>> http://jetty.mortbay.org/jetty5/doc/optimization.html
>>
>> J-D
>>
>> On Tue, Aug 18, 2009 at 12:08 PM, Murali Krishna.
>> P<muralikpbhat@yahoo.com> wrote:
>>> Ahh, mistake, I just took it as seconds.
>>>
>>> Now I wonder whether it can really be that fast? Won't it take at least 2ms for a disk read? (I have given 8G heap space to the RegionServer; is it caching that much?) Has anyone seen these kinds of numbers?
>>>
>>>
>>> Actually, my initial problem was that I have a Jetty in front of this HBase to serve this 4k value, and when benchmarked, it took 200+ milliseconds per record with 100 clients. That is why I decided to benchmark without Jetty first.
>>>
>>> Thanks,
>>> Murali Krishna
>>>
>>>
>>>
>>>
>>> ________________________________
>>> From: Jean-Daniel Cryans <jdcryans@apache.org>
>>> To: hbase-user@hadoop.apache.org
>>> Sent: Tuesday, 18 August, 2009 9:13:40 PM
>>> Subject: Re: HBase-0.20.0 randomRead
>>>
>>> Murali,
>>>
>>> I'm not reading the same thing as you.
>>>
>>> client-0 Finished randomRead in 2867ms at offset 0 for 10000 rows
>>>
>>> That means 2867 / 10000 = 0.2867ms per row. It's kinda fast.
>>>
>>> J-D
>>>
>>> On Tue, Aug 18, 2009 at 11:35 AM, Murali Krishna.
>>> P<muralikpbhat@yahoo.com> wrote:
>>>> Hi all,
>>>>  (Saw a related thread on performance, but starting a different one because my setup is slightly different).
>>>>
>>>> I have a one-node setup with hbase-0.20 (alpha). It has around 11 million rows in ~250 regions, each row with a ~20-byte key and a ~4k value.
>>>> Since my primary concern is randomRead, I modified the PerformanceEvaluation code to read from this particular table. The randomRead test gave the following result.
>>>>
>>>> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: client-1 Finished randomRead in 2813ms at offset 10000 for 10000 rows
>>>> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: Finished 1 in 2813ms writing 10000 rows
>>>> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: client-0 Finished randomRead in 2867ms at offset 0 for 10000 rows
>>>> 09/08/18 08:20:41 INFO hbase.PerformanceEvaluation: Finished 0 in 2867ms writing 10000 rows
>>>>
>>>>
>>>> So it looks like it is taking around 280ms per record. Looking at the latest HBase performance claims, I was expecting it to be below 10ms. Am I doing something basically wrong, given such a huge difference :( ? Please help me fix the latency.
>>>>
>>>> The machine config is:
>>>> Processors:    2 x Xeon L5420 2.50GHz (8 cores)
>>>> Memory:        13.7GB
>>>> 12 Disks of 1TB each.
>>>>
>>>> Let me know if you need any more details.
>>>>
>>>> Thanks,
>>>> Murali Krishna
> 
