hbase-user mailing list archives

From Tom Hood <tom.w.h...@gmail.com>
Subject Re: Performance between HBaseClient scan and HFileReaderV2
Date Mon, 30 Dec 2013 02:09:20 GMT
I'm also new to HBase and am not familiar with HFileReaderV2.  However,
your description doesn't mention clearing the Linux OS page cache between
tests.  That might explain the big difference: if you ran the HBaseClient
test first, it may have warmed the OS cache, and the HFileReaderV2 run
then benefited from it.  Just a guess...
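
To rule the cache out, something like the sketch below could be run before
each test. It assumes a Linux host and root privileges, and it would have to
run on every machine that holds the region's HDFS blocks; the function name
and its parameters are made up for illustration.

```python
import os

# Standard Linux procfs control file for dropping kernel caches.
DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_os_cache(path=DROP_CACHES, mode="3"):
    """Flush dirty pages, then ask the kernel to drop its clean caches.

    mode "1" = page cache, "2" = dentries and inodes, "3" = both.
    Writing to the real procfs file requires root.
    The path parameter is a hypothetical hook so the function can be
    exercised against an ordinary file without touching the kernel.
    """
    if mode not in ("1", "2", "3"):
        raise ValueError("mode must be '1', '2' or '3'")
    os.sync()  # write dirty pages to disk first so they become droppable
    with open(path, "w") as f:
        f.write(mode + "\n")
```

Calling drop_os_cache() as root before the HBaseClient run and again before
the HFileReaderV2 run should make both tests start from a cold cache.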

-- Tom

On Mon, Dec 23, 2013 at 12:18 PM, Jerry Lam <chilinglam@gmail.com> wrote:

> Hello HBase users,
> I just ran a very simple performance test and would like to see if what I
> experienced makes sense.
> The experiment is as follows:
> - I filled an HBase region with 700MB of data (each row has roughly 45
> columns, and the entire row is about 20KB)
> - I configured the region to hold 4GB (therefore no split occurs)
> - I ran compactions after the data was loaded and made sure that there is
> only 1 region in the table under test.
> - No other table exists in the hbase cluster because this is a DEV
> environment
> - I'm using HBase 0.92.1
> The test is very basic. I use HBaseClient to scan the entire region to
> retrieve all rows and all columns in the table, just iterating over all
> KeyValue pairs until it is done. It took about 1 minute 22 sec to complete.
> (Note that I disabled the block cache and used a caching size of about
> 10000.)
> I ran another test using HFileReaderV2 to scan the entire region and
> retrieve all rows and all columns, again just iterating over all KeyValue
> pairs until it is done. It took 11 sec.
> The performance difference is dramatic (about 7.5 times faster using
> HFileReaderV2).
> I want to know why the difference is so big, or whether I didn't configure
> HBase properly. From this experiment, HDFS can deliver the data
> efficiently, so it is not the bottleneck.
> Any help is appreciated!
> Jerry
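
The figures quoted above can be turned into throughput numbers with a short
calculation. This is only a back-of-the-envelope sketch: it takes the 700MB,
20KB, 82 sec and 11 sec values from the message as given, and it assumes the
"caching size" refers to the number of rows fetched per scanner call. The
variable and function names are made up for illustration.

```python
DATA_MB = 700            # region size reported in the message
ROW_KB = 20              # approximate size of one row
SCAN_SECONDS = 82        # 1 minute 22 sec for the client scan
HFILE_SECONDS = 11       # direct read with HFileReaderV2
SCANNER_CACHING = 10000  # assumed to mean rows fetched per scanner call


def throughput_mb_per_s(size_mb, seconds):
    """Average read throughput in MB/s."""
    return size_mb / seconds


speedup = SCAN_SECONDS / HFILE_SECONDS         # ratio of the two run times
rows = DATA_MB * 1024 // ROW_KB                # approximate row count
mb_per_batch = SCANNER_CACHING * ROW_KB / 1024  # data carried by one batch

print(f"client scan: {throughput_mb_per_s(DATA_MB, SCAN_SECONDS):.1f} MB/s")
print(f"HFile read:  {throughput_mb_per_s(DATA_MB, HFILE_SECONDS):.1f} MB/s")
print(f"speedup:     {speedup:.2f}x")
print(f"rows:        {rows}")
print(f"batch size:  {mb_per_batch:.1f} MB")
```

Under these assumptions the client scan averages about 8.5 MB/s against
about 63.6 MB/s for the direct read, and a caching value of 10000 implies
roughly 195 MB per batch across only a handful of batches.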
