hbase-dev mailing list archives

From Varun Sharma <va...@pinterest.com>
Subject Re: Poor HBase random read performance
Date Sat, 29 Jun 2013 22:39:41 GMT
So, I just major compacted the table, which initially had 3 store files, and
throughput went up roughly 3X, from 1.6M to 4M+ read ops.
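
For anyone trying to reproduce this, a major compaction can be kicked off
through the 0.94 admin API, roughly like this ("test_table" is a placeholder
name):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class CompactTable {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        // Compaction runs asynchronously; check store file counts in the
        // region server UI/metrics before re-running the read test.
        admin.majorCompact("test_table");
        admin.close();
      }
    }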

The tests I am running have 8-byte keys with ~80-100 byte values. Right now I
am working with a 64K block size; I am going to make it 8K and see if that
helps.
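
The block size is a per-column-family setting; changing it from the Java
client looks roughly like this against the 0.94 API (table and family names
are placeholders, and a major compaction is needed afterwards so existing
HFiles get rewritten with the new block size):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class SetBlockSize {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        // Placeholder table/family names; in practice start from the table's
        // existing descriptor so other family settings are preserved.
        HColumnDescriptor family = new HColumnDescriptor("d");
        family.setBlocksize(8 * 1024);  // 8K data blocks instead of 64K
        admin.disableTable("test_table");
        admin.modifyColumn("test_table", family);
        admin.enableTable("test_table");
        admin.close();
      }
    }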

The one point though is the IdLock mechanism - it seems to add a huge amount
of overhead (2x) - although in that test I was not caching index blocks in the
block cache, which means much higher contention on those blocks. I believe it
is used so that we don't load the same block twice from disk. I am wondering,
when IOPS are plentiful (SSDs, for example), whether we should have an option
to disable it, though I should probably reevaluate with index blocks in the
block cache.
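
For context, the point of that lock is that concurrent readers missing the
cache for the same block serialize on a per-block key, so the block is only
read from disk once. A minimal sketch of the general pattern (this is an
illustration, not the actual org.apache.hadoop.hbase.util.IdLock code):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.CountDownLatch;

    // Simplified "load each block only once" guard, keyed by block offset.
    public class BlockLoadGuard<V> {
      public interface Loader<T> {
        T load(long offset) throws Exception;
      }

      private final ConcurrentMap<Long, CountDownLatch> inFlight =
          new ConcurrentHashMap<Long, CountDownLatch>();
      private final ConcurrentMap<Long, V> cache =
          new ConcurrentHashMap<Long, V>();

      public V get(long blockOffset, Loader<V> loader) throws Exception {
        while (true) {
          V cached = cache.get(blockOffset);
          if (cached != null) {
            return cached;                       // already loaded by someone
          }
          CountDownLatch latch = new CountDownLatch(1);
          CountDownLatch existing = inFlight.putIfAbsent(blockOffset, latch);
          if (existing != null) {
            existing.await();                    // another thread is loading
            continue;                            // then re-check the cache
          }
          try {
            V loaded = loader.load(blockOffset); // we won the race: read disk
            cache.put(blockOffset, loaded);
            return loaded;
          } finally {
            inFlight.remove(blockOffset);
            latch.countDown();                   // wake up any waiters
          }
        }
      }
    }

The threads piling up in idLock.lockEntry() in the tests below suggest exactly
this kind of gate becoming the bottleneck when many random gets miss the cache.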


On Sat, Jun 29, 2013 at 3:24 PM, lars hofhansl <larsh@apache.org> wrote:

> Should also say that random reads this way are somewhat of a worst case
> scenario.
>
> If the working set is much larger than the block cache and the reads are
> random, then each read will likely have to bring in an entirely new block
> from the OS cache,
> even when the KVs are much smaller than a block.
>
> So in order to read a (say) 1k KV, HBase needs to bring in 64k (the default
> block size) from the OS cache.
> As long as the dataset fits into the block cache this difference in size
> has no performance impact, but as soon as the dataset does not fit, we have
> to bring much more data from the OS cache than we're actually interested in.
>
> Indeed in my test I found that HBase brings in about 60x the data size
> from the OS cache (used PE with ~1k KVs). This can be improved with smaller
> block sizes; and with a more efficient way to instantiate HFile blocks in
> Java (which we need to work on).
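
Back-of-the-envelope, that ~60x matches the block-size-to-value-size ratio:

    read amplification ~= block size / KV size
                        = 65536 / 1024 = 64x   (close to the ~60x observed)
    with 8K blocks:       8192 / 1024 = 8x

which is why smaller blocks should help for this kind of workload.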
>
>
> -- Lars
>
> ________________________________
> From: lars hofhansl <larsh@apache.org>
> To: "dev@hbase.apache.org" <dev@hbase.apache.org>
> Sent: Saturday, June 29, 2013 3:09 PM
> Subject: Re: Poor HBase random read performance
>
>
> I've seen the same bad performance behavior when I tested this on a real
> cluster. (I think it was in 0.94.6)
>
>
> Instead of en/disabling the block cache, I tested sequential and random
> reads on a data set that does not fit into the (aggregate) block cache.
> Sequential reads were drastically faster than random reads (7 vs 34
> minutes), which can really only be explained by the fact that the next
> get will, with high probability, hit an already cached block, whereas in
> the random read case it likely will not.
>
> In the RandomRead case I estimate that each RegionServer brings in between
> 100 and 200 MB/s from the OS cache. Even at 200 MB/s this would be quite
> slow. I understand that performance is bad when index/bloom blocks are not
> cached, but bringing in data blocks from the OS cache should be faster than
> it is.
>
>
> So this is something to debug.
>
> -- Lars
>
>
>
> ________________________________
> From: Varun Sharma <varun@pinterest.com>
> To: "dev@hbase.apache.org" <dev@hbase.apache.org>
> Sent: Saturday, June 29, 2013 12:13 PM
> Subject: Poor HBase random read performance
>
>
> Hi,
>
> I was doing some tests on how good HBase random reads are. The setup
> consists of a 1-node cluster with dfs replication set to 1. Short-circuit
> local reads and HBase checksums are enabled. The data set is small enough
> to be largely cached in the filesystem cache - 10G on a 60G machine.
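
For reference, the settings in play here are roughly the following (key names
are from the CDH4 / HBase 0.94 era docs; the short-circuit read setup needs a
couple of extra version-specific settings on top of this):

    # hdfs-site.xml
    dfs.replication = 1
    dfs.client.read.shortcircuit = true

    # hbase-site.xml
    hbase.regionserver.checksum.verify = true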
>
> The client sends out multi-get operations in batches of 10, and I try to
> measure throughput.
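
A batch of 10 gets issued against the 0.94 client API looks roughly like this
(table name and key generation are placeholders for the real test driver):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class MultiGetBatch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "test_table");  // placeholder name
        try {
          Random rnd = new Random();
          // Build one batch of 10 random 8-byte keys.
          List<Get> batch = new ArrayList<Get>(10);
          for (int i = 0; i < 10; i++) {
            batch.add(new Get(Bytes.toBytes(rnd.nextLong())));
          }
          // Issue the 10 gets as a single batched call.
          Result[] results = table.get(batch);
          System.out.println("got " + results.length + " results");
        } finally {
          table.close();
        }
      }
    }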
>
> Test #1
>
> All Data was cached in the block cache.
>
> Test Time = 120 seconds
> Num Read Ops = 12M
>
> Throughput = 100K per second
>
> Test #2
>
> I disable the block cache, but all the data is still in the file system
> cache. I verify this by making sure that IOPS on the disk drive are 0 during
> the test. I run the same test with batched ops.
>
> Test Time = 120 seconds
> Num Read Ops = 0.6M
> Throughput = 5K per second
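
For reference, the block cache can be bypassed either per request or per
column family; a rough sketch ("d" and the row key are placeholders):

    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BlockCacheSettings {
      static void sketch() {
        // Per request: blocks read for this Get are not added to the cache.
        Get get = new Get(Bytes.toBytes(12345L));
        get.setCacheBlocks(false);

        // Per column family: disable block caching for the family entirely
        // (applied with HBaseAdmin.modifyColumn, as in the block size example).
        HColumnDescriptor family = new HColumnDescriptor("d");
        family.setBlockCacheEnabled(false);
      }
    }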
>
> Test #3
>
> I saw that all the threads were now stuck in idLock.lockEntry(). So I now
> run with the lock disabled and the block cache disabled.
>
> Test Time = 120 seconds
> Num Read Ops = 1.2M
> Throughput = 10K per second
>
> Test #4
>
> I re-enable the block cache and this time hack HBase to cache only index
> and bloom blocks, while data blocks come from the file system cache.
>
> Test Time = 120 seconds
> Num Read Ops = 1.6M
> Throughput = 13K per second
>
> So, I wonder what causes such a massive drop in throughput. I know that the
> HDFS code adds tremendous overhead, but this seems pretty high to me. I use
> 0.94.7 and CDH 4.2.0.
>
> Thanks
> Varun
>
