hbase-issues mailing list archives

From "Lars Hofhansl (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5001) Improve the performance of block cache keys
Date Sun, 11 Dec 2011 23:54:30 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167284#comment-13167284 ]

Lars Hofhansl commented on HBASE-5001:

* Bytes.add(hfileNameInBytes, Bytes.toBytes(offset)) -> 0.07us

But a byte[] cannot be used directly as a key in a map, no? It would need to be wrapped in HashedBytes, so:
* new HashedBytes(Bytes.add(x, Bytes.toBytes(offset))); -> 0.08us
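
To illustrate the point about raw arrays: Java arrays inherit identity-based equals and hashCode from Object, so two byte[] instances with the same contents act as different map keys. The following is a minimal sketch using only the JDK. BytesKey and buildKey are hypothetical names standing in for HashedBytes and Bytes.add(x, Bytes.toBytes(offset)); they are not the HBase classes themselves, and the UTF-8 encoding of the file name is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical wrapper that gives a byte[] content-based equals/hashCode. */
final class BytesKey {
    private final byte[] bytes;
    private final int hash; // computed once, at construction

    BytesKey(byte[] bytes) {
        this.bytes = bytes;
        this.hash = Arrays.hashCode(bytes);
    }

    /** Concatenates the file name bytes with the 8-byte big-endian offset. */
    static byte[] buildKey(String hfileName, long offset) {
        byte[] name = hfileName.getBytes(StandardCharsets.UTF_8); // encoding is an assumption
        return ByteBuffer.allocate(name.length + 8).put(name).putLong(offset).array();
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BytesKey && Arrays.equals(bytes, ((BytesKey) o).bytes);
    }

    @Override
    public int hashCode() {
        return hash;
    }
}
```

With this sketch, a HashMap lookup using a second, equal-content byte[] misses, while a lookup using new BytesKey(...) around the same bytes hits. The price is one array allocation and copy plus one wrapper object per lookup.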

That brought me to a new idea: what if we had a CacheKey object that takes a String and
a long:
* new CacheKey(hfileName, offset) -> 0.01us

That would be the cleanest design anyway. CacheKey would implement proper equals and hashCode methods.
The LruCache could then take CacheKey (or even just java.lang.Object) as its cache key, so
we can pass in whatever we like.
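
A rough sketch of what such a key class might look like follows. The field names and the hash combination are assumptions made for illustration, not a committed implementation, and the sketch assumes the file name is never null.

```java
/** Hypothetical composite cache key: HFile name plus block offset. */
final class CacheKey {
    private final String hfileName; // assumed non-null
    private final long offset;

    CacheKey(String hfileName, long offset) {
        this.hfileName = hfileName;
        this.offset = offset;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CacheKey)) return false;
        CacheKey other = (CacheKey) o;
        // Compare the cheap primitive first, then the string.
        return offset == other.offset && hfileName.equals(other.hfileName);
    }

    @Override
    public int hashCode() {
        // Fold the long into an int the same way java.lang.Long does.
        return 31 * hfileName.hashCode() + (int) (offset ^ (offset >>> 32));
    }
}
```

Constructing the key only stores a reference and a long, with no string concatenation or array copy, which would be consistent with the lower cost measured above. A map declared as Map<CacheKey, V> (or keyed on Object) can then look blocks up with new CacheKey(hfileName, offset).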

> Improve the performance of block cache keys
> -------------------------------------------
>                 Key: HBASE-5001
>                 URL: https://issues.apache.org/jira/browse/HBASE-5001
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.4
>            Reporter: Jean-Daniel Cryans
>            Priority: Minor
>             Fix For: 0.94.0
> Doing a pure random read test on data that's 100% block cache, I see that we are spending
> quite some time in getBlockCacheKey:
> {quote}
> "IPC Server handler 19 on 62023" daemon prio=10 tid=0x00007fe0501ff800 nid=0x6c87 runnable
>    java.lang.Thread.State: RUNNABLE
> 	at java.util.Arrays.copyOf(Arrays.java:2882)
> 	at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
> 	at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
> 	at java.lang.StringBuilder.append(StringBuilder.java:119)
> 	at org.apache.hadoop.hbase.io.hfile.HFile.getBlockCacheKey(HFile.java:457)
> 	at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:249)
> 	at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:209)
> 	at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:521)
> 	at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:536)
> 	at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:178)
> 	at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:111)
> 	at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekExactly(StoreFileScanner.java:219)
> 	at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:80)
> 	at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:1689)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:2857)
> {quote}
> Since the HFile name size is known and the offset is a long, it should be possible to
> allocate exactly what we need. Maybe use byte[] as the key and drop the separator too.


