hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7279) Avoid copying the rowkey in RegionScanner, StoreScanner, and ScanQueryMatcher
Date Wed, 05 Dec 2012 06:08:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510295#comment-13510295
] 

Lars Hofhansl commented on HBASE-7279:
--------------------------------------

You leave out the timestamp cache, or leave out the change that removes the timestamp cache?
:)  I can go either way.

However, 8 bytes is not insignificant (the rest of a KV is just 16 + 24 + 4 + 4 + 4 + 8 =
52). (makes me want to remove the keyLength cache as well for another 4 bytes)

At Salesforce we're doing some scans over close to 1bn rows/kvs (most of which won't be shipped
to the client).
The issue with the timestamp cache is that it will use 8 bytes, whether we cache anything
or not. Over the 1bn KVs we'll produce 8GB of garbage just for this cache. 

I would like to put this into 0.94 as well.

                
> Avoid copying the rowkey in RegionScanner, StoreScanner, and ScanQueryMatcher
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-7279
>                 URL: https://issues.apache.org/jira/browse/HBASE-7279
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.96.0, 0.94.4
>
>         Attachments: 7279-0.94.txt
>
>
> Did some profiling again.
> I we can gain some performance [1] when passing buffer, rowoffset, and rowlength instead
of making a copy of the row key.
> That way we can also remove the row key caching (and this patch also removes the timestamps
caching). Considering the sheer number in which we create KVs, every byte save is good.
> [1] (15-20% when data is in the block cache we setup a Filter such that only a single
row is returned to the client).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message