hbase-issues mailing list archives

From "nkeywal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-1938) Make in-memory table scanning faster
Date Thu, 28 Jul 2011 10:24:10 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072284#comment-13072284 ]

nkeywal commented on HBASE-1938:
--------------------------------

Ok, I understand what's going on.

The readPoint enhancement aims to call the TLS once instead of twice per call to next().

This should work well when the kvset and snapshot lists are not empty. However, in the unit
test, the snapshot list is empty, so we were already calling the TLS only once before.
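
As a rough illustration of the point above (a sketch only, not the MemStore code): it
assumes "TLS" means a thread-local read of the read point, and every name in it
(ReadPointSketch, Entry, nextPerList, nextCached, readsFor) is hypothetical. Two small
lists stand in for kvset and snapshot, and a counter records how many thread-local reads
one next()-style call performs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical model, not HBase code: two key-ordered lists stand in for
// the kvset and snapshot lists, and a counter records thread-local reads.
class ReadPointSketch {
    static final class Entry {
        final int key;   // sort key
        final long ts;   // write timestamp compared against the read point
        Entry(int key, long ts) { this.key = key; this.ts = ts; }
    }

    // Assumed read point of 10 for the current thread.
    private static final ThreadLocal<Long> READ_POINT = ThreadLocal.withInitial(() -> 10L);
    static int tlsReads = 0;

    // Every call goes through the thread-local and is counted.
    static long threadReadPoint() {
        tlsReads++;
        return READ_POINT.get();
    }

    // Returns the first entry visible at readPoint, or null if there is none.
    static Entry firstVisible(Iterator<Entry> it, long readPoint) {
        while (it.hasNext()) {
            Entry e = it.next();
            if (e.ts <= readPoint) return e;
        }
        return null;
    }

    // Picks the entry with the smaller key, tolerating nulls.
    static Entry lower(Entry a, Entry b) {
        if (a == null) return b;
        if (b == null) return a;
        return a.key <= b.key ? a : b;
    }

    // Variant 1: each non-empty list does its own thread-local read.
    static Entry nextPerList(List<Entry> kvset, List<Entry> snapshot) {
        Entry a = kvset.isEmpty() ? null : firstVisible(kvset.iterator(), threadReadPoint());
        Entry b = snapshot.isEmpty() ? null : firstVisible(snapshot.iterator(), threadReadPoint());
        return lower(a, b);
    }

    // Variant 2: one thread-local read per call, shared by both lists.
    static Entry nextCached(List<Entry> kvset, List<Entry> snapshot) {
        long readPoint = threadReadPoint();
        return lower(firstVisible(kvset.iterator(), readPoint),
                     firstVisible(snapshot.iterator(), readPoint));
    }

    // Resets the counter, runs one variant, and returns the number of reads.
    static int readsFor(boolean cached, List<Entry> kvset, List<Entry> snapshot) {
        tlsReads = 0;
        if (cached) nextCached(kvset, snapshot); else nextPerList(kvset, snapshot);
        return tlsReads;
    }

    public static void main(String[] args) {
        List<Entry> kvset = Arrays.asList(new Entry(1, 12L), new Entry(2, 7L));
        List<Entry> snapshot = Arrays.asList(new Entry(0, 15L), new Entry(3, 4L));
        List<Entry> empty = Collections.emptyList();
        System.out.println(readsFor(false, kvset, empty));    // 1
        System.out.println(readsFor(true, kvset, empty));     // 1
        System.out.println(readsFor(false, kvset, snapshot)); // 2
        System.out.println(readsFor(true, kvset, snapshot));  // 1
    }
}
```

In this model the two variants differ only when both lists are non-empty, which is why a
test with an empty snapshot would not show any gain.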

I will write a second test to highlight the difference.


That said, since Andrew and Todd think that a change to readPoint could alter the consistency
behavior, I don't think the modification is worth doing.

So if you agree, I will:
- write a test with a non-empty snapshot
- provide a patch on MemStore with all the changes except the readPoint one

The iterator change alone already gives a big boost to scan performance.

> Make in-memory table scanning faster
> ------------------------------------
>
>                 Key: HBASE-1938
>                 URL: https://issues.apache.org/jira/browse/HBASE-1938
>             Project: HBase
>          Issue Type: Improvement
>          Components: performance
>            Reporter: stack
>            Assignee: nkeywal
>            Priority: Blocker
>             Fix For: 0.90.4, 0.92.0
>
>         Attachments: 20110726_1938_KeyValueSkipListSet.patch, 20110726_1938_MemStore.patch,
>                      20110726_1938_MemStoreScanPerformance.java, MemStoreScanPerformance.java,
>                      MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch
>
>
> This issue is about profiling hbase to see if I can make hbase scans run faster when
> all is up in memory.  Talking to some users, they are seeing about 1/4 million rows a second.
> It should be able to go faster than this (Scanning an array of objects, they can do about
> 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
