Date: Sat, 23 Jul 2011 22:23:09 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Message-ID: <955019478.1523.1311459789996.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-1938) Make in-memory table scanning faster

    [ https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070067#comment-13070067 ]

stack commented on HBASE-1938:
------------------------------

bq. I modified the unit test to make it work with the trunk as it is today (new file attached).

Thanks. Reviewing it, one thing you might want to do is study classes in hbase to get the gist of the hadoop/hbase style. Notice how they use two spaces for tabs and ~80 chars a line. But that's for the future. Not important here.

You just need to make sure your KVs have a readPoint that is less than the current readPoint. It looks like you are making KVs w/o setting memstorets. The default is then used, and it's zero. The default read point is also zero. The compare is <=, so it looks like you don't need to set the read point at all. What you have should do no harm.

Your new test class seems fine. Would be nice to add more tests. As the memstore data structure grows, everything slows. Another issue is about hacking on the ConcurrentSkipListSet that is the memstore to make it better suited to our accesses and perhaps to make it go faster (it's public domain when you dig down into the java src).

bq. On a scan the "next()" part, the hbase currently compare the value of two internals iterators. In this test, the second list is always empty, hence the cost on comparator is lowered vs. real life.

What is this that you are referring to? Is it this?

KeyValue kv = scanner.next();

bq. But I don't think it worth a patch just for this (it should be included in a bigger patch however).

Up to you, but yes, the above is probably the way to go.

Thanks N.

> Make in-memory table scanning faster
> ------------------------------------
>
> Key: HBASE-1938
> URL: https://issues.apache.org/jira/browse/HBASE-1938
> Project: HBase
> Issue Type: Improvement
> Components: performance
> Reporter: stack
> Assignee: stack
> Priority: Blocker
> Attachments: MemStoreScanPerformance.java, MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch
>
>
> This issue is about profiling hbase to see if I can make hbase scans run faster when all is up in memory.
> Talking to some users, they are seeing about 1/4 million rows a second. It should be able to go faster than this (scanning an array of objects, they can do about 4-5x this).
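
For readers following the read-point and two-iterator discussion in the comment, here is a rough sketch of the idea. It is not HBase code: `MemStoreSketch`, `Cell`, `isVisible`, `advance`, and `scan` are hypothetical names made up for illustration, and `Cell` is only a stand-in for a KeyValue carrying a memstore timestamp. It assumes visibility is decided by a `<=` compare against the read point, as described in the comment, and that next() picks the smaller head of two sorted iterators.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch, not the HBase implementation.
final class MemStoreSketch {

    /** Stand-in for a KeyValue: a row key plus its memstore timestamp. */
    static final class Cell {
        final String row;
        final long memstoreTs; // left at 0 when the caller never sets it

        Cell(String row, long memstoreTs) {
            this.row = row;
            this.memstoreTs = memstoreTs;
        }
    }

    /** Assumed visibility rule: the compare is <=, so 0 is visible at read point 0. */
    static boolean isVisible(Cell cell, long readPoint) {
        return cell.memstoreTs <= readPoint;
    }

    /** Returns the next visible cell from the iterator, or null when exhausted. */
    static Cell advance(Iterator<Cell> it, long readPoint) {
        while (it.hasNext()) {
            Cell candidate = it.next();
            if (isVisible(candidate, readPoint)) {
                return candidate;
            }
        }
        return null;
    }

    /** Merges two row-sorted lists, emitting visible rows in sorted order. */
    static List<String> scan(List<Cell> first, List<Cell> second, long readPoint) {
        Iterator<Cell> a = first.iterator();
        Iterator<Cell> b = second.iterator();
        Cell nextA = advance(a, readPoint);
        Cell nextB = advance(b, readPoint);
        List<String> rows = new ArrayList<>();
        while (nextA != null || nextB != null) {
            // When one side is empty, the key comparison is skipped entirely.
            boolean takeA = nextB == null
                || (nextA != null && nextA.row.compareTo(nextB.row) <= 0);
            if (takeA) {
                rows.add(nextA.row);
                nextA = advance(a, readPoint);
            } else {
                rows.add(nextB.row);
                nextB = advance(b, readPoint);
            }
        }
        return rows;
    }
}
```

In this sketch an empty second list means the key comparator never runs, which is one way to read the remark that a test with an always-empty second list understates the comparator cost seen in real life.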