Date: Tue, 26 Jun 2012 01:42:44 +0000 (UTC)
From: "Jieshan Bean (JIRA)"
To: issues@hbase.apache.org
Message-ID: <596054296.54551.1340674964619.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <870713301.4379.1339450663357.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Commented] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix

[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401094#comment-13401094 ]

Jieshan Bean commented on HBASE-6200:
-------------------------------------

bq. So if I read that right 1.000.000 KV compares take 100ms without the patch and 121ms with the patch.

To be exact, the method below was called 1,000,000 times, and each call performs two comparisons (2,000,000 comparisons in total):

{noformat}
private void assertKVLessWithoutRow(KeyValue.KeyComparator c, int common,
    KeyValue less, KeyValue greater) {
  // 'less' must sort before 'greater'...
  int cmp = c.compareIgnoringPrefix(common,
      less.getBuffer(), less.getOffset() + KeyValue.ROW_OFFSET, less.getKeyLength(),
      greater.getBuffer(), greater.getOffset() + KeyValue.ROW_OFFSET, greater.getKeyLength());
  assertTrue(cmp < 0);
  // ...and the reversed comparison must agree.
  cmp = c.compareIgnoringPrefix(common,
      greater.getBuffer(), greater.getOffset() + KeyValue.ROW_OFFSET, greater.getKeyLength(),
      less.getBuffer(), less.getOffset() + KeyValue.ROW_OFFSET, less.getKeyLength());
  assertTrue(cmp > 0);
}
{noformat}

Thanks, Lars & Ted.

> KeyComparator.compareWithoutRow can be wrong when families have the same prefix
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-6200
>                 URL: https://issues.apache.org/jira/browse/HBASE-6200
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.6, 0.92.1, 0.94.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jieshan Bean
>            Priority: Blocker
>             Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: 6200-trunk-v2.patch, HBASE-6200-90-v2.patch, HBASE-6200-90.patch, HBASE-6200-92-v2.patch, HBASE-6200-92.patch, HBASE-6200-94-v2.patch, HBASE-6200-94.patch, HBASE-6200-trunk-v2.patch, HBASE-6200-trunk.patch, PerformanceTestCase-6200-94.patch
>
>
> As reported by Desert Rose on IRC and on the ML, {{Result}} behaves oddly when some families share the same prefix. He posted a link to his code to show how it fails: http://pastebin.com/7TBA1XGh
> Basically, {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers, so "f:a" is said to be bigger than "f1:", which is false. The KVs are returned in the right order from the RS, but {{Result.binarySearch}} then uses {{KeyComparator.compareWithoutRow}}, which sorts differently, so the end result is undetermined.
> I added some debug output and I can see that the data is returned in the right order, but {{Arrays.binarySearch}} returns the wrong KV. That KV is then verified against the passed family and qualifier, the check fails, and null is returned.
> I don't know how common it is for users to have families with the same prefix, but those who do, and who use those families at the same time, will have big correctness issues. This is why I mark this as a blocker.
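
To make the failure mode in the description concrete, here is a minimal, self-contained sketch. It is not the HBase code: the Column class, the two comparators, and the find helper are hypothetical names made up for illustration, and the CONCATENATED comparator only models the behaviour described above (family and qualifier compared as one run of bytes), on the assumption that this is what makes "f:a" sort after "f1:". The data is sorted family-first, as the RS returns it, and then searched with each comparator.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;

public class FamilyPrefixDemo {

    /** Hypothetical stand-in for the column part of a KeyValue key. */
    static final class Column {
        final byte[] family;
        final byte[] qualifier;

        Column(String family, String qualifier) {
            this.family = family.getBytes(StandardCharsets.UTF_8);
            this.qualifier = qualifier.getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public String toString() {
            return new String(family, StandardCharsets.UTF_8) + ":"
                + new String(qualifier, StandardCharsets.UTF_8);
        }
    }

    /** Unsigned lexicographic byte comparison; a shorter prefix sorts first. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    static byte[] concat(byte[] a, byte[] b) {
        byte[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    /** Assumed model of the bug: family and qualifier treated as one byte run. */
    static final Comparator<Column> CONCATENATED =
        (x, y) -> compareBytes(concat(x.family, x.qualifier),
                               concat(y.family, y.qualifier));

    /** Family compared on its own first, qualifier only as a tie-breaker. */
    static final Comparator<Column> FAMILY_FIRST = (x, y) -> {
        int c = compareBytes(x.family, y.family);
        return c != 0 ? c : compareBytes(x.qualifier, y.qualifier);
    };

    /** Thin wrapper so the search comparator can be swapped easily. */
    static int find(Column[] sorted, Column key, Comparator<Column> c) {
        return Arrays.binarySearch(sorted, key, c);
    }

    public static void main(String[] args) {
        Column[] columns = {
            new Column("f", "a"), new Column("f", "z"), new Column("f1", "")
        };
        Arrays.sort(columns, FAMILY_FIRST); // the order the data arrives in
        Column key = new Column("f1", "");

        System.out.println("sorted: " + Arrays.toString(columns));
        // A negative index means "not found", even though the key is present.
        System.out.println("search with FAMILY_FIRST: " + find(columns, key, FAMILY_FIRST));
        System.out.println("search with CONCATENATED: " + find(columns, key, CONCATENATED));
    }
}
```

The point of the sketch is only that a binary search is reliable when it uses the same ordering the array was sorted with. Once the search comparator disagrees with the sort order on even one pair of keys, the search can walk away from an element that is actually there.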