Return-Path: Delivered-To: apmail-hadoop-hbase-issues-archive@minotaur.apache.org Received: (qmail 71435 invoked from network); 8 Apr 2010 21:23:00 -0000 Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 8 Apr 2010 21:23:00 -0000 Received: (qmail 26515 invoked by uid 500); 8 Apr 2010 21:23:00 -0000 Delivered-To: apmail-hadoop-hbase-issues-archive@hadoop.apache.org Received: (qmail 26493 invoked by uid 500); 8 Apr 2010 21:23:00 -0000 Mailing-List: contact hbase-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list hbase-issues@hadoop.apache.org Received: (qmail 26480 invoked by uid 99); 8 Apr 2010 21:23:00 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 08 Apr 2010 21:23:00 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 08 Apr 2010 21:22:57 +0000 Received: from brutus.apache.org (localhost [127.0.0.1]) by brutus.apache.org (Postfix) with ESMTP id B3BFC234C48D for ; Thu, 8 Apr 2010 21:22:36 +0000 (UTC) Message-ID: <1395656772.21501270761756735.JavaMail.jira@brutus.apache.org> Date: Thu, 8 Apr 2010 21:22:36 +0000 (UTC) From: "stack (JIRA)" To: hbase-issues@hadoop.apache.org Subject: [jira] Commented: (HBASE-2248) Provide new non-copy mechanism to assure atomic reads in get and scan MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 X-Virus-Checked: Checked by ClamAV on apache.org [ https://issues.apache.org/jira/browse/HBASE-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855160#action_12855160 ] stack commented on HBASE-2248: ------------------------------ Why do this? {code} - List results = new ArrayList(); - store.get(get, null, results); - return new Result(results); + get.addFamily(family); + return get(get, null); {code} Is it because of this....up in HRegion: {code} + // The reason why we set it up high is so that each HRegionScanner only + // has a single read point for all its sub-StoreScanners. + ReadWriteConsistencyControl.resetThreadReadPoint(rwcc); {code} This bit of the patch looks like its breaking the accumulation of qualifiers: {code} - List result = new ArrayList(1); - Get g = new Get(kv.getRow()); - g.setMaxVersions(count); - NavigableSet qualifiers = - new TreeSet(Bytes.BYTES_COMPARATOR); - qualifiers.add(qual); - get(store, g, qualifiers, result); + Get get = new Get(kv.getRow()); + get.setMaxVersions(count); + get.addColumn(family, qual); + + List result = get(get); + {code} Or is this no longer needed because we go back into Region.get rather than do a Store.get? I don't like this if clause style: + if (w != null) + rwcc.completeMemstoreInsert(w); I'd suggest you either wrap it in params or put it all on the one line. For example, undo changes like this I'd say: {code} - if(lockid == null) releaseRowLock(lid); + if(lockid == null) + releaseRowLock(lid); {code} I think this needs a comment: + private int isScan; or maybe where its assigned so its clear why it can't be a boolean though its named as though it were one... maybe change its name? I don't get this: {code} + // TODO call the proper GET API // Get the old value: Get get = new Get(row); {code} or this: {code} + //noinspection SuspiciousMethodCalls {code} Why this change Ryan? {code} -class KeyValueSkipListSet implements NavigableSet, Cloneable { +class KeyValueSkipListSet implements NavigableSet { {code} Its crazy how much code you've removed from MemStore around #195. Ok, thats enough for now > Provide new non-copy mechanism to assure atomic reads in get and scan > --------------------------------------------------------------------- > > Key: HBASE-2248 > URL: https://issues.apache.org/jira/browse/HBASE-2248 > Project: Hadoop HBase > Issue Type: Bug > Affects Versions: 0.20.3 > Reporter: Dave Latham > Priority: Blocker > Fix For: 0.20.4 > > Attachments: HBASE-2248-demonstrate-previous-impl-bugs.patch, HBASE-2248-GetsAsScans3.patch, HBASE-2248-rr-alpha1.txt, HBASE-2248-rr-alpha2.txt, HBASE-2248-rr-alpha3.txt, HBASE-2248-ryan.patch, hbase-2248.gc, HBASE-2248.patch, hbase-2248.txt, readownwrites-lost.2.patch, readownwrites-lost.patch, Screen shot 2010-02-23 at 10.33.38 AM.png, threads.txt > > > HBASE-2037 introduced a new MemStoreScanner which triggers a ConcurrentSkipListMap.buildFromSorted clone of the memstore and snapshot when starting a scan. > After upgrading to 0.20.3, we noticed a big slowdown in our use of short scans. Some of our data repesent a time series. The data is stored in time series order, MR jobs often insert/update new data at the end of the series, and queries usually have to pick up some or all of the series. These are often scans of 0-100 rows at a time. To load one page, we'll observe about 20 such scans being triggered concurrently, and they take 2 seconds to complete. Doing a thread dump of a region server shows many threads in ConcurrentSkipListMap.biuldFromSorted which traverses the entire map of key values to copy it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.