Date: Fri, 3 Sep 2010 14:04:33 -0400 (EDT)
From: "Benoit Sigoure (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] Updated: (HBASE-2959) Scanning always starts at the beginning of a row

     [ https://issues.apache.org/jira/browse/HBASE-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benoit Sigoure updated HBASE-2959:
----------------------------------

    Description:
In HBASE-2248, the code in {{HRegion#get}} was changed like so:
{code}
-  private void get(final Store store, final Get get,
-    final NavigableSet qualifiers, List result)
-  throws IOException {
-    store.get(get, qualifiers, result);
+  /*
+   * Do a get based on the get parameter.
+   */
+  private List get(final Get get) throws IOException {
+    Scan scan = new Scan(get);
+
+    List results = new ArrayList();
+
+    InternalScanner scanner = null;
+    try {
+      scanner = getScanner(scan);
+      scanner.next(results);
+    } finally {
+      if (scanner != null)
+        scanner.close();
+    }
+    return results;
   }
{code}

So instead of doing a {{get}} straight on the {{Store}}, we now open a scanner. The problem is that we eventually end up in {{ScanQueryMatcher}}, where the constructor does: {{this.startKey = KeyValue.createFirstOnRow(scan.getStartRow());}}. This means that if we have a very wide row (thousands of columns), the scanner has to go through thousands of {{KeyValue}}s before finding the right entry, because it always starts from the beginning of the row, whereas before, the {{get}} went directly to the {{Store}}.

This problem stayed under the radar for a while because the overhead isn't too unreasonable, but later on, {{incrementColumnValue}} (ICV) was changed to do a {{get}} under the hood. At StumbleUpon we do thousands of ICVs per second, so thousands of times per second we're scanning some really wide rows. When a row is contended, all the IPC threads end up stuck acquiring the row lock while one thread is doing its ICV (slowly, due to the excessive scanning). When all IPC threads are stuck, the region server is unable to serve more requests.

As a nice side effect, fixing this bug will make {{get}} and {{incrementColumnValue}} faster, as well as the first call to {{next}} on a scanner.
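
To make the cost concrete, here is a minimal toy model of the behaviour described above. It is not HBase code: the class {{WideRowSeekSketch}} and its methods are hypothetical names invented for illustration, and a sorted list of column qualifiers stands in for the {{KeyValue}}s of one wide row. It only contrasts walking from the first entry of the row with seeking straight to the requested column.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Toy model of one wide row as a sorted list of qualifiers (assumption: not HBase code). */
public class WideRowSeekSketch {

    /** Builds a row of n sorted qualifiers: "col00000", "col00001", ... */
    public static List<String> buildRow(int n) {
        List<String> row = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            row.add(String.format(Locale.ROOT, "col%05d", i));
        }
        return row;
    }

    /**
     * Mimics a scanner positioned at the first entry of the row: walks forward
     * until it reaches the wanted qualifier. Returns how many entries it examined.
     */
    public static int scanFromRowStart(List<String> row, String qualifier) {
        int examined = 0;
        for (String q : row) {
            examined++;
            if (q.compareTo(qualifier) >= 0) {
                break; // reached (or passed) the wanted column
            }
        }
        return examined;
    }

    /** Seeks directly to the qualifier with a binary search. Returns its index, or -1. */
    public static int seekDirect(List<String> row, String qualifier) {
        int idx = Collections.binarySearch(row, qualifier);
        return idx >= 0 ? idx : -1;
    }

    public static void main(String[] args) {
        List<String> row = buildRow(10_000);
        String target = "col09000";
        System.out.println("entries examined from row start: " + scanFromRowStart(row, target));
        System.out.println("index found by direct seek: " + seekDirect(row, target));
    }
}
```

In this model the walk from the row start examines a number of entries proportional to the position of the column within the row, while the direct seek needs only a logarithmic number of comparisons. If every {{get}} and ICV pays the linear cost while holding a row lock, that would be consistent with the contention described above.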

> Scanning always starts at the beginning of a row
> ------------------------------------------------
>
>                 Key: HBASE-2959
>                 URL: https://issues.apache.org/jira/browse/HBASE-2959
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.20.4, 0.20.5, 0.20.6, 0.89.20100621
>            Reporter: Benoit Sigoure
>            Priority: Blocker