Message-ID: <1793846036.465401266945747918.JavaMail.jira@brutus.apache.org>
Date: Tue, 23 Feb 2010 17:22:27 +0000 (UTC)
From: "Dave Latham (JIRA)"
To: hbase-dev@hadoop.apache.org
Reply-To: hbase-dev@hadoop.apache.org
Subject: [jira] Commented: (HBASE-2248) New MemStoreScanner copies memstore for each scan, makes short scans slow
In-Reply-To: <174857300.448551266881787938.JavaMail.jira@brutus.apache.org>
Mailing-List: contact hbase-dev-help@hadoop.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

[ 
https://issues.apache.org/jira/browse/HBASE-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837326#action_12837326 ]

Dave Latham commented on HBASE-2248:
------------------------------------

Thanks, Dan, and others for looking into this issue.

The table where we were seeing these slow scans was definitely a tall, narrow table. Each row has one cell; the column family and qualifier are each one byte. The row varies, but is typically 8-20 bytes, and the value is usually 4 bytes or less. Most common is probably row - 12 bytes, col fam - 1 byte, qualifier - 1 byte, value - 3 bytes, giving 17 bytes plus overhead.

As I was trying to understand the discrepancy between the PE results you mentioned and what I've observed, I looked into PerformanceEvaluation. It looks like the timer only starts after the scanner is constructed, which means that the MemStore clone isn't being timed as part of the test, so that would probably explain why the test seems fast. Just reasoning about it, it seems hard to believe that ConcurrentSkipListMap.buildFromSorted could complete a million iterations that fast.

> New MemStoreScanner copies memstore for each scan, makes short scans slow
> -------------------------------------------------------------------------
>
> Key: HBASE-2248
> URL: https://issues.apache.org/jira/browse/HBASE-2248
> Project: Hadoop HBase
> Issue Type: Bug
> Affects Versions: 0.20.3
> Reporter: Dave Latham
> Fix For: 0.20.4
>
> Attachments: threads.txt
>
>
> HBASE-2037 introduced a new MemStoreScanner which triggers a ConcurrentSkipListMap.buildFromSorted clone of the memstore and snapshot when starting a scan.
> After upgrading to 0.20.3, we noticed a big slowdown in our use of short scans. Some of our data represent a time series. The data is stored in time series order, MR jobs often insert/update new data at the end of the series, and queries usually have to pick up some or all of the series.
> These are often scans of 0-100 rows at a time. To load one page, we'll observe about 20 such scans being triggered concurrently, and they take 2 seconds to complete. Doing a thread dump of a region server shows many threads in ConcurrentSkipListMap.buildFromSorted, which traverses the entire map of key values to copy it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
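
To illustrate the two points made in the comment (a full copy of the map per scan, and a timer that starts only after the scanner exists), here is a minimal Java sketch. It is not HBase code: the class ShortScanSketch and its methods buildStore, scanWithCopy, scanWithView and readRows are hypothetical names, and a plain ConcurrentSkipListMap of strings stands in for the memstore. The only assumption about the JDK is that copying a sorted map through the ConcurrentSkipListMap(SortedMap) constructor has to visit every entry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class ShortScanSketch {

    // Hypothetical stand-in for a memstore: sorted row keys mapped to small values.
    static ConcurrentSkipListMap<String, Integer> buildStore(int rows) {
        ConcurrentSkipListMap<String, Integer> store = new ConcurrentSkipListMap<>();
        for (int i = 0; i < rows; i++) {
            store.put(String.format("row%09d", i), i);
        }
        return store;
    }

    // Short scan that first copies the whole map, then reads at most `limit` rows.
    // The copy visits every entry, so its cost grows with the store size, not the scan size.
    static int scanWithCopy(ConcurrentSkipListMap<String, Integer> store, String startRow, int limit) {
        ConcurrentSkipListMap<String, Integer> copy = new ConcurrentSkipListMap<>(store);
        return readRows(copy.tailMap(startRow, true), limit);
    }

    // Short scan over a tailMap view of the live map; nothing is copied.
    static int scanWithView(ConcurrentSkipListMap<String, Integer> store, String startRow, int limit) {
        return readRows(store.tailMap(startRow, true), limit);
    }

    // Reads up to `limit` entries from the view and returns how many were read.
    static int readRows(ConcurrentNavigableMap<String, Integer> view, int limit) {
        int count = 0;
        for (Map.Entry<String, Integer> entry : view.entrySet()) {
            if (count == limit) {
                break;
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<String, Integer> store = buildStore(1_000_000);
        String startRow = String.format("row%09d", 500_000);

        // The timed region includes scanner setup (the copy), unlike a timer
        // that is started only after the scanner has been constructed.
        long t0 = System.nanoTime();
        int copied = scanWithCopy(store, startRow, 100);
        long t1 = System.nanoTime();
        int viewed = scanWithView(store, startRow, 100);
        long t2 = System.nanoTime();

        System.out.println("copy-then-scan: " + copied + " rows, " + (t1 - t0) / 1_000 + " us");
        System.out.println("view scan:      " + viewed + " rows, " + (t2 - t1) / 1_000 + " us");
    }
}
```

Both scans return the same rows; only the setup cost differs. Moving the first System.nanoTime() call to after the copy would hide that cost, which is the effect suspected in PerformanceEvaluation above. The timings printed depend on the machine and JVM warm-up, so they should be read as a rough indication only.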