Message-ID: <1582764576.16031264443694698.JavaMail.jira@brutus.apache.org>
Date: Mon, 25 Jan 2010 18:21:34 +0000 (UTC)
From: "stack (JIRA)"
To: hbase-dev@hadoop.apache.org
Reply-To: hbase-dev@hadoop.apache.org
Subject: [jira] Updated: (HBASE-2167) PE for IHBase
In-Reply-To: <1854715401.15621264443214511.JavaMail.jira@brutus.apache.org>

     [ https://issues.apache.org/jira/browse/HBASE-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-2167:
-------------------------

    Attachment: IdxPerformanceEvaluation.patch

Adds
a modification to the PerformanceEvaluation class to make it a more extensible performance evaluation platform. It also adds a new command, 'filterScan', which, as the name suggests, performs scans using a filter.

To run the test you'll need to:

 1. Include the contrib jars:
      export HBASE_CLASSPATH=`find /home/stack/tmp/hadoop-hbase/hbase-0.20.3/contrib/indexed -name '*jar' | tr -s "\n" ":"`
 2. Set the 'hbase.hregion.impl' property to 'org.apache.hadoop.hbase.regionserver.IdxRegion' in your hbase-site.xml
 3. bin/hbase org.apache.hadoop.hbase.IdxPerformanceEvaluation randomWrite 1
 4. bin/hbase org.apache.hadoop.hbase.IdxPerformanceEvaluation filterScan 1
 5. bin/hbase org.apache.hadoop.hbase.IdxPerformanceEvaluation idxFilterScan 1

PE, with its "large, random" values, sits toward the wrong end of the spectrum as regards what suits IHBase. Indexing such values uses loads of RAM, and writes are slowed because each 'large' value has to be inserted into the index. If a user did have PE-like values, then suggest that the user extract a portion of the value (like the first 10 bytes) into a separate column qualifier and index that. It would still provide a HUGE performance boost to scans without the huge memory footprint (and writes would be slowed much less).

Here are some initial times to complete 20 scans for 20 random values on a single-node cluster with 1.5GB of memory allocated to the RS VM:

Without an index: 732989ms at offset 0 for 1048576 rows
With an index:    2160ms at offset 0 for 1048576 rows

> PE for IHBase
> -------------
>
>                 Key: HBASE-2167
>                 URL: https://issues.apache.org/jira/browse/HBASE-2167
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: stack
>             Fix For: 0.20.4
>
>         Attachments: IdxPerformanceEvaluation.patch
>
>
> Add a PE that can be used by IHBase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
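
A minimal sketch of the prefix-extraction suggestion in the comment above: index the first 10 bytes of a large value instead of the whole value. The class and method names here are hypothetical and are not part of HBase, IHBase, or the attached patch. Writing the prefix to its own column qualifier and declaring an index on that qualifier are deliberately left out.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of HBase or IHBase. */
final class ValuePrefix {
    private ValuePrefix() {}

    /**
     * Returns a copy of the first prefixLength bytes of value,
     * or a copy of the whole value if it is shorter than that.
     * The result is what would be stored in the separate, indexed
     * column qualifier, so the index holds small keys rather than
     * the full "large, random" value.
     */
    static byte[] indexPrefix(byte[] value, int prefixLength) {
        if (prefixLength < 0) {
            throw new IllegalArgumentException("prefixLength must be >= 0");
        }
        return Arrays.copyOf(value, Math.min(value.length, prefixLength));
    }
}
```

For a 1000-byte value, ValuePrefix.indexPrefix(value, 10) yields a 10-byte array. How selective such a prefix is depends on the data, so the right length is something to measure rather than assume.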