From: "Ma, Ming"
To: "user@hbase.apache.org"
Date: Sun, 28 Aug 2011 22:58:46 -0600
Subject: RE: Number of map jobs per region

Dhaval,

You might find https://issues.apache.org/jira/browse/HBASE-4063 useful when it is ready. Of course, you can always use your own customized version of TableInputFormat. https://issues.apache.org/jira/browse/HBASE-4039 allows you to provide your own TableInputFormat to TableMapReduceUtil.

Ming

-----Original Message-----
From: Dhaval Makawana [mailto:dhaval.makawana@gmail.com]
Sent: Sunday, August 28, 2011 2:06 AM
To: user@hbase.apache.org
Subject: Number of map jobs per region

Hi,

We have 31 regions for a table in our HBase system, and hence scanning the table via TableMapper creates 31 maps. The following line from the documentation explains why:

"Reading from HBase, the TableInputFormat asks HBase for the list of regions and makes a map-per-region or mapred.map.tasks maps, whichever is smaller" ( http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html )

Each region file is almost 7 GB (LZO-compressed data), and the map tasks are taking a very long time to process the data. Is there any way to increase parallelism (allocate more maps per region)?

Regards,
Dhaval