Date: Mon, 6 Jul 2009 14:24:43 -0400 (EDT)
From: Irfan Mohammed
To: hbase-dev@hadoop.apache.org
Message-ID: <31211275.2901246904679765.JavaMail.irfan@damascus>
In-Reply-To: <7c962aed0907061114s19ff8c29w8fb070f40455712f@mail.gmail.com>
Subject: Re: performance help

The input is 1 file.

These are 4 different tables: "txn_m1", "txn_m2", "txn_m3", "txn_m4". To me, it looks like there is always 1 region per table, and these tables are always on different regionservers. I have never seen the same table on different regionservers.

Does that sound right?

----- Original Message -----
From: "stack"
To: hbase-dev@hadoop.apache.org
Sent: Monday, July 6, 2009 2:14:43 PM GMT -05:00 US/Canada Eastern
Subject: Re: performance help

On Mon, Jul 6, 2009 at 11:06 AM, Irfan Mohammed wrote:

> I am working on writing to HDFS files. Will update you by end of day today.
>
> There are always 10 concurrent mappers running. I keep setting
> setNumMaps(5) and also the following properties in mapred-site.xml to 3, but
> still end up running 10 concurrent maps.
>

Is your input ten files?

> There are 5 regionservers and the online regions are as follows:
>
> m1 : -ROOT-,,0
> m2 : txn_m1,,1245462904101
> m3 : txn_m4,,1245462942282
> m4 : txn_m2,,1245462890248
> m5 : .META.,,1
>      txn_m3,,1245460727203
>

So, that looks like 4 regions from table txn? So that's about 1 region per regionserver?

> I have setAutoFlush(false) and also writeToWal(false) with the same
> behaviour.
>

If you did the above and it still takes 10 minutes, then that would seem to rule out hbase (batching should have a big impact on uploads, and then setting writeToWAL to false should double throughput over whatever you were seeing previously).

St.Ack
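
A footnote on why batching is expected to matter so much for uploads. The sketch below is a toy model only, not the HBase client: the class `WriteBufferSketch`, its `flushThreshold` parameter, and the row-count based flush rule are all hypothetical, chosen just to show how buffering puts on the client reduces the number of round trips to a server. It assumes one simulated round trip per flush and says nothing about the write-ahead log.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of client-side write buffering. This is NOT the HBase client;
// every name here is hypothetical and exists only for illustration.
class WriteBufferSketch {
    private final int flushThreshold;               // rows held before one batched send
    private final List<String> buffer = new ArrayList<>();
    private int roundTrips = 0;                     // simulated calls to the server

    WriteBufferSketch(int flushThreshold) {
        if (flushThreshold < 1) {
            throw new IllegalArgumentException("flushThreshold must be >= 1");
        }
        this.flushThreshold = flushThreshold;
    }

    // Queue one row; send the whole buffer once it reaches the threshold.
    void put(String row) {
        buffer.add(row);
        if (buffer.size() >= flushThreshold) {
            flush();
        }
    }

    // Send whatever is buffered as a single simulated round trip.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        roundTrips++;
        buffer.clear();
    }

    int roundTrips() {
        return roundTrips;
    }

    // Write `rows` rows and return how many round trips that took.
    static int roundTripsFor(int rows, int flushThreshold) {
        WriteBufferSketch writer = new WriteBufferSketch(flushThreshold);
        for (int i = 0; i < rows; i++) {
            writer.put("row-" + i);
        }
        writer.flush();                             // push out the final partial batch
        return writer.roundTrips();
    }

    public static void main(String[] args) {
        // A threshold of 1 behaves like flushing on every put.
        System.out.println("flush per put : " + roundTripsFor(10000, 1));
        // A larger threshold behaves like a buffered, batched upload.
        System.out.println("batched (500) : " + roundTripsFor(10000, 500));
    }
}
```

In this model, 10,000 rows cost 10,000 round trips when every put is flushed and 20 when 500 rows are sent at a time. If an upload with batching enabled takes as long as one without it, the time is probably being spent somewhere other than in these client round trips, which is the reasoning behind ruling out hbase above.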