From: Erik Holstad
To: hbase-user@hadoop.apache.org
Subject: Re: Map Reduce performance
Date: Wed, 24 Jun 2009 09:06:00 -0700

Hi Ramesh!

I have to agree with Tim about the size of your cluster. I'm honestly a little surprised that you are actually seeing MR run faster on a single node, since there you only get the downsides (job setup and so on) and none of the good stuff.

I looked at the code and it looks good. The job isn't really doing much, but it doesn't look like you are doing anything wrong. I do have some things you can think about, though, for when you get a bigger cluster up and running.

1. You might want to stay away from creating Text objects. We are internally trying to move away from all usage of Text in HBase and just use ImmutableBytesWritable or something like that.

2. Getting an HTable is expensive, so you might want to create a pool of those connections that you can share, so you don't have to get a new one for every task. I'm not 100% sure about the configure call, but I think it gives you one per call, so it might be worth looking into.

Erik
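
To make point 2 a bit more concrete, here is a rough sketch of the pooling idea in plain Java. It deliberately avoids the HBase API: HandlePool, FakeTable and PoolDemo are made-up names for illustration, and FakeTable only stands in for whatever is expensive to construct (an HTable in the real job). The point is just that the costly construction happens once per pool slot rather than once per task.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical fixed-size pool: builds its handles once and lends them out for reuse. */
final class HandlePool<T> {
    private final BlockingQueue<T> idle;

    HandlePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // pay the expensive setup cost up front
        }
    }

    /** Blocks until a handle is free, then hands it to the caller. */
    T borrow() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting for a handle", e);
        }
    }

    /** Returns a handle to the pool so the next task can reuse it. */
    void giveBack(T handle) {
        idle.offer(handle);
    }
}

/** Stand-in for an expensive resource (assumption: plays the role of an HTable). */
final class FakeTable {
    static final AtomicInteger created = new AtomicInteger();

    FakeTable() {
        created.incrementAndGet(); // count how often the costly constructor runs
    }
}

class PoolDemo {
    public static void main(String[] args) {
        HandlePool<FakeTable> pool = new HandlePool<>(2, FakeTable::new);
        // Five "tasks" run one after another, each borrowing and returning a handle.
        for (int task = 0; task < 5; task++) {
            FakeTable table = pool.borrow();
            try {
                // ... use the table here ...
            } finally {
                pool.giveBack(table);
            }
        }
        System.out.println("handles created: " + FakeTable.created.get()); // handles created: 2
    }
}
```

Without the pool, the same five tasks would each construct their own handle. Whether a shared pool or a single instance created in the task's configure call fits better depends on how the tasks are actually scheduled, so treat this as a starting point only.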