Subject: Re: Distributing the code to multiple nodes
From: Ashish Jain <ashjain2@gmail.com>
To: user@hadoop.apache.org
Date: Thu, 9 Jan 2014 14:11:14 +0530

Hello Chris,

I now have a cluster with 3 nodes and a replication factor of 2. When I distribute a file I could see that replicas of the data are available on the other nodes. However, when I run a map reduce job, again only one node is serving all the requests :(. Can you or anyone please provide some more input?

Thanks
Ashish
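
For anyone trying to narrow this down, a couple of checks show whether the second node is even being offered work. This is only a sketch, assuming a stock 2.2.0 install with default ports; host names in the URLs are placeholders:

  # List the NodeManagers registered with the ResourceManager. If only one
  # node shows up here, all containers will inevitably land on that node.
  yarn node -list

  # Confirm that both DataNodes are live and holding blocks.
  hdfs dfsadmin -report

  # The ResourceManager web UI (default port 8088) shows, per application,
  # which node each container was allocated on.
  #   http://<resourcemanager-host>:8088/cluster

If the second NodeManager is missing from yarn node -list, the problem is registration/configuration rather than scheduling.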


On Wed, Jan 8, 2014 at 7:16 PM, Chris Mawata <chris.mawata@gmail.com> wrote:

2 nodes and replication factor of 2 results in a replica of each block present on each node. This would allow the possibility that a single node would do the work and yet be data local. It will probably happen if that single node has the needed capacity. More nodes than the replication factor are needed to force distribution of the processing.
Chris
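
To see this concretely, the block placement can be inspected with fsck. A minimal sketch; the HDFS path below is a placeholder for the actual input file:

  # Show every block of the input file and which DataNodes hold a replica.
  # With 2 nodes and replication factor 2, each block lists both nodes, so a
  # single node can run every map task and still be data-local.
  hdfs fsck /user/ashish/input.txt -files -blocks -locations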

On Jan 8, 2014 7:35 AM, "Ashish Jain" <ashjain2@gmail.com> wrote:
Guys,

I am sure that only one node is being used. I just now ran the job again and could see that CPU usage goes high on only one server while the other server's CPU usage remains constant, which means the other node is not being used. Can someone help me debug this issue?

++Ashish
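
One way to confirm this from the cluster side rather than from CPU graphs is via the web UIs. A sketch assuming default ports; the application id and host names are placeholders:

  # Find the application id of the job.
  yarn application -list

  # The application page on the ResourceManager UI and the JobHistory server
  # show each map/reduce task attempt and the host it ran on.
  #   http://<resourcemanager-host>:8088/cluster/app/<application-id>
  #   http://<jobhistory-host>:19888/jobhistory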


On Wed, Jan 8, 2014 at 5:04 PM, Ashish Jain <ashjain2@gmail.com> wrote:
Hello All,
I have a 2 node Hadoop cluster running with a replication factor of 2. I have a file of around 1 GB which, when copied to HDFS, is replicated to both nodes. Looking at the block info I can see the file has been split into 8 blocks of 128 MB each. I use this file as input to run the word count program. Somehow I feel only one node is doing all the work and the code is not distributed to the other node. How can I make sure the code is distributed to both nodes? Also, is there a log or GUI which can be used for this?
Please note I am using the latest stable release, which is 2.2.0.

++Ashish
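
For reference, a minimal way to reproduce the experiment and the GUIs that answer the "log or GUI" question. A sketch only; the file paths, $HADOOP_HOME and host names are assumptions based on a default 2.2.0 tarball install:

  # Copy the ~1 GB input into HDFS; with the 2.x default dfs.blocksize of
  # 128 MB this gives the 8 blocks mentioned above.
  hadoop fs -put big-input.txt /user/ashish/big-input.txt

  # Run the bundled word count example against it.
  hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar \
      wordcount /user/ashish/big-input.txt /user/ashish/wc-out

  # Where to look afterwards:
  #   NameNode UI (block/replica placement):    http://<namenode-host>:50070
  #   ResourceManager UI (container placement): http://<resourcemanager-host>:8088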

