Subject: Re: Distributing the code to multiple nodes
From: Chris Mawata <chris.mawata@gmail.com>
To: user@hadoop.apache.org
Date: Thu, 9 Jan 2014 08:57:42 -0500

...And do all three nodes appear in the NameNode and YARN web user interfaces?
Chris
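A quick way to cross-check what the two web UIs show is from the shell of the master host. This is only a rough sketch, assuming the 2.2.0 hdfs and yarn launchers are on the PATH:

    # Live DataNodes registered with the NameNode -- all three hosts
    # should be listed with a non-zero configured capacity.
    hdfs dfsadmin -report

    # NodeManagers registered with the ResourceManager -- all three
    # hosts should be listed in the RUNNING state.
    yarn node -list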
On Jan 9, 2014 7:46 AM, "Ashish Jain" <ashjain2@gmail.com> wrote:

> Another point to add here: 10.12.11.210 is the host which has everything
> running, including a slave datanode. Data was also distributed to this host,
> as well as the jar file. The following are running on 10.12.11.210:
>
> 7966 DataNode
> 8480 NodeManager
> 8353 ResourceManager
> 8141 SecondaryNameNode
> 7834 NameNode
>
> On Thu, Jan 9, 2014 at 6:12 PM, Ashish Jain <ashjain2@gmail.com> wrote:
>
>> The logs were updated only when I copied the data. After copying the data
>> there have been no updates to the log files.
>>
>> On Thu, Jan 9, 2014 at 5:08 PM, Chris Mawata <chris.mawata@gmail.com> wrote:
>>
>>> Do the logs on the three nodes contain anything interesting?
>>> Chris
>>> On Jan 9, 2014 3:47 AM, "Ashish Jain" <ashjain2@gmail.com> wrote:
>>>
>>>> Here is the block info for the record I distributed. As can be seen,
>>>> only 10.12.11.210 has all the data, and this is the node which is serving
>>>> all the requests. Replicas are also available on 209 and 211.
>>>>
>>>> 1073741857: 10.12.11.210:50010    10.12.11.209:50010
>>>> 1073741858: 10.12.11.210:50010    10.12.11.211:50010
>>>> 1073741859: 10.12.11.210:50010    10.12.11.209:50010
>>>> 1073741860: 10.12.11.210:50010    10.12.11.211:50010
>>>> 1073741861: 10.12.11.210:50010    10.12.11.209:50010
>>>> 1073741862: 10.12.11.210:50010    10.12.11.209:50010
>>>> 1073741863: 10.12.11.210:50010    10.12.11.209:50010
>>>> 1073741864: 10.12.11.210:50010    10.12.11.209:50010
>>>>
>>>> --Ashish
>>>>
>>>> On Thu, Jan 9, 2014 at 2:11 PM, Ashish Jain <ashjain2@gmail.com> wrote:
>>>>
>>>>> Hello Chris,
>>>>>
>>>>> I now have a cluster with 3 nodes and a replication factor of 2. When
>>>>> I distribute a file I can see that replicas of the data are available on
>>>>> the other nodes. However, when I run a map reduce job, again only one node
>>>>> is serving all the requests :(. Can you or anyone please provide some more
>>>>> inputs?
>>>>>
>>>>> Thanks
>>>>> Ashish
>>>>>
>>>>> On Wed, Jan 8, 2014 at 7:16 PM, Chris Mawata <chris.mawata@gmail.com> wrote:
>>>>>
>>>>>> 2 nodes and a replication factor of 2 results in a replica of each
>>>>>> block being present on each node. This allows the possibility that a single
>>>>>> node does all the work and is still data-local. It will probably happen if
>>>>>> that single node has the needed capacity. More nodes than the replication
>>>>>> factor are needed to force distribution of the processing.
>>>>>> Chris
>>>>>> On Jan 8, 2014 7:35 AM, "Ashish Jain" <ashjain2@gmail.com> wrote:
>>>>>>
>>>>>>> Guys,
>>>>>>>
>>>>>>> I am sure that only one node is being used. I just now ran the job
>>>>>>> again and could see that CPU usage goes high on only one server, while the
>>>>>>> other server's CPU usage remains constant, which means the other node is
>>>>>>> not being used. Can someone help me debug this issue?
>>>>>>>
>>>>>>> ++Ashish
>>>>>>>
>>>>>>> On Wed, Jan 8, 2014 at 5:04 PM, Ashish Jain <ashjain2@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello All,
>>>>>>>>
>>>>>>>> I have a 2-node hadoop cluster running with a replication factor of
>>>>>>>> 2. I have a file of size around 1 GB which, when copied to HDFS, is
>>>>>>>> replicated to both the nodes. Looking at the block info I can see the file
>>>>>>>> has been subdivided into 8 blocks, each of size 128 MB. I use this file as
>>>>>>>> input to run the word count program. Somehow I feel only one node is doing
>>>>>>>> all the work and the code is not distributed to the other node. How can I
>>>>>>>> make sure the code is distributed to both the nodes? Also, is there a log
>>>>>>>> or GUI which can be used for this?
>>>>>>>> Please note I am using the latest stable release, that is 2.2.0.
>>>>>>>>
>>>>>>>> ++Ashish
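Related to the block listing quoted above: block placement can also be confirmed from the command line, and rerunning the stock word-count example shows how many map tasks get scheduled and where. This is only a rough sketch; the input path and the examples-jar location below are placeholders for a default 2.2.0 tarball install, not the actual paths from this thread:

    # For each block of the file, show which DataNodes hold a replica.
    hdfs fsck /user/ashish/input.txt -files -blocks -locations

    # Rerun the stock word-count example. With 8 input blocks the job should
    # launch 8 map tasks, which YARN may place on any NodeManager with free
    # capacity, preferring (but not requiring) nodes that hold a replica.
    hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar \
        wordcount /user/ashish/input.txt /user/ashish/wordcount-out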