Date: Tue, 1 Mar 2011 09:07:40 -0500
Subject: Re: TaskTracker not starting on all nodes
From: bikash sharma
To: common-user@hadoop.apache.org
Cc: James Seigel

Hi James,
Sorry for the late response. No, the same problem persists. I reformatted
HDFS, stopped mapred and hdfs daemons and restarted them (using start-dfs.sh
and start-mapred.sh from master node).
But surprisingly out of 4 nodes cluster, two nodes have TaskTracker running
while other two do not have TaskTrackers on them (verified using jps). I
guess since I have the Hadoop installed on shared storage, that might be the
issue?
Btw, how do I start the services independently on each node?

-bikash

On Sun, Feb 27, 2011 at 11:05 PM, James Seigel wrote:

> .... Did you get it working? What was the fix?
>
> Sent from my mobile. Please excuse the typos.
>
> On 2011-02-27, at 8:43 PM, Simon wrote:
>
> > Hey Bikash,
> >
> > Maybe you can manually start a tasktracker on the node and see if there are
> > any error messages. Also, don't forget to check your configure files for
> > mapreduce and hdfs and make sure datanode can start successfully first.
> > After all these steps, you can submit a job on the master node and see if
> > there are any communication between these failed nodes and the master node.
> > Post your error messages here if possible.
> >
> > HTH.
> > Simon -
> >
> > On Sat, Feb 26, 2011 at 10:44 AM, bikash sharma wrote:
> >
> >> Thanks James. Well all the config. files and shared keys are on a shared
> >> storage that is accessed by all the nodes in the cluster.
> >> At times, everything runs fine on initialization, but at other times, the
> >> same problem persists, so was bit confused.
> >> Also, checked the TaskTracker logs on those nodes, there does not seem to be
> >> any error.
> >>
> >> -bikash
> >>
> >> On Sat, Feb 26, 2011 at 10:30 AM, James Seigel wrote:
> >>
> >>> Maybe your ssh keys aren't distributed the same on each machine or the
> >>> machines aren't configured the same?
> >>>
> >>> J
> >>>
> >>> On 2011-02-26, at 8:25 AM, bikash sharma wrote:
> >>>
> >>>> Hi,
> >>>> I have a 10 nodes Hadoop cluster, where I am running some benchmarks for
> >>>> experiments.
> >>>> Surprisingly, when I initialize the Hadoop cluster
> >>>> (hadoop/bin/start-mapred.sh), in many instances, only some nodes have
> >>>> TaskTracker process up (seen using jps), while other nodes do not have
> >>>> TaskTrackers. Could anyone please explain?
> >>>>
> >>>> Thanks,
> >>>> Bikash
> >
> > --
> > Regards,
> > Simon
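
The thread verifies each node by running jps and looking for a TaskTracker
entry. A minimal sketch of that check in Python follows. It is only an
illustration, not something posted in the thread: the function names, the
expected-daemon list, and the assumption of passwordless ssh with jps on the
remote PATH are all hypothetical.

```python
import subprocess

# Daemons a worker node is assumed to run in a Hadoop cluster of this era.
EXPECTED = ("DataNode", "TaskTracker")


def running_daemons(jps_output):
    """Return the set of JVM process names listed in `jps` output."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # Each jps line is "<pid> <name>"; skip anything that does not fit.
        if len(parts) >= 2 and parts[0].isdigit():
            names.add(parts[1])
    return names


def missing_daemons(jps_output, expected=EXPECTED):
    """Return the expected daemons that do not appear in the jps output."""
    present = running_daemons(jps_output)
    return [name for name in expected if name not in present]


def check_node(host):
    """Run jps on `host` over ssh and report missing daemons.

    Assumes passwordless ssh to `host` and that jps is on the remote PATH.
    """
    result = subprocess.run(
        ["ssh", host, "jps"], capture_output=True, text=True, timeout=30
    )
    return missing_daemons(result.stdout)
```

Calling check_node on each worker hostname in turn would list the nodes where
TaskTracker is absent. On such a node, Hadoop releases of that era shipped
bin/hadoop-daemon.sh, so running "bin/hadoop-daemon.sh start tasktracker"
locally is one way to try the manual start Simon suggests and watch for
errors.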