From: Pat Ferrel
Date: Thu, 24 May 2012 18:47:49 -0700
To: common-user@hadoop.apache.org
Subject: Re: 3 machine cluster trouble
Message-ID: <4FBEE4C5.7030302@occamsmachete.com>
In-Reply-To: <4FBE7F45.80702@occamsmachete.com>

Oops, after a few trials I got an ERROR for incompatible build versions. Copied code from the master, reformatted, et voila.

On 5/24/12 11:34 AM, Pat Ferrel wrote:
> ok, so all nodes are configured the same except for master/slave
> differences. They are all running HDFS, and all daemons seem to be
> running when I do a start-all.sh from the master. However, the master
> Map/Reduce Administration page shows only two live nodes. The HDFS
> page shows 3.
>
> Looking at the log files on the new slave node I see no outright
> errors, but I do see this in the tasktracker log file. All machines
> have 8G memory. I think the important part below is "TaskTracker's
> totalMemoryAllottedForTasks is -1". I've searched for others with this
> problem but haven't found anything matching my case, which is just
> trying to start up. No tasks have been run.
>
> 2012-05-24 11:20:46,786 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_occam3:localhost/127.0.0.1:45700
> 2012-05-24 11:20:46,792 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_occam3:localhost/127.0.0.1:45700
> 2012-05-24 11:20:46,792 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@5abd09e8
> 2012-05-24 11:20:46,795 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
> 2012-05-24 11:20:46,795 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
> 2012-05-24 11:20:46,800 INFO org.apache.hadoop.mapred.TaskTracker: Shutting down: Map-events fetcher for all reduce tasks on tracker_occam3:localhost/127.0.0.1:45700
> 2012-05-24 11:20:46,800 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Cleanup...
> java.lang.InterruptedException: sleep interrupted
>     at java.lang.Thread.sleep(Native Method)
>     at org.apache.hadoop.filecache.TrackerDistributedCacheManager$CleanupThread.run(TrackerDistributedCacheManager.java:926)
> 2012-05-24 11:20:46,900 INFO org.apache.hadoop.ipc.Server: Stopping server on 45700
> 2012-05-24 11:20:46,901 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 45700: exiting
> 2012-05-24 11:20:46,901 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 45700: exiting
> 2012-05-24 11:20:46,902 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 45700: exiting
> 2012-05-24 11:20:46,902 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 45700
> 2012-05-24 11:20:46,901 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 45700: exiting
> 2012-05-24 11:20:46,904 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
> 2012-05-24 11:20:46,904 INFO org.apache.hadoop.mapred.TaskTracker: Shutting down StatusHttpServer
> 2012-05-24 11:20:46,904 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 45700: exiting
> 2012-05-24 11:20:46,903 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 45700: exiting
> 2012-05-24 11:20:46,903 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 45700: exiting
> 2012-05-24 11:20:46,904 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 45700: exiting
> 2012-05-24 11:20:46,904 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
> 2012-05-24 11:20:46,909 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:50060
>
> On 5/23/12 3:55 PM, James Warren wrote:
>> Hi Pat -
>>
>> The setting for hadoop.tmp.dir is used both locally and on HDFS and
>> therefore should be consistent across your cluster.
>>
>> http://stackoverflow.com/questions/2354525/what-should-be-hadoop-tmp-dir
>>
>> cheers,
>> -James
>>
>> On Wed, May 23, 2012 at 3:44 PM, Pat Ferrel wrote:
>>
>>> I have a two machine cluster and am adding a new machine. The new
>>> node has a different location for hadoop.tmp.dir than the other two
>>> nodes and refuses to start the datanode when started in the cluster.
>>> When I change the location pointed to by hadoop.tmp.dir to be the
>>> same on all machines, it starts up fine on all machines.
>>>
>>> Shouldn't I be able to have the master and slave1 set as:
>>>
>>> <property>
>>>   <name>hadoop.tmp.dir</name>
>>>   <value>/app/hadoop/tmp</value>
>>>   <description>A base for other temporary directories.</description>
>>> </property>
>>>
>>> And slave2 set as:
>>>
>>> <property>
>>>   <name>hadoop.tmp.dir</name>
>>>   <value>/media/d2/app/hadoop/tmp</value>
>>>   <description>A base for other temporary directories.</description>
>>> </property>
>>>
>>> ??? Slave2 runs standalone in single node mode just fine. Using
>>> 0.20.205.
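
For anyone hitting the same thing: the advice above (keep hadoop.tmp.dir consistent across the cluster) is easy to check mechanically. What follows is only a rough sketch, assuming you have already copied each node's core-site.xml text somewhere you can read it. The function names (read_property, group_hosts_by_value) and the sample host names are made up for illustration and are not part of Hadoop.

```python
import xml.etree.ElementTree as ET


def read_property(config_xml, name):
    """Return the <value> of the named <property>, or None if it is not set."""
    root = ET.fromstring(config_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def group_hosts_by_value(configs, name="hadoop.tmp.dir"):
    """Group hosts by the value each one sets for `name`.

    `configs` maps a host name to the text of that host's core-site.xml.
    More than one key in the result means the nodes disagree.
    """
    by_value = {}
    for host, xml_text in configs.items():
        by_value.setdefault(read_property(xml_text, name), []).append(host)
    return {value: sorted(hosts) for value, hosts in by_value.items()}


def make_config(tmp_dir):
    """Build a tiny core-site.xml body for the demo below (hypothetical helper)."""
    return (
        "<configuration><property>"
        "<name>hadoop.tmp.dir</name>"
        "<value>" + tmp_dir + "</value>"
        "</property></configuration>"
    )


if __name__ == "__main__":
    # Hypothetical hosts mirroring the setup described in the thread.
    configs = {
        "master": make_config("/app/hadoop/tmp"),
        "slave1": make_config("/app/hadoop/tmp"),
        "slave2": make_config("/media/d2/app/hadoop/tmp"),
    }
    groups = group_hosts_by_value(configs)
    if len(groups) > 1:
        for value, hosts in groups.items():
            print(value, "is used by", ", ".join(hosts))
    else:
        print("hadoop.tmp.dir is consistent on all nodes")
```

This only compares the literal value in the file. It does not resolve ${...} substitutions or values inherited from the defaults, so treat a clean result as a first check rather than proof.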