Mailing-List: contact common-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: common-user@hadoop.apache.org
Received-SPF: neutral (athena.apache.org: local policy)
Message-ID: <4FBEE4C5.7030302@occamsmachete.com>
Date: Thu, 24 May 2012 18:47:49 -0700
From: Pat Ferrel <pat@occamsmachete.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7;
 rv:12.0) Gecko/20120428 Thunderbird/12.0.1
MIME-Version: 1.0
To: common-user@hadoop.apache.org
Subject: Re: 3 machine cluster trouble
References: <1337805492.48520.YahooMailNeo@web193506.mail.sg3.yahoo.com>
 <4FBD6834.6070608@occamsmachete.com>
 <CAPALeRt8HGh3rNvPHHNE7UpxPf1n8mT=DDUVOfO61FYo811CuA@mail.gmail.com>
 <4FBE7F45.80702@occamsmachete.com>
In-Reply-To: <4FBE7F45.80702@occamsmachete.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Oops, after a few trials I got an ERROR for incompatible build
versions. I copied the code from the master, reformatted, et voila.
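
For anyone who hits the same incompatible build versions error, here is a rough sketch of how the mismatch could be spotted up front. It assumes you have already captured what `hadoop version` prints on each node (for example over ssh) and only compares the captured text. The function names, the host names, and the "build-A"/"build-B" strings are placeholders for illustration, not real Hadoop output.

```python
from collections import defaultdict


def group_by_build(version_outputs):
    """Group host names by the text each host's `hadoop version` printed.

    version_outputs maps a host name to the captured output (an assumption:
    collecting that output, e.g. over ssh, is left to the caller).
    """
    groups = defaultdict(list)
    for host, text in version_outputs.items():
        # Normalize surrounding whitespace only; any other difference
        # is treated as a different build.
        key = "\n".join(line.strip() for line in text.strip().splitlines())
        groups[key].append(host)
    return dict(groups)


def mismatched_hosts(version_outputs, reference_host):
    """Return the sorted hosts whose output differs from the reference host's."""
    for hosts in group_by_build(version_outputs).values():
        if reference_host in hosts:
            return sorted(h for h in version_outputs if h not in hosts)
    raise KeyError(reference_host)


if __name__ == "__main__":
    # Placeholder strings standing in for real `hadoop version` output.
    captured = {
        "master": "build-A\n",
        "slave1": "build-A",
        "slave2": "build-B",
    }
    print(mismatched_hosts(captured, "master"))
```

Any host this reports would be a candidate for recopying the install from the master, which is what fixed it for me.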

On 5/24/12 11:34 AM, Pat Ferrel wrote:
> OK, so all nodes are configured the same except for master/slave
> differences. They are all running HDFS, and all daemons seem to be
> running when I do a start-all.sh from the master. However, the master
> Map/Reduce Administration page shows only two live nodes. The HDFS
> page shows 3.
>
> Looking at the log files on the new slave node I see no outright
> errors, but I do see this in the tasktracker log file. All machines
> have 8G of memory. I think the important part below is that
> TaskTracker's totalMemoryAllottedForTasks is -1. I've searched for
> others with this problem but haven't found anything that matches my
> case, which is just trying to start up. No tasks have been run.
>
> 2012-05-24 11:20:46,786 INFO org.apache.hadoop.mapred.TaskTracker: 
> Starting tracker tracker_occam3:localhost/127.0.0.1:45700
> 2012-05-24 11:20:46,792 INFO org.apache.hadoop.mapred.TaskTracker: 
> Starting thread: Map-events fetcher for all reduce tasks on 
> tracker_occam3:localhost/127.0.0.1:45700
> 2012-05-24 11:20:46,792 INFO org.apache.hadoop.mapred.TaskTracker:  
> Using ResourceCalculatorPlugin : 
> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@5abd09e8
> 2012-05-24 11:20:46,795 WARN org.apache.hadoop.mapred.TaskTracker: 
> TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is 
> disabled.
> 2012-05-24 11:20:46,795 INFO org.apache.hadoop.mapred.IndexCache: 
> IndexCache created with max memory = 10485760
> 2012-05-24 11:20:46,800 INFO org.apache.hadoop.mapred.TaskTracker: 
> Shutting down: Map-events fetcher for all reduce tasks on 
> tracker_occam3:localhost/127.0.0.1:45700
> 2012-05-24 11:20:46,800 INFO 
> org.apache.hadoop.filecache.TrackerDistributedCacheManager: Cleanup...
> java.lang.InterruptedException: sleep interrupted
>     at java.lang.Thread.sleep(Native Method)
>     at 
> org.apache.hadoop.filecache.TrackerDistributedCacheManager$CleanupThread.run(TrackerDistributedCacheManager.java:926)
> 2012-05-24 11:20:46,900 INFO org.apache.hadoop.ipc.Server: Stopping 
> server on 45700
> 2012-05-24 11:20:46,901 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 3 on 45700: exiting
> 2012-05-24 11:20:46,901 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 1 on 45700: exiting
> 2012-05-24 11:20:46,902 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 2 on 45700: exiting
> 2012-05-24 11:20:46,902 INFO org.apache.hadoop.ipc.Server: Stopping 
> IPC Server listener on 45700
> 2012-05-24 11:20:46,901 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 0 on 45700: exiting
> 2012-05-24 11:20:46,904 INFO 
> org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
> 2012-05-24 11:20:46,904 INFO org.apache.hadoop.mapred.TaskTracker: 
> Shutting down StatusHttpServer
> 2012-05-24 11:20:46,904 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 7 on 45700: exiting
> 2012-05-24 11:20:46,903 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 6 on 45700: exiting
> 2012-05-24 11:20:46,903 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 4 on 45700: exiting
> 2012-05-24 11:20:46,904 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 5 on 45700: exiting
> 2012-05-24 11:20:46,904 INFO org.apache.hadoop.ipc.Server: Stopping 
> IPC Server Responder
> 2012-05-24 11:20:46,909 INFO org.mortbay.log: Stopped 
> SelectChannelConnector@0.0.0.0:50060
>
>
>
> On 5/23/12 3:55 PM, James Warren wrote:
>> Hi Pat -
>>
>> The setting for hadoop.tmp.dir is used both locally and on HDFS and
>> therefore should be consistent across your cluster.
>>
>> http://stackoverflow.com/questions/2354525/what-should-be-hadoop-tmp-dir
>>
>> cheers,
>> -James
>>
>> On Wed, May 23, 2012 at 3:44 PM, Pat Ferrel <pat@occamsmachete.com> wrote:
>>
>>> I have a two machine cluster and am adding a new machine. The new node has
>>> a different location for hadoop.tmp.dir than the other two nodes and
>>> refuses to start the datanode when started in the cluster. When I change
>>> the location pointed to by hadoop.tmp.dir to be the same on all machines it
>>> starts up fine on all machines.
>>>
>>> Shouldn't I be able to have the master and slave1 set as:
>>> <property>
>>> <name>hadoop.tmp.dir</name>
>>> <value>/app/hadoop/tmp</value>
>>> <description>A base for other temporary directories.</description>
>>> </property>
>>>
>>> And slave2 set as:
>>> <property>
>>> <name>hadoop.tmp.dir</name>
>>> <value>/media/d2/app/hadoop/tmp</value>
>>> <description>A base for other temporary directories.</description>
>>> </property>
>>>
>>> ??? Slave2 runs standalone in single node mode just fine. Using 0.20.205.
>>>