hadoop-general mailing list archives

From Arun C Murthy <...@yahoo-inc.com>
Subject Re: problem starting cdh3b2 jobtracker
Date Fri, 06 Aug 2010 17:47:11 GMT
Please keep discussions of CDH confined to Cloudera's email lists.

On Aug 6, 2010, at 3:43 AM, Harsh J wrote:

> java.io.IOException: Cannot create toBeDeleted in /data1/mapred/local
>
> This line actually points at the solution. In earlier versions of CDH,
> if the list of local mapred directories contained invalid entries (say,
> because the jobtracker machine does not have 2 disks like all the
> tasktracker machines, and is not in the slaves list either), those
> entries were ignored. Now that no longer seems to be the case, and the
> jobtracker tries to operate on them instead. This looks like a major
> bug, Cloudera folks! I encountered this using CDH3 +320. I am not using
> my jobtracker machine to run tasks as well.
>
> It gets resolved once you make the mapred local directory list valid in
> the jobtracker machine's config alone. However, this would lead to
> issues with conf-syncing between nodes if it keeps behaving this way.
>
> On Fri, Jul 2, 2010 at 8:32 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> We installed cdh3b2 0.20.2+320 and saw a strange error in the
>> jobtracker log:
>>
>> 2010-07-02 01:49:31,977 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 9001
>> 2010-07-02 01:49:31,977 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
>> 2010-07-02 01:49:31,988 WARN org.apache.hadoop.mapred.JobTracker: Error starting tracker: java.io.IOException: Cannot create toBeDeleted in /data1/mapred/local
>>    at org.apache.hadoop.util.MRAsyncDiskService.<init>(MRAsyncDiskService.java:85)
>>    at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1688)
>>    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:199)
>>    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:191)
>>    at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3765)
>>
>> 2010-07-02 01:49:32,990 INFO org.apache.hadoop.mapred.JobTracker: Scheduler configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (-1, -1, -1, -1)
>> 2010-07-02 01:49:32,991 FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding to sjc1-hadoop0.sjc1.ciq.com/10.201.8.204:9001: Address already in use
>>    at org.apache.hadoop.ipc.Server.bind(Server.java:198)
>>    at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:261)
>>    at org.apache.hadoop.ipc.Server.<init>(Server.java:1043)
>>    at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:492)
>>    at org.apache.hadoop.ipc.RPC.getServer(RPC.java:454)
>>    at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1628)
>>    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:199)
>>    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:191)
>>    at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3765)
>> Caused by: java.net.BindException: Address already in use
>>    at sun.nio.ch.Net.bind(Native Method)
>>    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>    at org.apache.hadoop.ipc.Server.bind(Server.java:196)
>>    ... 8 more
>>
>> 2010-07-02 01:49:32,992 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG:
>>
>> But 9001 wasn't used:
>> [sjc1-hadoop0.sjc1:hadoop 25618]netstat -nta | grep 9001
>> [sjc1-hadoop0.sjc1:hadoop 25619]netstat -nta | grep 9000
>> tcp        0      0 10.201.8.204:9000           0.0.0.0:*                   LISTEN
>> tcp        0      0 10.201.8.204:9000           10.201.8.214:4223           ESTABLISHED
>> tcp        0      0 10.201.8.204:9000           10.201.8.212:49074          ESTABLISHED
>> tcp        0      0 10.201.8.204:9000           10.201.8.206:11910          ESTABLISHED
>> tcp        0      0 10.201.8.204:9000           10.201.8.210:62611          ESTABLISHED
>> tcp        0      0 10.201.8.204:9000           10.201.8.213:1299           ESTABLISHED
>> tcp        0      0 10.201.8.204:9000           10.201.8.205:9756           ESTABLISHED
>> tcp        0      0 10.201.8.204:9000           10.201.8.207:59207          ESTABLISHED
>>
>> Here is output from ifconfig:
>> bond0     Link encap:Ethernet  HWaddr 00:30:48:60:53:94
>>          inet addr:10.201.8.204  Bcast:10.201.8.255  Mask:255.255.255.0
>>          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
>>          RX packets:351496605 errors:0 dropped:1015 overruns:0 frame:0
>>          TX packets:178144953 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:0
>>          RX bytes:119420730164 (111.2 GiB)  TX bytes:120002123131 (111.7 GiB)
>>
>> eth0      Link encap:Ethernet  HWaddr 00:30:48:60:53:94
>>          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>>          RX packets:351496605 errors:0 dropped:1015 overruns:0 frame:0
>>          TX packets:178144953 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:1000
>>          RX bytes:119420730164 (111.2 GiB)  TX bytes:120002123131 (111.7 GiB)
>>          Interrupt:161
>>
>> eth1      Link encap:Ethernet  HWaddr 00:30:48:60:53:94
>>          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
>>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:1000
>>          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>>          Interrupt:169
>>
>> Has anyone encountered a similar issue?
>>
>
>
>
> -- 
> Harsh J
> www.harshj.com
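
The workaround quoted above (making the local directory list valid on the jobtracker machine) can be checked before starting the daemon. The following is only a rough sketch, not anything taken from Hadoop or CDH: the function name check_local_dirs is made up for illustration, and it assumes the directory list is given as a comma-separated string, as in the jobtracker's local directory setting. It reports every entry that is missing or in which a scratch subdirectory (similar in spirit to toBeDeleted) cannot be created.

```python
import os
import tempfile


def check_local_dirs(dir_list):
    """Return {path: problem} for each unusable entry in a comma-separated list.

    Hypothetical helper for illustration; it is not part of Hadoop.
    """
    problems = {}
    for path in (p.strip() for p in dir_list.split(",")):
        if not path:
            continue  # ignore empty entries such as a trailing comma
        if not os.path.isdir(path):
            problems[path] = "missing or not a directory"
            continue
        try:
            # Try to create and remove a scratch subdirectory, which is
            # roughly what fails in the "Cannot create toBeDeleted" error.
            probe = tempfile.mkdtemp(prefix="probe-", dir=path)
            os.rmdir(probe)
        except OSError as exc:
            problems[path] = "cannot create subdirectory: %s" % exc
    return problems
```

Calling it with the list from the jobtracker's config, for example check_local_dirs("/data1/mapred/local,/data2/mapred/local") (the second path is an invented example), returns an empty dict when every entry is usable by the current user.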


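The netstat check in the quoted report can also be done by attempting the bind directly. This is a minimal sketch under stated assumptions: can_bind is a hypothetical helper name, and it only tests whether a plain TCP socket can be bound to the given address at the moment of the call, which is not necessarily what the jobtracker's RPC server does internally.

```python
import errno
import socket


def can_bind(host, port):
    """Return True if a TCP socket can be bound to (host, port) right now.

    Returns False only for "Address already in use"; any other error
    (for example an unresolvable host) is re-raised.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return False
        raise
    finally:
        sock.close()
```

For the case in the report this would be called as can_bind("10.201.8.204", 9001) on the jobtracker machine. A True result while the daemon is stopped would agree with the empty netstat output for 9001.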