hadoop-user mailing list archives

From Björn-Elmar Macek <ma...@cs.uni-kassel.de>
Subject Re: DataNode and TaskTracker communication
Date Tue, 14 Aug 2012 11:25:30 GMT
Hi Michael and Mohammad,

thanks a lot for your input!
I have pinged the people at the cluster in order to (eventually) disable 
IPv6 and to definitely check the ports corresponding to the appropriate 
machines. I will keep you updated.

Regards,
Elmar
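
A sketch of the IPv6 check mentioned above, assuming the nodes run Linux (the sysctl path and the hadoop-env.sh flag are the usual ones, but verify them for your distribution):

```shell
# Check whether IPv6 is enabled on this node: 0 = enabled, 1 = disabled.
cat /proc/sys/net/ipv6/conf/all/disable_ipv6 2>/dev/null \
    || echo "no IPv6 support compiled in"

# To disable it at runtime (as root); persist in /etc/sysctl.conf if needed:
# sysctl -w net.ipv6.conf.all.disable_ipv6=1

# Alternatively, make the JVM prefer IPv4 without touching the OS, by
# adding this to HADOOP_OPTS in conf/hadoop-env.sh:
# export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
```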


On 13.08.2012 22:39, Michael Segel wrote:
>
> The key is to think about what can go wrong, but start with the low 
> hanging fruit.
>
> I mean you could be right, however you're jumping the gun and 
> overlooking simpler issues.
>
> The most common issue is that the networking traffic is being filtered.
> Of course since we're both diagnosing this with minimal information, 
> we're kind of shooting from the hip.
>
> This is why I'm asking if there is any networking traffic between the 
> nodes.  If you have partial communication, then focus on why you can't 
> see the specific traffic.
>
>
> On Aug 13, 2012, at 10:05 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>
>> Thank you so very much for the detailed response Michael. I'll keep 
>> the tip in mind. Please pardon my ignorance, as I am still in the 
>> learning phase.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel 
>> <michael_segel@hotmail.com> wrote:
>>
>>     0.0.0.0 means that the call is going to all interfaces on the
>>     machine.  (Shouldn't be an issue...)
>>
>>     IPv4 vs IPv6? Could be an issue; however, the OP says he can write
>>     data to DNs and they seem to communicate, so if it's IPv6-related,
>>     wouldn't it impact all traffic and not just a specific port?
>>     I agree... shut down IPv6 if you can.
>>
>>     I don't disagree with your assessment. I am just suggesting that
>>     before you do a really deep dive, you think about the more
>>     obvious stuff first.
>>
>>     There are a couple of other things... like do all of the
>>     /etc/hosts files on all of the machines match?
>>     Is the OP using both /etc/hosts and DNS? If so, are they in sync?
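
One way to run that consistency check, given that the OP can ssh between nodes without a prompt (the node names below are placeholders, adjust them to the cluster):

```shell
# Compare /etc/hosts across nodes by checksum; differing sums mean the
# files have drifted apart. NODES is a placeholder list -- adjust it.
NODES="its-cs132 its-cs133"
md5sum /etc/hosts   # local copy first
for n in $NODES; do
    ssh -o ConnectTimeout=3 "$n" md5sum /etc/hosts 2>/dev/null \
        || echo "$n: unreachable"
done
```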
>>
>>     BTW, you said DNS in your response. If you're using DNS, then you
>>     don't really want to have much info in the /etc/hosts file except
>>     loopback and the server's IP address.
>>
>>     Looking at the problem, the OP is indicating some traffic works while
>>     other traffic doesn't. Most likely something is blocking the
>>     ports. iptables is the first place to look.
>>
>>     Just saying. ;-)
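
For the iptables check, something like this on each node (needs root; rule sets vary by setup):

```shell
# List all firewall rules with packet counters; look for DROP or REJECT
# entries that could cover the Hadoop RPC/data ports (e.g. 50010, 35554).
iptables -L -n -v 2>/dev/null || echo "run as root (or iptables missing)"
```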
>>
>>
>>     On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>
>>>     Hi Michael,
>>>            I asked for the hosts file because there seems to be a
>>>     loopback problem to me. The log shows that the call is going to
>>>     0.0.0.0. Apart from what you have said, I think disabling IPv6
>>>     and making sure that there is no problem with DNS resolution is
>>>     also necessary. Please correct me if I am wrong. Thank you.
>>>
>>>     Regards,
>>>         Mohammad Tariq
>>>
>>>
>>>
>>>     On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel
>>>     <michael_segel@hotmail.com> wrote:
>>>
>>>         Based on your /etc/hosts output, why aren't you using DNS?
>>>
>>>         Outside of MapR, multihomed machines can be problematic.
>>>         Hadoop doesn't generally work well when you're not using the
>>>         FQDN or its alias.
>>>
>>>         The issue isn't the SSH. If you go to the node which is
>>>         having trouble connecting to another node and then try to
>>>         ping it, or some other general communication, and it
>>>         succeeds, your issue is that the port you're trying to
>>>         communicate on is blocked. Then it's more than likely a
>>>         network-configuration or firewall issue.
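
The ping-vs-port distinction can be tested directly; the host and port below are taken from the logs in this thread, substitute your own (`/dev/tcp` is a bash feature):

```shell
# Ping checks that the host is reachable at all; a TCP connect checks
# the specific service port. Values below match the OP's logs.
HOST=its-cs131
PORT=35554
ping -c 1 -W 2 "$HOST" >/dev/null 2>&1 \
    && echo "$HOST reachable" || echo "$HOST not reachable"
timeout 3 bash -c "echo > /dev/tcp/$HOST/$PORT" 2>/dev/null \
    && echo "port $PORT open" || echo "port $PORT blocked or closed"
```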
>>>
>>>         On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek
>>>         <ema@cs.uni-kassel.de> wrote:
>>>
>>>>         Hi Michael,
>>>>
>>>>         Well, I can ssh from any node to any other without being
>>>>         prompted. The reason for this is that my home dir is
>>>>         mounted on every server in the cluster.
>>>>
>>>>         Whether the machines are multihomed: I don't know. I could
>>>>         ask if this would be of importance.
>>>>
>>>>         Shall I?
>>>>
>>>>         Regards,
>>>>         Elmar
>>>>
>>>>         On 13.08.12 14:59, Michael Segel wrote:
>>>>>         If the nodes can communicate and distribute data, then the
>>>>>         odds are that the issue isn't going to be in his /etc/hosts.
>>>>>
>>>>>         A more relevant question is if he's running a firewall on
>>>>>         each of these machines?
>>>>>
>>>>>         A simple test... ssh to one node, ping other nodes and the
>>>>>         control nodes at random to see if they can see one
>>>>>         another. Then check to see if there is a firewall running
>>>>>         which would limit the types of traffic between nodes.
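
That test can be scripted; the hostnames here are placeholders for the cluster's control and worker nodes:

```shell
# From one node, ping the master and the other slaves to verify basic
# reachability before digging into Hadoop itself.
for h in its-cs131 its-cs132 its-cs133; do    # placeholder hostnames
    ping -c 1 -W 2 "$h" >/dev/null 2>&1 \
        && echo "$h reachable" || echo "$h NOT reachable"
done
```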
>>>>>
>>>>>         One other side note... are these machines multi-homed?
>>>>>
>>>>>         On Aug 13, 2012, at 7:51 AM, Mohammad Tariq
>>>>>         <dontariq@gmail.com> wrote:
>>>>>
>>>>>>         Hello there,
>>>>>>
>>>>>>          Could you please share your /etc/hosts file, if you
>>>>>>         don't mind.
>>>>>>
>>>>>>         Regards,
>>>>>>          Mohammad Tariq
>>>>>>
>>>>>>
>>>>>>
>>>>>>         On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
>>>>>>         <macek@cs.uni-kassel.de> wrote:
>>>>>>
>>>>>>             Hi,
>>>>>>
>>>>>>             I am currently trying to run my Hadoop program on a
>>>>>>             cluster. Sadly, my datanodes and tasktrackers seem to
>>>>>>             have difficulties communicating, as their logs show:
>>>>>>             * Some datanodes and tasktrackers seem to have port
>>>>>>             problems of some kind, as can be seen in the logs
>>>>>>             below. I wondered if this might be correlated with
>>>>>>             the localhost entry in /etc/hosts, as you can read in
>>>>>>             a lot of posts with similar errors, but I checked the
>>>>>>             file: neither localhost nor 127.0.0.1/127.0.1.1 is
>>>>>>             bound there. (Although you can ping localhost... the
>>>>>>             technician of the cluster said he'd look at the
>>>>>>             mechanics resolving localhost.)
>>>>>>             * The other nodes cannot talk to the namenode and
>>>>>>             jobtracker (its-cs131), although it is absolutely not
>>>>>>             clear why this is happening: the "dfs -put" I do
>>>>>>             directly before the job runs fine, which seems to
>>>>>>             imply that communication between those servers is
>>>>>>             working flawlessly.
>>>>>>
>>>>>>             Is there any reason why this might happen?
>>>>>>
>>>>>>
>>>>>>             Regards,
>>>>>>             Elmar
>>>>>>
>>>>>>             LOGS BELOW:
>>>>>>
>>>>>>             \____Datanodes
>>>>>>
>>>>>>             After successfully putting the data to HDFS (at this
>>>>>>             point I thought namenode and datanodes have to
>>>>>>             communicate), I get the following errors when
>>>>>>             starting the job:
>>>>>>
>>>>>>             There are 2 kinds of logs I found: the first one is
>>>>>>             big (about 12MB) and looks like this:
>>>>>>             ############################### LOG TYPE 1
>>>>>>             ############################################################
>>>>>>             2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>>>             Already tried 0 time(s).
>>>>>>             2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>>>             Already tried 1 time(s).
>>>>>>             2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>>>             Already tried 2 time(s).
>>>>>>             2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>>>             Already tried 3 time(s).
>>>>>>             2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>>>             Already tried 4 time(s).
>>>>>>             2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>>>             Already tried 5 time(s).
>>>>>>             2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>>>             Already tried 6 time(s).
>>>>>>             2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>>>             Already tried 7 time(s).
>>>>>>             2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>>>             Already tried 8 time(s).
>>>>>>             2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>>>             Already tried 9 time(s).
>>>>>>             2012-08-13 08:23:36,335 WARN
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>             java.net.ConnectException: Call to
>>>>>>             its-cs131/141.51.205.41:35554 failed on connection
>>>>>>             exception: java.net.ConnectException: Connection refused
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>                 at $Proxy5.sendHeartbeat(Unknown Source)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>>>                 at java.lang.Thread.run(Thread.java:619)
>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>                 at
>>>>>>             sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>                 at
>>>>>>             sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>                 at
>>>>>>             org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>                 at
>>>>>>             org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>                 ... 5 more
>>>>>>
>>>>>>             ... (this continues till the end of the log)
>>>>>>
>>>>>>             The second is the short kind:
>>>>>>             ########################### LOG TYPE 2
>>>>>>             ############################################################
>>>>>>             2012-08-13 00:59:19,038 INFO
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>             STARTUP_MSG:
>>>>>>             /************************************************************
>>>>>>             STARTUP_MSG: Starting DataNode
>>>>>>             STARTUP_MSG: host =
>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             STARTUP_MSG: args = []
>>>>>>             STARTUP_MSG: version = 1.0.2
>>>>>>             STARTUP_MSG: build =
>>>>>>             https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>>>>>             -r 1304954; compiled by 'hortonfo' on Sat Mar 24
>>>>>>             23:58:21 UTC 2012
>>>>>>             ************************************************************/
>>>>>>             2012-08-13 00:59:19,203 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsConfig: loaded
>>>>>>             properties from hadoop-metrics2.properties
>>>>>>             2012-08-13 00:59:19,216 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source MetricsSystem,sub=Stats registered.
>>>>>>             2012-08-13 00:59:19,217 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>             Scheduled snapshot period at 10 second(s).
>>>>>>             2012-08-13 00:59:19,218 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>             DataNode metrics system started
>>>>>>             2012-08-13 00:59:19,306 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source ugi registered.
>>>>>>             2012-08-13 00:59:19,346 INFO
>>>>>>             org.apache.hadoop.util.NativeCodeLoader: Loaded the
>>>>>>             native-hadoop library
>>>>>>             2012-08-13 00:59:20,482 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried
>>>>>>             0 time(s).
>>>>>>             2012-08-13 00:59:21,584 INFO
>>>>>>             org.apache.hadoop.hdfs.server.common.Storage: Storage
>>>>>>             directory /home/work/bmacek/hadoop/hdfs/slave is not
>>>>>>             formatted.
>>>>>>             2012-08-13 00:59:21,584 INFO
>>>>>>             org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>             Formatting ...
>>>>>>             2012-08-13 00:59:21,787 INFO
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>             Registered FSDatasetStatusMBean
>>>>>>             2012-08-13 00:59:21,897 INFO
>>>>>>             org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>>>>>             Shutting down all async disk service threads...
>>>>>>             2012-08-13 00:59:21,897 INFO
>>>>>>             org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>>>>>             All async disk service threads have been shut down.
>>>>>>             2012-08-13 00:59:21,898 ERROR
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>             java.net.BindException: Problem binding to
>>>>>>             /0.0.0.0:50010 <http://0.0.0.0:50010/> : Address
>>>>>>             already in use
>>>>>>                 at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>>>             Caused by: java.net.BindException: Address already in use
>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>                 at
>>>>>>             sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>                 at
>>>>>>             sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>                 at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>>>                 ... 7 more
>>>>>>
>>>>>>             2012-08-13 00:59:21,899 INFO
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>             SHUTDOWN_MSG:
>>>>>>             /************************************************************
>>>>>>             SHUTDOWN_MSG: Shutting down DataNode at
>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             ************************************************************/
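
The BindException in LOG TYPE 2 ("Address already in use" on 0.0.0.0:50010) usually means a stale DataNode from an earlier run still holds the port. A quick check, assuming netstat or ss is installed on the node:

```shell
# Find the process that is already bound to the DataNode port 50010.
netstat -tlnp 2>/dev/null | grep ':50010' || echo "netstat: nothing on 50010"
ss -ltnp 2>/dev/null | grep ':50010'      || echo "ss: nothing on 50010"
# If a stale daemon shows up, stop it, e.g.:
# bin/hadoop-daemon.sh stop datanode
```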
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>             \_____TaskTracker
>>>>>>             With the TaskTrackers it is the same: there are 2 kinds.
>>>>>>             ############################### LOG TYPE 1
>>>>>>             ############################################################
>>>>>>             2012-08-13 02:09:54,645 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Resending
>>>>>>             'status' to 'its-cs131' with reponseId '879
>>>>>>             2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35555.
>>>>>>             Already tried 0 time(s).
>>>>>>             2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35555.
>>>>>>             Already tried 1 time(s).
>>>>>>             2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35555.
>>>>>>             Already tried 2 time(s).
>>>>>>             2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35555.
>>>>>>             Already tried 3 time(s).
>>>>>>             2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35555.
>>>>>>             Already tried 4 time(s).
>>>>>>             2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35555.
>>>>>>             Already tried 5 time(s).
>>>>>>             2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35555.
>>>>>>             Already tried 6 time(s).
>>>>>>             2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35555.
>>>>>>             Already tried 7 time(s).
>>>>>>             2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35555.
>>>>>>             Already tried 8 time(s).
>>>>>>             2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client:
>>>>>>             Retrying connect to server: its-cs131/141.51.205.41:35555.
>>>>>>             Already tried 9 time(s).
>>>>>>             2012-08-13 02:10:04,651 ERROR
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Caught
>>>>>>             exception: java.net.ConnectException: Call to
>>>>>>             its-cs131/141.51.205.41:35555 failed on connection
>>>>>>             exception: java.net.ConnectException: Connection refused
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>                 at
>>>>>>             org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>>>                 at
>>>>>>             org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>>>                 at
>>>>>>             org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>>>                 at
>>>>>>             org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>>>                 at
>>>>>>             org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>                 at
>>>>>>             sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>                 at
>>>>>>             sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>                 at
>>>>>>             org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>                 at
>>>>>>             org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>                 ... 6 more
>>>>>>
>>>>>>
>>>>>>             ########################### LOG TYPE 2
>>>>>>             ############################################################
>>>>>>             2012-08-13 00:59:24,376 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>>>             /************************************************************
>>>>>>             STARTUP_MSG: Starting TaskTracker
>>>>>>             STARTUP_MSG: host =
>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             STARTUP_MSG: args = []
>>>>>>             STARTUP_MSG: version = 1.0.2
>>>>>>             STARTUP_MSG: build =
>>>>>>             https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>>>>>             -r 1304954; compiled by 'hortonfo' on Sat Mar 24
>>>>>>             23:58:21 UTC 2012
>>>>>>             ************************************************************/
>>>>>>             2012-08-13 00:59:24,569 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsConfig: loaded
>>>>>>             properties from hadoop-metrics2.properties
>>>>>>             2012-08-13 00:59:24,626 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source MetricsSystem,sub=Stats registered.
>>>>>>             2012-08-13 00:59:24,627 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>             Scheduled snapshot period at 10 second(s).
>>>>>>             2012-08-13 00:59:24,627 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>             TaskTracker metrics system started
>>>>>>             2012-08-13 00:59:24,950 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source ugi registered.
>>>>>>             2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging
>>>>>>             to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log)
>>>>>>             via org.mortbay.log.Slf4jLog
>>>>>>             2012-08-13 00:59:25,206 INFO
>>>>>>             org.apache.hadoop.http.HttpServer: Added global
>>>>>>             filtersafety
>>>>>>             (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>>>             2012-08-13 00:59:25,232 INFO
>>>>>>             org.apache.hadoop.mapred.TaskLogsTruncater:
>>>>>>             Initializing logs' truncater with mapRetainSize=-1
>>>>>>             and reduceRetainSize=-1
>>>>>>             2012-08-13 00:59:25,237 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>             tasktracker with owner as bmacek
>>>>>>             2012-08-13 00:59:25,239 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Good mapred
>>>>>>             local directories are:
>>>>>>             /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>>>             2012-08-13 00:59:25,244 INFO
>>>>>>             org.apache.hadoop.util.NativeCodeLoader: Loaded the
>>>>>>             native-hadoop library
>>>>>>             2012-08-13 00:59:25,255 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source jvm registered.
>>>>>>             2012-08-13 00:59:25,256 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source TaskTrackerMetrics registered.
>>>>>>             2012-08-13 00:59:25,279 INFO
>>>>>>             org.apache.hadoop.ipc.Server: Starting SocketReader
>>>>>>             2012-08-13 00:59:25,282 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source RpcDetailedActivityForPort54850
>>>>>>             registered.
>>>>>>             2012-08-13 00:59:25,282 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source RpcActivityForPort54850 registered.
>>>>>>             2012-08-13 00:59:25,287 INFO
>>>>>>             org.apache.hadoop.ipc.Server: IPC Server Responder:
>>>>>>             starting
>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>             org.apache.hadoop.ipc.Server: IPC Server listener on
>>>>>>             54850: starting
>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 0 on
>>>>>>             54850: starting
>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 1 on
>>>>>>             54850: starting
>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: TaskTracker up
>>>>>>             at: localhost/127.0.0.1:54850
>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 3 on
>>>>>>             54850: starting
>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 2 on
>>>>>>             54850: starting
>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>             tracker
>>>>>>             tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>             2012-08-13 00:59:26,321 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried
>>>>>>             0 time(s).
>>>>>>             2012-08-13 00:59:38,104 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>             thread: Map-events fetcher for all reduce tasks on
>>>>>>             tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>             2012-08-13 00:59:38,120 INFO
>>>>>>             org.apache.hadoop.util.ProcessTree: setsid exited
>>>>>>             with exit code 0
>>>>>>             2012-08-13 00:59:38,134 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Using
>>>>>>             ResourceCalculatorPlugin :
>>>>>>             org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>>>             2012-08-13 00:59:38,137 WARN
>>>>>>             org.apache.hadoop.mapred.TaskTracker: TaskTracker's
>>>>>>             totalMemoryAllottedForTasks is -1. TaskMemoryManager
>>>>>>             is disabled.
>>>>>>             2012-08-13 00:59:38,145 INFO
>>>>>>             org.apache.hadoop.mapred.IndexCache: IndexCache
>>>>>>             created with max memory = 10485760
>>>>>>             2012-08-13 00:59:38,158 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source ShuffleServerMetrics registered.
>>>>>>             2012-08-13 00:59:38,161 INFO
>>>>>>             org.apache.hadoop.http.HttpServer: Port returned by
>>>>>>             webServer.getConnectors()[0].getLocalPort() before
>>>>>>             open() is -1. Opening the listener on 50060
>>>>>>             2012-08-13 00:59:38,161 ERROR
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Can not start
>>>>>>             task tracker because java.net.BindException: Address
>>>>>>             already in use
>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>                 at
>>>>>>             sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>                 at
>>>>>>             sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>                 at
>>>>>>             org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>>>                 at
>>>>>>             org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>>>                 at
>>>>>>             org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>>>                 at
>>>>>>             org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>>>
>>>>>>             2012-08-13 00:59:38,163 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>>>             /************************************************************
>>>>>>             SHUTDOWN_MSG: Shutting down TaskTracker at
>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             ************************************************************/
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>

