hadoop-mapreduce-user mailing list archives

From Caetano Sauer <caetanosa...@gmail.com>
Subject Re: Job does not run with EOFException
Date Wed, 29 Aug 2012 08:22:09 GMT
I am able to browse the web UI and telnet/netcat the tasktracker host and
port, so the connection is being established. Is there any way I can
confirm whether it is really some kind of version conflict? The EOF when
doing readInt() seems like a protocol incompatibility.
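For what it's worth, the EOF-on-readInt symptom can be reproduced with plain sockets: if the server side closes the connection before writing its response, the reader hits end-of-stream on the 4-byte length read. A minimal sketch (not Hadoop code; the loopback server just stands in for a daemon that drops an incompatible handshake):

```python
# Minimal sketch, not Hadoop code: a loopback server that accepts a
# connection and closes it immediately, standing in for a daemon that
# rejects an incompatible RPC handshake. The client then tries to read
# a 4-byte int, like DataInputStream.readInt(), and hits end-of-stream.
import socket
import threading

srv = socket.socket()
srv.bind(("127.0.0.1", 0))   # ephemeral port keeps the sketch self-contained
srv.listen(1)
port = srv.getsockname()[1]

def drop_connection():
    conn, _ = srv.accept()
    conn.close()             # close without ever writing a response

t = threading.Thread(target=drop_connection)
t.start()

client = socket.create_connection(("127.0.0.1", port))
t.join()                     # make sure the server has closed its side
try:
    data = b""
    while len(data) < 4:     # readInt() needs exactly 4 bytes
        chunk = client.recv(4 - len(data))
        if not chunk:
            raise EOFError("connection closed before a full int arrived")
        data += chunk
except EOFError as e:
    print("readInt-style read failed:", e)
```

So the EOF by itself only proves the remote end hung up without replying; it does not say why (wrong port, wrong daemon, or a real protocol mismatch).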

By the way, the tasktracker is killed every time this happens, and I am
left with a JVM crash dump in an hs_err_*.log file. The tasktracker logs
show nothing.

Some facts that may help find the problem are:
1) I am not running as a "hadoop" user, as is usually suggested in
tutorials.
2) There is an older version of Hadoop installed, which I am absolutely
sure is not running; in any case, it is configured on different ports.
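To rule out the port question, a quick TCP probe of the address configured in mapred.job.tracker at least confirms that something is listening there; pairing it with jps and netstat -tlnp on the server then shows which JVM actually owns the port. A hypothetical helper (the host name below is a placeholder, not my real configuration):

```python
# Hypothetical probe for the host:port from mapred.job.tracker. It only
# proves that some listener exists; it cannot tell the jobtracker apart
# from a leftover daemon of an older install, so confirm the owning
# process with jps/netstat on the server itself.
import socket

def port_open(host, port, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:          # refused, timed out, or unresolvable host
        return False

# Placeholder host, not my real configuration:
print(port_open("jobtracker.example.com", 9001))
```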

Thank you for your help and regards,
Caetano Sauer

On Wed, Aug 29, 2012 at 10:08 AM, Hemanth Yamijala <yhemanth@gmail.com> wrote:

> Are you able to browse the web UI for the jobtracker? If not
> configured separately, it should be at hostname:50030. It would also
> help if you telnet to the jobtracker server port and see if it is
> able to connect.
>
> Thanks
> hemanth
>
> On Tue, Aug 28, 2012 at 7:23 PM, Caetano Sauer <caetanosauer@gmail.com>
> wrote:
> > The host on top of the stack trace contains the host and port I
> > defined on mapred.job.tracker in mapred-site.xml.
> >
> > Other than that, I don't know how to verify what you asked me. Any tips?
> >
> >
> > On Tue, Aug 28, 2012 at 3:47 PM, Harsh J <harsh@cloudera.com> wrote:
> >>
> >> Are you sure you're reaching the right port for your JobTracker?
> >>
> >> On Tue, Aug 28, 2012 at 7:15 PM, Caetano Sauer <caetanosauer@gmail.com>
> >> wrote:
> >> > Hello,
> >> >
> >> > I am getting the following error when trying to execute a hadoop job
> >> > on a 5-node cluster:
> >> >
> >> > Caused by: java.io.IOException: Call to *** failed on local exception:
> >> > java.io.EOFException
> >> > at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)
> >> > at org.apache.hadoop.ipc.Client.call(Client.java:1071)
> >> > at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
> >> > at org.apache.hadoop.mapred.$Proxy2.submitJob(Unknown Source)
> >> > at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:921)
> >> > at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
> >> > at java.security.AccessController.doPrivileged(Native Method)
> >> > at javax.security.auth.Subject.doAs(Subject.java:396)
> >> > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
> >> > at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
> >> > at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
> >> > at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
> >> > ... 9 more
> >> > Caused by: java.io.EOFException
> >> > at java.io.DataInputStream.readInt(DataInputStream.java:375)
> >> > at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:800)
> >> > at org.apache.hadoop.ipc.Client$Connection.run(Client.java:745)
> >> >
> >> > (My jobtracker host was substituted by ***)
> >> >
> >> > After 3 hours of searching, everything points to an incompatibility
> >> > between the hadoop versions of the client and the server, but this
> >> > is not the case, since I can run the job on a pseudo-distributed
> >> > setup on a different machine. Both are running the exact same
> >> > version (same svn revision and source checksum).
> >> >
> >> > Does anyone have a solution or a suggestion on how to find more debug
> >> > information?
> >> >
> >> > Thank you in advance,
> >> > Caetano Sauer
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
>
