crunch-user mailing list archives

From Bill Sparks <jspa...@cray.com>
Subject Re: Yarnchild error : crunch-0.7.0
Date Wed, 26 Feb 2014 22:46:22 GMT
To try and close the loop on this thread: I went back to first principles, downloaded the
0.7.0 example code, and built it against hadoop-2.0.6-alpha using the -Dcrunch.platform=2
option. I launched the job jar (with dependencies) and got the following error.

2014-02-26 16:32:50,468 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 268435456
2014-02-26 16:32:50,468 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 67108860; length = 16777216
2014-02-26 16:32:50,520 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.crunch.CrunchRuntimeException: Could not read runtime node information
        at org.apache.crunch.impl.mr.run.CrunchTaskContext.<init>(CrunchTaskContext.java:48)
        at org.apache.crunch.impl.mr.run.CrunchMapper.setup(CrunchMapper.java:37)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:757)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)


Looking at the code, it appears to be trying to read the crunch.tmp.dir configuration and
failing. We are on a Cray … so our HDFS structure is a little different (sorry about that).
This is our current HDFS structure:

+ hdfs dfs -ls -R /
drwxrwxrwx   - jsparks supergroup          0 2014-02-26 16:32 /tmp
drwxrwx---   - jsparks supergroup          0 2014-02-26 16:32 /tmp/hadoop-yarn
drwxrwx---   - jsparks supergroup          0 2014-02-26 16:32 /tmp/hadoop-yarn/staging
drwxrwx---   - jsparks supergroup          0 2014-02-26 16:32 /tmp/hadoop-yarn/staging/history
drwxrwx---   - jsparks supergroup          0 2014-02-26 16:32 /tmp/hadoop-yarn/staging/history/done
drwxrwxrwt   - jsparks supergroup          0 2014-02-26 16:32 /tmp/hadoop-yarn/staging/history/done_intermediate
drwxr-xr-x   - jsparks supergroup          0 2014-02-26 16:32 /user
drwxr-xr-x   - jsparks supergroup          0 2014-02-26 16:32 /user/jsparks
-rw-r--r--   1 jsparks supergroup     610157 2014-02-26 16:32 /user/jsparks/HuckleberryFinn.txt

And yes, we are reading Huck Finn …

--
Jonathan (Bill) Sparks
Software Architecture
Cray Inc.

From: Josh Wills <jwills@cloudera.com>
Reply-To: "user@crunch.apache.org" <user@crunch.apache.org>
Date: Tuesday, February 25, 2014 4:19 PM
To: "user@crunch.apache.org" <user@crunch.apache.org>
Subject: Re: Yarnchild error : crunch-0.7.0

The first error looks like a weird serialization error, as if the Crunch version in use on
the cluster were different from the one used to compile the client. Is Crunch installed on
the cluster, or is there another version of Crunch on the Hadoop classpath?
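
One way to test the version-mismatch theory is to print the serialVersionUID that a given JVM computes for the class, along with the jar it was loaded from, and compare the output on the client and on a cluster node. The following is a minimal sketch using only the JDK; the class name SerialVersionCheck and its helper methods are hypothetical, not part of Crunch or Hadoop.

```java
import java.io.ObjectStreamClass;
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Crunch or Hadoop.
public class SerialVersionCheck {

    // Returns the serialVersionUID the local JVM associates with the class,
    // or null if the class is not Serializable.
    public static Long serialVersionUidOf(Class<?> clazz) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(clazz);
        return desc == null ? null : desc.getSerialVersionUID();
    }

    // Returns where the class was loaded from (typically a jar URL).
    public static String locationOf(Class<?> clazz) {
        CodeSource src = clazz.getProtectionDomain().getCodeSource();
        return src == null ? "(no code source, likely a JDK class)"
                           : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass the class from the InvalidClassException message as the argument.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        Class<?> clazz = Class.forName(name);
        System.out.println(name
                + " serialVersionUID=" + serialVersionUidOf(clazz)
                + " loaded from " + locationOf(clazz));
    }
}
```

Run it with the same classpath the job uses, quoting the class name so the shell does not expand the dollar sign, for example 'org.apache.crunch.types.writable.Writables$4'. If the two sides report different UIDs or different jar locations, that would be consistent with two Crunch builds being mixed.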

The second one still looks to me like the hadoop1/hadoop2 incompatibility issue, like the
local client was compiled with hadoop1 APIs instead of the hadoop2 APIs on the cluster.
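
For background, JobContext is a class in the hadoop1 API and an interface in the hadoop2 API, which is why code compiled against one fails with IncompatibleClassChangeError on the other. A small reflection check, run with the cluster's classpath, shows which flavor is present. This is a sketch only; the class name JobContextCheck and the describe method are hypothetical.

```java
// Hypothetical diagnostic helper, not part of Crunch or Hadoop.
public class JobContextCheck {

    // Reports whether the named type is an interface, a class, or absent
    // from the classpath. The type is loaded without being initialized.
    public static String describe(String className) {
        try {
            Class<?> c = Class.forName(
                    className, false, JobContextCheck.class.getClassLoader());
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.hadoop.mapreduce.JobContext";
        // "interface" suggests hadoop2 APIs; "class" suggests hadoop1 APIs.
        System.out.println(name + ": " + describe(name));
    }
}
```

Running the same check against the classpath used to build the job jar indicates whether the client side was compiled against the matching API.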

There's a 0.7.0-hadoop2 Maven artifact that should have the right API profile:
http://mvnrepository.com/artifact/org.apache.crunch/crunch-core/0.7.0-hadoop2

I know that we made an error in the 0.8.0 release w/the hadoop2 versioning, so 0.8.0-hadoop2
doesn't work, but 0.8.1-hadoop2 or 0.8.2-hadoop2 should also work.



On Tue, Feb 25, 2014 at 1:54 PM, Bill Sparks <jsparks@cray.com> wrote:
So interesting … same results.

This time I ran two versions: 1) the examples from the Crunch build and 2) a standalone
application. The result for the standalone application was the same as before, which I guess
I expected. The other failure was different and a little more confusing. My question is
whether this can be caused by the JDK used to build Crunch. We are using JDK 1.7.

Failure 1)

2014-02-25 14:59:04,252 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.crunch.CrunchRuntimeException: Could not read runtime node information
at org.apache.crunch.impl.mr.run.CrunchTaskContext.<init>(CrunchTaskContext.java:48)
at org.apache.crunch.impl.mr.run.CrunchMapper.setup(CrunchMapper.java:37)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:757)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
at java.security.AccessController.doPrivileged(AccessController.java:366)
at javax.security.auth.Subject.doAs(Subject.java:572)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
Caused by: java.io.InvalidClassException: org.apache.crunch.types.writable.Writables$4; local class incompatible: stream classdesc serialVersionUID = 5855040850180329703, local class serialVersionUID = 4130080921736307351

Failure 2)
2014-02-25 14:59:33,926 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.IncompatibleClassChangeError: org/apache/hadoop/mapreduce/JobContext.getConfiguration()Lorg/apache/hadoop/conf/Configuration;
at org.apache.crunch.impl.mr.run.CrunchTaskContext.<init>(CrunchTaskContext.java:42)
at org.apache.crunch.impl.mr.run.CrunchMapper.setup(CrunchMapper.java:37)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:757)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
at java.security.AccessController.doPrivileged(AccessController.java:366)
at javax.security.auth.Subject.doAs(Subject.java:572)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)


JDK
jsparks@jupiter:/lus/dal/jsparks/example/tmp/hdlogs.jsparks/userlogs> java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)

--
Jonathan (Bill) Sparks
Software Architecture
Cray Inc.

From: Josh Wills <josh.wills@gmail.com>
Reply-To: "user@crunch.apache.org" <user@crunch.apache.org>
Date: Tuesday, February 25, 2014 1:54 PM
To: "user@crunch.apache.org" <user@crunch.apache.org>
Subject: Re: Yarnchild error : crunch-0.7.0

Yeah, try it again w/ -Dcrunch.platform=2 instead of -Dhadoop.profile=2.0

J


On Tue, Feb 25, 2014 at 11:47 AM, Bill Sparks <jsparks@cray.com> wrote:
I did the following, and also changed the pom.xml to reference the correct Hadoop version.

$ mvn clean install -Dhadoop.profile=2.0 -DskipTests

<hadoop.version>2.0.6-alpha</hadoop.version>

--
Jonathan (Bill) Sparks
Software Architecture
Cray Inc.

From: Josh Wills <jwills@cloudera.com>
Reply-To: "user@crunch.apache.org" <user@crunch.apache.org>
Date: Tuesday, February 25, 2014 1:43 PM
To: "user@crunch.apache.org" <user@crunch.apache.org>
Subject: Re: Yarnchild error : crunch-0.7.0

Hrm-- that's usually related to the API changes between hadoop1 and hadoop2. How did you build
crunch, exactly? Did you use -Dcrunch.platform=2?

J


On Tue, Feb 25, 2014 at 11:37 AM, Bill Sparks <jsparks@cray.com> wrote:
Can anyone shed some light on why I would be getting the following error when submitting a
simple Crunch wordcount example? Other Hadoop MR applications work; it just seems that Crunch
is confused about some class definitions.

I'm running hadoop-2.0.6-alpha and have built Crunch to match.

Hadoop 2.0.6-alpha
Subversion Unknown -r ca4c88898f95aaab3fd85b5e9c194ffd647c2109
Compiled by jenkins on 2013-10-30T07:19Z
From source with checksum 95e88b2a9589fa69d6d5c1dbd48d4e


2014-02-25 13:23:00,049 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 268435456
2014-02-25 13:23:00,049 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 67108860; length = 16777216
2014-02-25 13:23:00,070 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.IncompatibleClassChangeError: org/apache/hadoop/mapreduce/JobContext.getConfiguration()Lorg/apache/hadoop/conf/Configuration;
at org.apache.crunch.impl.mr.run.CrunchTaskContext.<init>(CrunchTaskContext.java:42)
at org.apache.crunch.impl.mr.run.CrunchMapper.setup(CrunchMapper.java:37)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:757)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
at java.security.AccessController.doPrivileged(AccessController.java:366)
at javax.security.auth.Subject.doAs(Subject.java:572)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)




--
Director of Data Science
Cloudera<http://www.cloudera.com>
Twitter: @josh_wills<http://twitter.com/josh_wills>



