systemml-dev mailing list archives

From Deron Eriksson <deroneriks...@gmail.com>
Subject Re: Compatibility with MR1 Cloudera cdh4.2.1
Date Fri, 05 Feb 2016 18:26:11 GMT
Hi Ethan,

I believe your safest, cleanest bet is to wait for the fix from Matthias.
When he pushes the fix, you will see it at
https://github.com/apache/incubator-systemml/commits/master. At that point,
you can pull (git pull) the changes from GitHub to your machine and then
build with Maven utilizing the new changes.
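
For context, the failing call is JobConf.getDouble(String, double), which the CDH4.2.1 jars don't provide. One way code can avoid depending on that method is to read the raw string value and parse it itself. The following is only a minimal sketch of that idea, not necessarily the change Matthias has in mind: the class name ConfDoubles and the key names are hypothetical, and a plain java.util.Map stands in for the Hadoop configuration object so the example runs without Hadoop on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only (not part of SystemML or Hadoop).
public class ConfDoubles {

    /**
     * Reads a double from a string-valued configuration without relying on a
     * getDouble method. The Map is a stand-in (assumption) for a string lookup
     * on a configuration object. A malformed value is not swallowed: it
     * propagates as a NumberFormatException.
     */
    public static double getDouble(Map<String, String> conf, String key, double defaultValue) {
        String raw = conf.get(key);
        if (raw == null || raw.trim().isEmpty()) {
            return defaultValue; // key absent or blank: fall back to the default
        }
        return Double.parseDouble(raw.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<String, String>();
        conf.put("example.sparsity", "0.25"); // hypothetical key

        System.out.println(getDouble(conf, "example.sparsity", 1.0));
        System.out.println(getDouble(conf, "example.missing", 1.0));
    }
}
```

The sketch sticks to Java 7 syntax, since the log in this thread reports java.version=1.7.0_71.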

Alternatively, although it's not really recommended, you might be able to use
-libjars to reference the hadoop-common jar, which should be in your local
Maven repository
(.m2/repository/org/apache/hadoop/hadoop-common/2.4.1/hadoop-common-2.4.1.jar).
However, mixing jar versions usually doesn't work very well (it can lead to
other problems), so waiting for the fix is best.
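
If you do experiment with different jars, it can help to confirm which classes actually end up on the classpath. The following is a small reflection check, offered as a sketch: the class name MethodCheck is hypothetical, and the only Hadoop-specific parts are the class and method names already mentioned in this thread. It reports whether a public getDouble(String, double) is visible on a given class, without initializing that class.

```java
// Hypothetical diagnostic, for illustration only.
public class MethodCheck {

    /**
     * Returns true if the named class is loadable and exposes a public method
     * (declared or inherited) with the given name and parameter types.
     */
    public static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            // 'false' avoids running static initializers of the inspected class.
            Class<?> cls = Class.forName(className, false, MethodCheck.class.getClassLoader());
            cls.getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException e) {
            return false; // class is not on the classpath at all
        } catch (NoSuchMethodException e) {
            return false; // class found, but no such public method
        } catch (LinkageError e) {
            return false; // class found, but its dependencies could not be resolved
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the NoSuchMethodError from this thread.
        String cls = args.length > 0 ? args[0] : "org.apache.hadoop.mapred.JobConf";
        boolean present = hasMethod(cls, "getDouble", String.class, double.class);
        System.out.println(cls + ".getDouble(String, double) present: " + present);
    }
}
```

To be meaningful, it has to run with the same classpath that the failing job sees. Without Hadoop on the classpath it simply reports that the method is not present.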

Deron


On Fri, Feb 5, 2016 at 6:47 AM, Ethan Xu <ethan.yifanxu@gmail.com> wrote:

> Thank you Shirish and Deron for the suggestions. Looking forward to the fix
> from Matthias!
>
> We are using the hadoop-common shipped with CDH4.2.1, and it's on the
> classpath. I'm a bit hesitant to alter our hadoop configuration to include
> other versions since other people are using it too.
>
> Not sure if/how the following naive approach affects the program behavior,
> but I did try changing the scope of
>
> <groupId>org.apache.hadoop</groupId>
> <artifactId>hadoop-common</artifactId>
> <version>${hadoop.version}</version>
>
> in SystemML's pom.xml from 'provided' to 'compile' and rebuilt the jar
> (21MB), and it threw the same error.
>
> By the way this is in pom.xml lines 65-72:
> <properties>
>           <hadoop.version>2.4.1</hadoop.version>
>           <antlr.version>4.3</antlr.version>
>           <spark.version>1.4.1</spark.version>
>
>           <!-- OS-specific JVM arguments for running integration tests -->
>           <integrationTestExtraJVMArgs />
> </properties>
>
> Am I supposed to modify the hadoop.version before build?
>
> Thanks again,
>
> Ethan
>
>
>
> On Fri, Feb 5, 2016 at 2:29 AM, Deron Eriksson <deroneriksson@gmail.com> wrote:
>
> > Hi Matthias,
> >
> > Glad to hear the fix is simple. Mixing jar versions sometimes is not very
> > fun.
> >
> > Deron
> >
> >
> > On Thu, Feb 4, 2016 at 11:10 PM, Matthias Boehm <mboehm@us.ibm.com> wrote:
> >
> > > well, let's not mix different hadoop versions in the class path or
> > > client/server. If I'm not mistaken, cdh 4.x always shipped with MR v1. It's
> > > a trivial fix for us and will be in the repo tomorrow morning anyway.
> > > Thanks for catching this issue Ethan.
> > >
> > > Regards,
> > > Matthias
> > >
> > > From: Deron Eriksson <deroneriksson@gmail.com>
> > > To: dev@systemml.incubator.apache.org
> > > Date: 02/04/2016 11:04 PM
> > > Subject: Re: Compatibility with MR1 Cloudera cdh4.2.1
> > > ------------------------------
> > >
> > >
> > >
> > > Hi Ethan,
> > >
> > > Just FYI, I looked at hadoop-common-2.0.0-cdh4.2.1.jar
> > > (https://repository.cloudera.com/artifactory/repo/org/apache/hadoop/hadoop-common/2.0.0-cdh4.2.1/),
> > > since I don't see a 2.0.0-mr1-cdh4.2.1 version, and the
> > > org.apache.hadoop.conf.Configuration class in that jar doesn't appear to
> > > have a getDouble method, so using that version of hadoop-common won't work.
> > >
> > > However, the hadoop-common-2.4.1.jar
> > > (https://repository.cloudera.com/artifactory/repo/org/apache/hadoop/hadoop-common/2.4.1/)
> > > does appear to have the getDouble method. It's possible that adding that
> > > jar to your classpath may fix your problem, as Shirish pointed out.
> > >
> > > It sounds like Matthias may have another fix.
> > >
> > > Deron
> > >
> > >
> > >
> > > On Thu, Feb 4, 2016 at 6:40 PM, Matthias Boehm <mboehm@us.ibm.com> wrote:
> > >
> > > > well, we have indeed not run on MR v1 for a while now. However, I don't
> > > > want to go so far as to say we don't support it anymore. I'll fix this
> > > > particular issue by tomorrow.
> > > >
> > > > In the next couple of weeks we should run our full performance test suite
> > > > (for broad coverage) over an MR v1 cluster and systematically remove
> > > > unnecessary incompatibilities like this one. Any volunteers?
> > > >
> > > > Regards,
> > > > Matthias
> > > >
> > > > From: Ethan Xu <ethan.yifanxu@gmail.com>
> > > > To: dev@systemml.incubator.apache.org
> > > > Date: 02/04/2016 05:51 PM
> > > > Subject: Compatibility with MR1 Cloudera cdh4.2.1
> > > > ------------------------------
> > >
> > > >
> > > >
> > > >
> > > > Hello,
> > > >
> > > > I got an error when running the systemML/scripts/Univar-Stats.dml script on
> > > > a hadoop cluster (Cloudera CDH4.2.1) on a 6GB data set. Error message is at
> > > > the bottom of the email. The same script ran fine on a smaller sample
> > > > (several MB) of the same data set, when MR was not invoked.
> > > >
> > > > The main error was java.lang.NoSuchMethodError:
> > > > org.apache.hadoop.mapred.JobConf.getDouble()
> > > > Digging deeper, it looks like the CDH4.2.1 version of MR indeed didn't have
> > > > the JobConf.getDouble() method.
> > > >
> > > > The hadoop-core jar of CDH4.2.1 can be found here:
> > > > https://repository.cloudera.com/artifactory/repo/org/apache/hadoop/hadoop-core/2.0.0-mr1-cdh4.2.1/
> > > >
> > > > The calling line of SystemML is line 1194 of
> > > > https://github.com/apache/incubator-systemml/blob/master/src/main/java/org/apache/sysml/runtime/matrix/mapred/MRJobConfiguration.java
> > > >
> > > > I was wondering, if the finding is accurate, is there a potential fix, or
> > > > does this mean the current version of SystemML is not compatible with
> > > > CDH4.2.1?
> > > >
> > > > Thank you,
> > > >
> > > > Ethan
> > > >
> > > >
> > > > hadoop jar $sysDir/target/SystemML.jar -f
> > > > $sysDir/scripts/algorithms/Univar-Stats.dml -nvargs
> > > > X=$baseDirHDFS/original-coded.csv
> > > > TYPES=$baseDirHDFS/original-coded-type.csv
> > > > STATS=$baseDirHDFS/univariate-summary.csv
> > > >
> > > > 16/02/04 20:35:03 INFO api.DMLScript: BEGIN DML run 02/04/2016 20:35:03
> > > > 16/02/04 20:35:03 INFO api.DMLScript: HADOOP_HOME: null
> > > > 16/02/04 20:35:03 WARN conf.DMLConfig: No default SystemML config file (./SystemML-config.xml) found
> > > > 16/02/04 20:35:03 WARN conf.DMLConfig: Using default settings in DMLConfig
> > > > 16/02/04 20:35:04 WARN hops.OptimizerUtils: Auto-disable multi-threaded text read for 'text' and 'csv' due to thread contention on JRE < 1.8 (java.version=1.7.0_71).
> > > > SLF4J: Class path contains multiple SLF4J bindings.
> > > > SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > > > SLF4J: Found binding in [jar:file:/usr/local/explorys/datagrid/lib/slf4j-jdk14-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > > > SLF4J: Found binding in [jar:file:/usr/local/explorys/datagrid/lib/logback-classic-1.0.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > > > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> > > > 16/02/04 20:35:07 INFO api.DMLScript: SystemML Statistics:
> > > > Total execution time:        0.880 sec.
> > > > Number of executed MR Jobs:    0.
> > > >
> > > > 16/02/04 20:35:07 INFO api.DMLScript: END DML run 02/04/2016 20:35:07
> > > > Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getDouble(Ljava/lang/String;D)D
> > > >    at org.apache.sysml.runtime.matrix.mapred.MRJobConfiguration.setUpMultipleInputs(MRJobConfiguration.java:1195)
> > > >    at org.apache.sysml.runtime.matrix.mapred.MRJobConfiguration.setUpMultipleInputs(MRJobConfiguration.java:1129)
> > > >    at org.apache.sysml.runtime.matrix.CSVReblockMR.runAssignRowIDMRJob(CSVReblockMR.java:307)
> > > >    at org.apache.sysml.runtime.matrix.CSVReblockMR.runAssignRowIDMRJob(CSVReblockMR.java:289)
> > > >    at org.apache.sysml.runtime.matrix.CSVReblockMR.runJob(CSVReblockMR.java:275)
> > > >    at org.apache.sysml.lops.runtime.RunMRJobs.submitJob(RunMRJobs.java:257)
> > > >    at org.apache.sysml.lops.runtime.RunMRJobs.prepareAndSubmitJob(RunMRJobs.java:143)
> > > >    at org.apache.sysml.runtime.instructions.MRJobInstruction.processInstruction(MRJobInstruction.java:1500)
> > > >    at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:309)
> > > >    at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:227)
> > > >    at org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:169)
> > > >    at org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:146)
> > > >    at org.apache.sysml.api.DMLScript.execute(DMLScript.java:676)
> > > >    at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:338)
> > > >    at org.apache.sysml.api.DMLScript.main(DMLScript.java:197)
> > > >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > > >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > >    at java.lang.reflect.Method.invoke(Method.java:606)
> > > >    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
>
> --
> Yifan "Ethan" Xu, PhD
>
> Data Scientist / Statistician
> Explorys, IBM Watson Health
>
> Adjunct Faculty
> Department of Epidemiology and Biostatistics
> Case Western Reserve University
>
> --------------
> Email: ethan.yifanxu@gmail.com
> Phone: (607) 760-6817
> --------------
>
