hadoop-mapreduce-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ted Yu <yuzhih...@gmail.com>
Subject Re: Using Mahout 1.0-SNAPSHOT with yarn cluster
Date Fri, 09 Jan 2015 15:23:57 GMT
Pardon me, setJarByClass() should point to your class.
You can pass comma separated list of jars (including mahout jar) this way:

conf.set("tmpjars", list-of-jars);

Cheers

On Fri, Jan 9, 2015 at 7:17 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> In your driver, you can call:
>
>     job.setJarByClass(theclass.class);
>
> where theclass can be one of the classes in mahout jar.
>
> FYI
>
> On Fri, Jan 9, 2015 at 1:30 AM, mw <mw@plista.com> wrote:
>
>>  Hello,
>>
>> i am working on a web application that should execute lda on a external
>> yarn cluster.
>>
>> I am uploading all the relevant sequence files onto the yarn cluter.
>> This is how it try to remotely execute lda on the cluster.
>>
>>         try {
>>             ugi.doAs(new PrivilegedExceptionAction<Void>() {
>>                 public Void run() throws Exception {
>>                     Configuration hdoopConf = new Configuration();
>>                     hdoopConf.set("fs.defaultFS",
>> "hdfs://xxx.xxx.xxx.xxx:9000/user/xx");
>>                     hdoopConf.set("yarn.resourcemanager.hostname",
>> "xxx.xxx.xxx.xxx");
>>                     hdoopConf.set("mapreduce.framework.name", "yarn");
>>                     hdoopConf.set("mapred.framework.name", "yarn");
>>                     hdoopConf.set("mapred.job.tracker",
>> "xxx.xxx.xxx.xxx");
>>                     hdoopConf.set("dfs.permissions.enabled", "false");
>>                     hdoopConf.set("hadoop.job.ugi", "xx");
>> hdoopConf.set("mapreduce.jobhistory.address","xxx.xxx.xxx.xxx:10020" );
>>                     CVB0Driver driver = new CVB0Driver();
>>                     try {
>>                         driver.run(hdoopConf,
>> sparseVectorIn.suffix("/matrix"),
>>                                 topicsOut, k, numTerms,
>> doc_topic_smoothening, term_topic_smoothening,
>>                                 maxIter, iteration_block_size,
>> convergenceDelta,
>> sparseVectorIn.suffix("/dictionary.file-0"), topicsOut.suffix("
>> */DocumentTopics/*"), sparseVectorIn,
>>                                 seed, testFraction, numTrainThreads,
>> numUpdateThreads, maxItersPerDoc,
>>                                 numReduceTasks, backfillPerplexity);
>>                     } catch (ClassNotFoundException e) {
>>                         e.printStackTrace();
>>                     } catch (InterruptedException e) {
>>                         e.printStackTrace();
>>                     }
>>                     return null;
>>                 }
>>             });
>>         } catch (InterruptedException e) {
>>             e.printStackTrace();
>>         }
>>
>> I am getting the following error message:
>>
>> Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>     at java.lang.Class.forName0(Native Method)
>>     at java.lang.Class.forName(Class.java:344)
>>     at
>> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
>>     at
>> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
>>     at
>> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
>>     at
>> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1929)
>>     at
>> org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:837)
>>     at
>> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:983)
>>     at
>> org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
>>     at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:80)
>>     at
>> org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:675)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:422)
>>     at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
>>
>> Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>     at java.lang.Class.forName0(Native Method)
>>     at java.lang.Class.forName(Class.java:344)
>>     at
>> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
>>     at
>> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
>>     at
>> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
>>     at
>> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1929)
>>     at
>> org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:837)
>>     at
>> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:983)
>>     at
>> org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
>>     at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:80)
>>     at
>> org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:675)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:422)
>>     at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
>>
>> Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>     at java.lang.Class.forName0(Native Method)
>>     at java.lang.Class.forName(Class.java:344)
>>     at
>> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
>>     at
>> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
>>     at
>> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
>>     at
>> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1929)
>>     at
>> org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:837)
>>     at
>> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:983)
>>     at
>> org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
>>     at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:80)
>>     at
>> org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:675)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:422)
>>     at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
>>
>> Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>     at java.lang.Class.forName0(Native Method)
>>     at java.lang.Class.forName(Class.java:344)
>>     at
>> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
>>     at
>> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
>>     at
>> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
>>     at
>> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1929)
>>     at
>> org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:837)
>>     at
>> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:983)
>>     at
>> org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
>>     at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:80)
>>     at
>> org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:675)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:422)
>>     at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
>>
>> java.lang.InterruptedException: Failed to complete iteration 1 stage 1
>>     at
>> org.apache.mahout.clustering.lda.cvb.CVB0Driver.runIteration(CVB0Driver.java:502)
>>     at
>> org.apache.mahout.clustering.lda.cvb.CVB0Driver.run(CVB0Driver.java:319)
>>     ...
>>
>> I was told on the user-mahout mailing list that the lda jar mahout sents
>> to yarn should contain all relevant classes. Any idea why this error is
>> thrown nevertheless?
>>
>> Best,
>> Max
>>
>
>

Mime
View raw message