mahout-user mailing list archives

From Clément Notin <clement.no...@gmail.com>
Subject Re: How to launch an Hadoop Recommender Job from Java ?
Date Wed, 10 Aug 2011 13:33:40 GMT
Just to be clear, is it possible to run a Hadoop job from a machine outside
of the cluster (which is what I'm trying to do)? I'm starting to wonder...
I think I will run this on one of the cluster's machines instead.

2011/8/10 Sean Owen <srowen@gmail.com>

> You should not need to configure any classpath. Use the "job" file, which
> contains all dependencies. You can run it locally or on a cluster.
>
> 2011/8/10 Clément Notin <clement.notin@gmail.com>
>
> > Oh, I'm sorry, I thought it wasn't running on HDFS because of the local
> > /tmp/... folder. You're right, thanks!
> >
> > But (yes, I know...) the log messages show it's using LocalJobRunner, so I
> > assume it isn't running on the cluster. I have the "masters" and "slaves"
> > files on the classpath, along with core-site.xml and hdfs-site.xml, so it
> > should run there, no?
> >
> > Thanks for your help!
> >
> > 2011/8/10 Sean Owen <srowen@gmail.com>
> >
> > > I don't believe it's actually cleaned out, then. Hadoop thinks the temp
> > > directory exists from a previous run, which perhaps failed. Make sure it
> > > is deleted in HDFS. This is, at least, what the error is trying to tell
> > > you. Are you running two jobs that might both want this directory?
> > >
> > > 2011/8/10 Clément Notin <clement.notin@gmail.com>
> > >
> > > > Yes, I agree it's ugly ;)
> > > >
> > > > I tried with these parameters (split into separate array elements, of
> > > > course):
> > > >
> > > > org.apache.mahout.cf.taste.hadoop.item.RecommenderJob
> > > > -Dmapred.input.dir=mb-recouser-input/input.csv
> > > > -Dmapred.output.dir=mb-recouser-output/reco.csv
> > > > --numRecommendations 3
> > > > --booleanData true
> > > > --similarityClassname SIMILARITY_EUCLIDEAN_DISTANCE
> > > >
> > > > But I'm getting an error:
> > > >
> > > >  INFO [2011-08-10 14:52:05,195] (JobClient.java:871) - Cleaning up the staging area file:/tmp/hadoop-clement/mapred/staging/clement1957523084/.staging/job_local_0001
> > > > org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory temp/itemIDIndex already exists
> > > >
> > > > This happens even if I clean the /tmp/hadoop-clement/ folder beforehand...
> > > > And it doesn't seem to run on the cluster.
> > > >
> > > > 2011/8/10 Sean Owen <srowen@gmail.com>
> > > >
> > > > > You could just run the main() method with an array of the same
> > > > > arguments you passed on the command line. It's a little ugly, but it
> > > > > works.
> > > > >
> > > > > 2011/8/10 Clément Notin <clement.notin@gmail.com>
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > I've managed to run a recommender on Hadoop using the command line:
> > > > > > /bin/mahout org.apache.mahout.cf.taste.hadoop.item.RecommenderJob --input .....
> > > > > > I'm happy with it, but now I want to launch this from Java.
> > > > > >
> > > > > > What is the easiest way to do this? I tried running MahoutDriver, but
> > > > > > it runs locally, whereas I want to launch the job on a Hadoop cluster.
> > > > > >
> > > > > > Regards.
> > > > > >
> > > > > > --
> > > > > > *Clément **Notin*



-- 
*Clément **Notin*
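
For reference, the suggestion in the thread (call main() with the same arguments as on the command line) could be sketched roughly as follows. This is only an illustration, not a tested recipe: the class RecommenderJobArgs and its methods are hypothetical helpers, and the host names and ports are placeholders that would have to match the real cluster. The two extra properties, fs.default.name and mapred.job.tracker, are an assumption about why the job used LocalJobRunner: as far as I know, Hadoop of that era falls back to the local runner when mapred.job.tracker is left at its default value of "local", and that property normally lives in mapred-site.xml, which is not among the files listed in the thread. The -D options go first, as they do in the parameters quoted in the thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: only builds the String[] that would be handed to
// RecommenderJob.main(); it does not depend on Hadoop or Mahout itself.
public class RecommenderJobArgs {

    // Emits "-Dkey=value" for each Hadoop property, then "--name value"
    // for each job option, preserving insertion order.
    static String[] build(Map<String, String> hadoopProps, Map<String, String> jobOptions) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : hadoopProps.entrySet()) {
            args.add("-D" + e.getKey() + "=" + e.getValue());
        }
        for (Map.Entry<String, String> e : jobOptions.entrySet()) {
            args.add("--" + e.getKey());
            args.add(e.getValue());
        }
        return args.toArray(new String[0]);
    }

    // Arguments taken from the thread, plus two cluster properties whose
    // values are placeholders (assumptions, not real hosts).
    static String[] exampleArgs() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.default.name", "hdfs://namenode.example.com:9000");
        props.put("mapred.job.tracker", "jobtracker.example.com:9001");
        props.put("mapred.input.dir", "mb-recouser-input/input.csv");
        props.put("mapred.output.dir", "mb-recouser-output/reco.csv");

        Map<String, String> options = new LinkedHashMap<>();
        options.put("numRecommendations", "3");
        options.put("booleanData", "true");
        options.put("similarityClassname", "SIMILARITY_EUCLIDEAN_DISTANCE");

        return build(props, options);
    }

    public static void main(String[] unused) {
        String[] args = exampleArgs();
        System.out.println(String.join(" ", args));
        // With the Mahout "job" file on the classpath, the array would then
        // be passed on, for example:
        // org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.main(args);
    }
}
```

Whether submitting from a machine outside the cluster works this way also depends on network access to the cluster and on the job jar being available to it, so treat the property values above as a starting point to check against the cluster's own configuration files.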
