mahout-user mailing list archives

From Yang Sun <soushare....@gmail.com>
Subject Re: New to mahout
Date Fri, 19 Mar 2010 21:12:27 GMT
Hi Deneche,

I just tested it. With the KDD dataset, everything works fine. When I try to
use my own dataset, the BuildForest class throws an exception:

Error: null

I have attached my dataset, which has 283 numerical features; the last column
is the class label (1 or 0).
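
Since the error message gives no detail, one way to rule out a malformed input
file before rerunning BuildForest is to check every row against the layout
described above. The sketch below is only an illustration: it assumes a
comma-separated file with no header row, and check_dataset is a hypothetical
helper, not part of Mahout.

```python
import csv
import io


def check_dataset(lines, n_features=283, labels=("0", "1"), delimiter=","):
    """Return (line_number, problem) pairs for rows that break the expected layout.

    Assumed layout: n_features numeric columns followed by one label column.
    """
    problems = []
    for lineno, row in enumerate(csv.reader(lines, delimiter=delimiter), start=1):
        if not row:
            problems.append((lineno, "empty line"))
            continue
        if len(row) != n_features + 1:
            problems.append(
                (lineno, f"expected {n_features + 1} columns, found {len(row)}")
            )
            continue
        *features, label = (cell.strip() for cell in row)
        # Report only the first non-numeric feature in a row.
        for col, cell in enumerate(features, start=1):
            try:
                float(cell)
            except ValueError:
                problems.append((lineno, f"column {col} is not numeric: {cell!r}"))
                break
        if label not in labels:
            problems.append((lineno, f"unexpected label {label!r}"))
    return problems


# Tiny in-memory example with 2 features instead of 283.
sample = io.StringIO("0.5,1.2,0\n0.1,abc,1\n0.3,0.4\n")
for lineno, problem in check_dataset(sample, n_features=2):
    print(lineno, problem)
```

For a real file, pass an open file object instead of the in-memory sample,
for example check_dataset(open(path, newline="")), where path is wherever the
dataset is stored. An empty result only means the layout matches; it does not
by itself explain the exception.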

Do you know why I got this exception?

Thanks
Yang
On Fri, Mar 19, 2010 at 12:10 AM, deneche abdelhakim <a_deneche@yahoo.fr> wrote:

> Hi Yang,
>
> The changes will be available in Mahout 0.4, but they are already
> committed, so you could just get the code from svn.
> By the way, I updated the wiki to explain how to use the new code; take a
> look here:
> http://cwiki.apache.org/MAHOUT/partial-implementation.html
>
> Hope you find it useful
>
> Deneche
> --- On Fri 19.3.10, Yang Sun <soushare.com@gmail.com> wrote:
>
> > From: Yang Sun <soushare.com@gmail.com>
> > Subject: Re: New to mahout
> > To: mahout-user@lucene.apache.org
> > Date: Friday, 19 March 2010, 00:11
> > Hi deneche,
> > I noticed that Mahout 0.3 has been released. Is the random forest class
> > ready to output the model? Is it still called
> > org.apache.mahout.df.BreimanExample?
> >
> > Thanks,
> >
> > On Fri, Mar 12, 2010 at 3:01 AM, deneche abdelhakim <a_deneche@yahoo.fr> wrote:
> >
> > > Yes, there is still a lot of work to do =P
> > > As Ted said, the Decision Forests classifier should ultimately have an
> > > interface similar to that of the other Mahout classifiers.
> > >
> > >
> > > > 1. how can I output the model and how can I use the trained model to
> > > > predict
> > >
> > > I should commit a patch really soon (this Saturday?) that will allow you
> > > to save the trained model and use it to classify new data.
> > >
> > >
> > > > 2. There is also no option to specify number of random
> > > > features for each tree. How can I adjust that parameter?
> > >
> > > The -sl parameter lets you specify the number of features the
> > > trainer randomly selects at each tree node. Is this what you are
> > > looking for?
> > >
> > >
> > > > I think there are still not enough parameter options to use, at least
> > > > not as many as R's
> > >
> > > I'd love to hear any suggestions or additions you want me to make. I
> > > already have a lot of features I want to add, but I could use your (the
> > > users') feedback to know which feature I should start working on first. =D
> > >
> > > --- On Fri 12.3.10, Cui tony <tony.cui1983@gmail.com> wrote:
> > >
> > > > From: Cui tony <tony.cui1983@gmail.com>
> > > > Subject: Re: New to mahout
> > > > To: mahout-user@lucene.apache.org
> > > > Date: Friday, 12 March 2010, 02:43
> > > > 1. You can check the example Java code in trunk: BuildForest.java and
> > > > TestForest.java.
> > > >
> > > > 2. I think there are still not enough parameter options to use, at least
> > > > not as many as R's.
> > > >
> > > > 2010/3/12 Yang Sun <soushare.com@gmail.com>
> > > >
> > > > > Thanks for the reply. The trunk version runs without any problem. I still
> > > > > have a couple of questions about the method.
> > > > >
> > > > > 1. How can I output the model, and how can I use the trained model to
> > > > > predict classes of new data? I saw the options of the class:
> > > > >
> > > > > Options
> > > > >  --data (-d) path                  Data path
> > > > >  --dataset (-ds) dataset           Dataset path
> > > > >  --iterations (-i) numIterations   Number of times to repeat the test
> > > > >  --nbtrees (-t) nbtrees            Number of trees to grow, each iteration
> > > > >  --help (-h)                       Print out help
> > > > >
> > > > > It seems there is no option for specifying an output directory on Hadoop.
> > > > >
> > > > > 2. There is also no option to specify the number of random features for
> > > > > each tree. How can I adjust that parameter?
> > > > >
> > > > > Thanks
> > > > >
> > > > > On Thu, Mar 11, 2010 at 10:29 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
> > > > >
> > > > > > Try using the trunk version.  We are about to release 0.3 and it has
> > > > > > significant improvements.
> > > > > >
> > > > > > On Thu, Mar 11, 2010 at 9:40 AM, Yang Sun <soushare.com@gmail.com> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > This is the first time I have tried to use Mahout, but the Breiman
> > > > > > > example gave me the following exception:
> > > > > > >
> > > > > > > [localhost]$ hadoop jar examples/target/mahout-examples-0.2.job org.apache.mahout.df.BreimanExample -d test_data/glass.data -ds test_data/glass.info -i 10 -t 100
> > > > > > > 10/03/11 09:26:07 INFO df.BreimanExample: Iteration 0
> > > > > > > 10/03/11 09:26:07 INFO df.BreimanExample: Growing a forest with m=4
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 10%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 20%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 30%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 40%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 50%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 60%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 70%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 80%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 90%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 100%
> > > > > > > Exception in thread "main" java.lang.IllegalArgumentException: labels.length != predictions.length
> > > > > > >        at org.apache.mahout.df.ErrorEstimate.errorRate(ErrorEstimate.java:29)
> > > > > > >        at org.apache.mahout.df.BreimanExample.runIteration(BreimanExample.java:108)
> > > > > > >        at org.apache.mahout.df.BreimanExample.run(BreimanExample.java:214)
> > > > > > >        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > > > > >        at org.apache.mahout.df.BreimanExample.main(BreimanExample.java:143)
> > > > > > >        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > > > > >        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > > > > >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > > > > >        at java.lang.reflect.Method.invoke(Method.java:597)
> > > > > > >        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> > > > > > >
> > > > > > > Can anyone help me get through it?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Yang
