mahout-user mailing list archives

From Vijay Santhanam <vijay.santha...@gmail.com>
Subject Re: 20news
Date Mon, 04 Jul 2011 10:18:38 GMT
I tried deleting all the folders from the test and train data except for
alt.atheism, but I get the identical error.
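
For reference, the exception in the quoted trace below is raised by a label lookup (ConfusionMatrix.getCount, via Preconditions.checkArgument). Here is a minimal sketch of that kind of check, written in Python rather than Mahout's Java. It is not Mahout's implementation: the class name LabelCountMatrix and its methods are hypothetical, and it simply assumes the matrix is built from a fixed set of labels. How Mahout derives that set is not shown in the trace.

```python
class LabelCountMatrix:
    """Toy confusion matrix over a fixed label set (hypothetical, not Mahout's class)."""

    def __init__(self, labels, default_label="unknown"):
        # De-duplicate while keeping order, then give every label an index.
        ordered = list(dict.fromkeys(list(labels) + [default_label]))
        self.index = {label: i for i, label in enumerate(ordered)}
        size = len(ordered)
        self.counts = [[0] * size for _ in range(size)]

    def add_instance(self, correct_label, classified_label):
        for label in (correct_label, classified_label):
            if label not in self.index:
                # Same kind of failure as in the trace: the label is not in the set.
                raise ValueError(f"Label not found: {label}")
        self.counts[self.index[correct_label]][self.index[classified_label]] += 1


matrix = LabelCountMatrix(["sci.med", "sci.space"])
matrix.add_instance("sci.med", "sci.med")  # known label, so it is counted

try:
    matrix.add_instance("alt.atheism", "sci.med")  # label missing from the set
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

In this sketch the lookup fails whenever the label set handed to the matrix does not contain the label of the document being scored, whatever the model itself contains.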

I might try debugging the problem in Eclipse rather than from the command line,
but Eclipse doesn't quite want to work either.
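
The sanity checks described in the quoted messages below (the expected work folders exist, and each folder inside bayes-model has a success marker) can be scripted. This is a rough sketch under some assumptions: the folder names are taken from the directory listing below, the marker file name defaults to Hadoop's usual `_SUCCESS`, and the function name check_work_dir is made up for illustration.

```python
from pathlib import Path

# Folder names taken from the directory listing quoted in this thread.
EXPECTED = ["bayes-train-input", "bayes-test-input", "bayes-model"]


def check_work_dir(work_dir, marker="_SUCCESS"):
    """Return a list of human-readable problems found under work_dir."""
    work = Path(work_dir)
    problems = []
    for name in EXPECTED:
        if not (work / name).is_dir():
            problems.append(f"missing folder: {name}")
    model = work / "bayes-model"
    if model.is_dir():
        # Each direct subfolder of the model is expected to hold a marker file.
        for sub in sorted(p for p in model.iterdir() if p.is_dir()):
            if not (sub / marker).exists():
                problems.append(f"no {marker} in {sub.name}")
    return problems


if __name__ == "__main__":
    # Assumed location; the log in this thread shows examples/bin/work/20news-bydate,
    # so adjust the path to wherever the work folders actually live.
    for problem in check_work_dir("examples/bin/work") or ["no problems found"]:
        print(problem)
```

An empty result only means the layout looks complete. It says nothing about whether the model contents are valid.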


On Mon, Jul 4, 2011 at 8:02 PM, Vijay Santhanam <vijay.santhanam@gmail.com> wrote:

> Thanks anyway Sergey. Could you perhaps upload your bayes-model folder so I
> could try that out?
>
>
>
> On Mon, Jul 4, 2011 at 7:57 PM, Sergey Bartunov <sbos.net@gmail.com> wrote:
>
>> Well, that's strange. Sorry, I can't help you at the moment; maybe
>> someone else on the mailing list can.
>>
>> On 4 July 2011 13:49, Vijay Santhanam <vijay.santhanam@gmail.com> wrote:
>> > Hi Sergey,
>> >
>> > Yes, there were no errors.
>> >
>> > And all the model data seems to have been populated into the bayes-model folder.
>> > Also, each main folder in bayes-model has a _SUCCESS file.
>> >
>> > See the tarball of my trained model here:
>> > http://dl.dropbox.com/u/7881451/bayes-model.tar.gz
>> > Please compare it to your trained model if possible; I would like to know if
>> > it's different in any way.
>> >
>> > Perhaps it's corrupted in some way.
>> >
>> > Thanks,
>> > Vijay
>> >
>> >
>> >
>> > On Mon, Jul 4, 2011 at 7:39 PM, Sergey Bartunov <sbos.net@gmail.com> wrote:
>> >
>> >> Stop, did you _train_ the classifier successfully before running the
>> >> _test_?
>> >>
>> >> On 4 July 2011 13:30, Vijay Santhanam <vijay.santhanam@gmail.com> wrote:
>> >> > Hi Sergey,
>> >> >
>> >> > I've tried both using the sh script and following the instructions at
>> >> > https://cwiki.apache.org/MAHOUT/twenty-newsgroups.html, as you suggested.
>> >> > Both return the same results.
>> >> >
>> >> > I've uploaded my bayes-test-input folder to Dropbox; the first file is here:
>> >> > http://dl.dropbox.com/u/7881451/bayes-test-input/alt.atheism.txt
>> >> >
>> >> > Thanks,
>> >> > Vijay
>> >> >
>> >> > On Mon, Jul 4, 2011 at 7:23 PM, Sergey Bartunov <sbos.net@gmail.com> wrote:
>> >> >
>> >> >> Paste your bayes-test-input file somewhere.
>> >> >>
>> >> >> On 4 July 2011 13:20, Sergey Bartunov <sbos.net@gmail.com> wrote:
>> >> >> > Yes, I worked WITH hadoop, but there should be no difference.
>> >> >> >
>> >> >> > Why do you use examples/bin/build/20news-bayes.sh instead of running
>> >> >> > bin/mahout directly? Is it the same?
>> >> >> >
>> >> >> > On 4 July 2011 13:12, Vijay Santhanam <vijay.santhanam@gmail.com> wrote:
>> >> >> >> Thanks Sergey,
>> >> >> >>
>> >> >> >> I'm still receiving the same error after following those steps.
>> >> >> >> I've chosen not to use hadoop - does yours work WITH hadoop?
>> >> >> >>
>> >> >> >> A few bits of info that might be relevant.
>> >> >> >>
>> >> >> >> My examples/bin/work folder contains the expected folders from test data
>> >> >> >> preparation and training...
>> >> >> >> drwxr-xr-x@ 22 Vijay  staff  748 18 Mar  2003 20news-bydate-test
>> >> >> >> drwxr-xr-x@ 22 Vijay  staff  748 18 Mar  2003 20news-bydate-train
>> >> >> >> drwxr-xr-x   3 Vijay  staff  102  4 Jul 19:03 bayes-model
>> >> >> >> drwxr-xr-x  22 Vijay  staff  748  4 Jul 18:20 bayes-test-input
>> >> >> >> drwxr-xr-x  22 Vijay  staff  748  4 Jul 17:49 bayes-train-input
>> >> >> >>
>> >> >> >>
>> >> >> >> I appreciate your help. Do you have any other suggestions?
>> >> >> >>
>> >> >> >> Regards,
>> >> >> >> Vijay
>> >> >> >>
>> >> >> >>
>> >> >> >> On Mon, Jul 4, 2011 at 6:58 PM, Sergey Bartunov <sbos.net@gmail.com> wrote:
>> >> >> >>
>> >> >> >>> When I started with Mahout I had the same errors. In my case, I just
>> >> >> >>> didn't run PrepareTwentyNewsgroups. You may try to carefully repeat
>> >> >> >>> all the steps from https://cwiki.apache.org/MAHOUT/twenty-newsgroups.html
>> >> >> >>>
>> >> >> >>> On 4 July 2011 12:52, Vijay Santhanam <vijay.santhanam@gmail.com> wrote:
>> >> >> >>> > Hi All,
>> >> >> >>> >
>> >> >> >>> > I'm new to Mahout and I'm interested in experimenting with its classifiers.
>> >> >> >>> >
>> >> >> >>> > Right now, I'm just trying to get up and running with the demos and
>> >> >> >>> > examples.
>> >> >> >>> >
>> >> >> >>> > After checking out the Mahout trunk, I tried running the 20news classification
>> >> >> >>> > example, but after running the ./examples/bin/build/20news-bayes.sh
>> >> >> >>> > script I get the following error during the classification phase.
>> >> >> >>> >
>> >> >> >>> > Does anyone else get the same thing? Or have any recommendations about how
>> >> >> >>> > to fix it?
>> >> >> >>> > I'd just like to get a sample classifier working before I embark on my own
>> >> >> >>> > classification journey.
>> >> >> >>> >
>> >> >> >>> >
>> >> >> >>> > INFO: Loading model from:
>> >> >> >>> > {basePath=examples/bin/work/20news-bydate/bayes-model, classifierType=bayes,
>> >> >> >>> > alpha_i=1.0, dataSource=hdfs, gramSize=1, verbose=false, encoding=UTF-8,
>> >> >> >>> > defaultCat=unknown,
>> >> >> >>> > testDirPath=examples/bin/work/20news-bydate/bayes-test-input}
>> >> >> >>> > Jul 4, 2011 6:28:25 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: Testing Bayes Classifier
>> >> >> >>> > Jul 4, 2011 6:28:27 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: Read 50000 feature weights
>> >> >> >>> > Jul 4, 2011 6:28:27 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: Read 100000 feature weights
>> >> >> >>> > Jul 4, 2011 6:28:28 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: 193370.88331085522
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: rec.sport.baseball -129829.34738930278 531784.7805631821 -0.2441388925268003
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: sci.crypt -193023.42370049533 531784.7805631821 -0.3629728242618669
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: rec.sport.hockey -167853.6159738822 531784.7805631821 -0.31564200802459647
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: talk.politics.guns -203524.0148974065 531784.7805631821 -0.3827187658170024
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: soc.religion.christian -163900.9258713857 531784.7805631821 -0.308209132457322
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: sci.electronics -142854.1677345925 531784.7805631821 -0.26863154598614886
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: comp.os.ms-windows.misc -531784.7805631821 531784.7805631821 -1.0
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: misc.forsale -143454.70176448982 531784.7805631821 -0.26976082619845826
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: talk.religion.misc -139428.73484148504 531784.7805631821 -0.2621901565024562
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: alt.atheism -139569.06867597546 531784.7805631821 -0.2624540486626301
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: comp.windows.x -178029.10523376046 531784.7805631821 -0.33477660839638973
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: talk.politics.mideast -193075.00789450994 531784.7805631821 -0.36306982627452317
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: comp.sys.ibm.pc.hardware -138410.02049984262 531784.7805631821 -0.2602745049477736
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: comp.sys.mac.hardware -125200.9927438868 531784.7805631821 -0.23543545682389364
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: sci.space -192437.0009266271 531784.7805631821 -0.3618700797018455
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: rec.motorcycles -143142.20855440624 531784.7805631821 -0.26917319522159455
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: rec.autos -141800.97549909537 531784.7805631821 -0.2666510601317365
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: comp.graphics -166882.18654471825 531784.7805631821 -0.3138152738556811
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: talk.politics.misc -165196.84193278523 531784.7805631821 -0.3106460507535303
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: sci.med -192698.5183245711 531784.7805631821 -0.36236185270382393
>> >> >> >>> > Exception in thread "main" java.lang.IllegalArgumentException: Label not found: alt.atheism from
>> >> >> >>> >  at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
>> >> >> >>> >  at org.apache.mahout.classifier.ConfusionMatrix.getCount(ConfusionMatrix.java:93)
>> >> >> >>> >  at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:113)
>> >> >> >>> >  at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:117)
>> >> >> >>> >  at org.apache.mahout.classifier.ConfusionMatrix.addInstance(ConfusionMatrix.java:85)
>> >> >> >>> >  at org.apache.mahout.classifier.ResultAnalyzer.addInstance(ResultAnalyzer.java:67)
>> >> >> >>> >  at org.apache.mahout.classifier.bayes.TestClassifier.classifySequential(TestClassifier.java:244)
>> >> >> >>> >  at org.apache.mahout.classifier.bayes.TestClassifier.main(TestClassifier.java:177)
>> >> >> >>> >  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >> >> >>> >  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >> >> >>> >  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >> >> >>> >  at java.lang.reflect.Method.invoke(Method.java:597)
>> >> >> >>> >  at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>> >> >> >>> >  at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>> >> >> >>> >  at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
>> >> >> >>> >
>> >> >> >>> >
>> >> >> >>> > Any help is greatly appreciated.
>> >> >> >>> >
>> >> >> >>> > Regards,
>> >> >> >>> > --
>> >> >> >>> >  Vijay Santhanam
>> >> >> >>> >  Software Engineer
>> >> >> >>> >
>> >> >> >>>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> --
>> >> >> >>  Vijay Santhanam
>> >> >> >>  Software Engineer
>> >> >> >>  http://au.linkedin.com/in/vijaysanthanam
>> >> >> >>  0407525087
>> >> >> >>



-- 
 Vijay Santhanam
 Software Engineer
 http://au.linkedin.com/in/vijaysanthanam
 0407525087
