mahout-user mailing list archives

From Camilo Lopez <cam...@camilolopez.com>
Subject Re: Custom analyzers for seq2sparse
Date Wed, 20 Apr 2011 17:25:47 GMT
Ian,

Using 3.0.x (the one that comes by default in Mahout's trunk now).
By nullary constructor, do you mean I should overload the constructor to take
no args in my own custom class?


On 2011-04-20, at 1:23 PM, Ian Helmke wrote:

> What version of lucene are you using? If you use lucene 3.0 or later,
> you can't use StandardAnalyzer as-is because it has no no-args
> constructor. You could try the mahout DefaultAnalyzer (which wraps the
> lucene analyzer in a no-argument constructor). I have gotten custom
> analyzers to work, but they need to have a nullary constructor.
> 
> 
> On Wed, Apr 20, 2011 at 12:58 PM, Camilo Lopez <camilo@camilolopez.com> wrote:
>> Hi List,
>> 
>> Trying to run custom analyzer classes, I always get an InstantiationException.
>> At first I suspected my own code, but with what is supposed to be the default value,
>> 'org.apache.lucene.analysis.standard.StandardAnalyzer', I still get the same exception.
>> 
>> This is the command
>> 
>> bin/mahout seq2sparse -i /htmless_articles_seq -o /htmless_articles_vectors_1 -ng 3 -x35 -wt tfidf -a org.apache.lucene.analysis.standard.StandardAnalyzer -nv
>> 
>> 
>> Looking a little deeper (i.e. catching the InstantiationException and throwing its getCause()),
>> it turns out the problem is caused by a NullPointerException:
>> 
>> Exception in thread "main" java.lang.NullPointerException
>>        at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:211)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>        at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:52)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:187)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>> 
>> 
>> Am I missing something? Is there another way to create/use custom analyzers in seq2sparse?
>> 
>> 
>> 
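
A minimal sketch of the nullary-constructor point discussed in this thread, using only the JDK. VersionedAnalyzer and NoArgAnalyzer are hypothetical stand-ins, not Lucene or Mahout classes, and the version string is illustrative. The sketch assumes the analyzer class named with -a is instantiated reflectively with no arguments, which is what the InstantiationException suggests; it does not reproduce Mahout's actual code.

```java
public class NullaryConstructorDemo {

    /** Hypothetical stand-in for an analyzer whose only constructor needs an argument. */
    public static class VersionedAnalyzer {
        private final String version;

        public VersionedAnalyzer(String version) {
            this.version = version;
        }

        public String version() {
            return version;
        }
    }

    /** Hypothetical wrapper: supplies the argument itself and exposes a public no-arg constructor. */
    public static class NoArgAnalyzer extends VersionedAnalyzer {
        public NoArgAnalyzer() {
            super("3.0"); // assumed value, for illustration only
        }
    }

    /**
     * Tries to create an instance the way a no-argument reflective call would.
     * Class.newInstance() is deprecated in newer JDKs but is used here because it
     * throws InstantiationException when the class has no nullary constructor.
     */
    @SuppressWarnings("deprecation")
    public static boolean canInstantiate(Class<?> clazz) {
        try {
            clazz.newInstance();
            return true;
        } catch (InstantiationException | IllegalAccessException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("VersionedAnalyzer: " + canInstantiate(VersionedAnalyzer.class));
        System.out.println("NoArgAnalyzer:     " + canInstantiate(NoArgAnalyzer.class));
    }
}
```

Under these assumptions, a custom analyzer would follow the NoArgAnalyzer pattern: keep whatever arguments the underlying analyzer needs inside a public no-arg constructor, and pass the wrapper's class name to -a.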

