mahout-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: bug when generating sparse vector
Date Tue, 06 Sep 2011 03:36:07 GMT
"InstantiationException" probably means that it cannot create an object from
this class. I've had some problems that caused this:
1) the class depends on other classes which are not on the classpath.
2) the class has no zero-arg constructor.
3) the class is not public.
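
As a rough illustration of cause 2, here is a minimal sketch that reproduces the failure without Lucene or Mahout. The classes NeedsArg and NoArg and the helper tryCreate are hypothetical stand-ins made up for this example; only Class.newInstance() is the real JDK call that appears in the stack trace.

```java
class InstantiationDemo {
    /** Hypothetical stand-in for an analyzer whose only constructor takes an argument. */
    public static class NeedsArg {
        private final String version;
        public NeedsArg(String version) { this.version = version; }
        public String version() { return version; }
    }

    /** Hypothetical stand-in for a class that does have a zero-arg constructor. */
    public static class NoArg {
        public NoArg() { }
    }

    /** Calls Class.newInstance() and reports the outcome, including message and cause. */
    @SuppressWarnings("deprecation") // Class.newInstance() is deprecated on newer JDKs
    public static String tryCreate(Class<?> type) {
        try {
            Object instance = type.newInstance();
            return "created " + instance.getClass().getSimpleName();
        } catch (InstantiationException | IllegalAccessException e) {
            // The message is usually just the class name, so log the cause as well.
            return e.getClass().getSimpleName() + ": " + e.getMessage()
                    + " (cause: " + e.getCause() + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCreate(NoArg.class));    // succeeds
        System.out.println(tryCreate(NeedsArg.class)); // fails: no zero-arg constructor
    }
}
```

The second call should fail with an InstantiationException, because NeedsArg offers no zero-arg constructor for newInstance() to call.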

There should be some more error messages. At the bottom of SparseVectors...
the catch block prints the stack trace you see. You might try logging
e.getCause() and e.getMessage() as well.

And now, from looking at the code: you are trying to use the Lucene trunk.
The SparseVectorsFromSequenceFiles (SVFSF) code needs a fix to support the
trunk.

SVFSF tries to make a new instance of the class from a no-arg constructor,
which the class does not have. The nearest equivalent is a constructor that
includes a Lucene index version number. Almost all of the Analyzer classes
have the same problem. In other words, it is impossible to use the Lucene
trunk with SVFSF.
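
To make the idea concrete, here is a minimal sketch of the kind of change that might work: look for a constructor that takes a version argument first, and fall back to the no-arg constructor. This is not the actual Mahout code. FakeVersion, VersionedAnalyzer, PlainAnalyzer and create are hypothetical names standing in for the Lucene version type and the Analyzer classes, so that the example runs on its own.

```java
import java.lang.reflect.Constructor;

class AnalyzerFactorySketch {
    /** Hypothetical stand-in for the Lucene version type. */
    enum FakeVersion { V_33 }

    /** Hypothetical analyzer that only has a version-taking constructor. */
    public static class VersionedAnalyzer {
        private final FakeVersion version;
        public VersionedAnalyzer(FakeVersion version) { this.version = version; }
        public FakeVersion version() { return version; }
    }

    /** Hypothetical analyzer with a plain zero-arg constructor. */
    public static class PlainAnalyzer {
        public PlainAnalyzer() { }
    }

    /**
     * Prefers a public constructor taking a FakeVersion; falls back to the
     * public zero-arg constructor. Reflection failures are rethrown unchecked
     * with the original exception kept as the cause.
     */
    static <T> T create(Class<T> type, FakeVersion version) {
        try {
            try {
                Constructor<T> ctor = type.getConstructor(FakeVersion.class);
                return ctor.newInstance(version);
            } catch (NoSuchMethodException e) {
                // No version-taking constructor: try the zero-arg one instead.
                return type.getConstructor().newInstance();
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(create(VersionedAnalyzer.class, FakeVersion.V_33).version());
        System.out.println(create(PlainAnalyzer.class, FakeVersion.V_33).getClass().getSimpleName());
    }
}
```

A real fix would also have to decide which version value to pass in, which this sketch leaves to the caller.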

A Lucene expert could change SparseVectors to handle this case. (There might
be other problems.)

Please file a JIRA to upgrade (or support) the Lucene trunk.

Lance

On Mon, Sep 5, 2011 at 6:56 PM, Walter Chang <weidezhang2007@gmail.com> wrote:

> Hi
>
> ./bin/mahout seq2sparse  -i ../../socialtvdata/socialtv-seqfiles -o
> ../../socialtvdata/socialtv-vectors -a
> org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer
>
> I'm using Lucene's Chinese analyzer to compute the tokens (tf-idf scores)
> for clustering purposes. It fails at run time (see the stack trace below).
> I'm using the contrib package of Lucene 3.3. I haven't spent time looking
> into the details. If anyone has seen this before and fixed it, that would
> be a great help to me. Thanks a lot,
>
> Weide
>
> Exception in thread "main" java.lang.InstantiationException: org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer
> at java.lang.Class.newInstance0(Class.java:340)
> at java.lang.Class.newInstance(Class.java:308)
> at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:198)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:52)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
>



-- 
Lance Norskog
goksron@gmail.com
