mahout-user mailing list archives

From Jake Mannix <jake.man...@gmail.com>
Subject Re: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
Date Thu, 10 Jun 2010 17:47:11 GMT
On Thu, Jun 10, 2010 at 10:28 AM, Kris Jack <mrkrisjack@gmail.com> wrote:
>
> Thanks very much for the help.  I looked into the problem a little deeper
> and found that the org.apache.mahout.utils.vectors.lucene.Driver was
> writing out LongWritables instead of IntWritables, so I just changed the
> code in there.  Should this code be using IntWritables or LongWritables?
>

The Lucene Driver uses long because Solr encodes uids as long.  It's kinda
backwards that Mahout wants ints and Solr wants longs, but that's the way
it is.

Maybe the Lucene Driver could take a boolean flag that says whether to
encode the keys as long or int?  Anyone have opinions on this?
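
To make the idea concrete, here is a rough sketch of what such a flag might
do.  It is plain Java with no Hadoop or Mahout types, and every name in it
(KeyNarrowing, toIntKey, encodeKey, useIntKeys) is hypothetical, not anything
that exists in the Driver today.  The one real concern it illustrates is that
narrowing a long uid to an int should fail loudly rather than silently wrap.

```java
// Hypothetical sketch only: none of these names exist in Mahout.
class KeyNarrowing {

    // Narrow a long uid to an int key.
    // Math.toIntExact throws ArithmeticException if the value does not fit.
    static int toIntKey(long uid) {
        return Math.toIntExact(uid);
    }

    // Pick the key representation based on an assumed boolean flag.
    // An if/else is used on purpose: a ternary mixing Integer and Long
    // would be promoted to long and always yield a Long.
    static Number encodeKey(long uid, boolean useIntKeys) {
        if (useIntKeys) {
            return Integer.valueOf(toIntKey(uid));
        }
        return Long.valueOf(uid);
    }

    public static void main(String[] args) {
        System.out.println(encodeKey(42L, true).getClass().getSimpleName());
        System.out.println(encodeKey(42L, false).getClass().getSimpleName());
    }
}
```

In a real patch the two branches would presumably wrap the value in an
IntWritable or a LongWritable, but that part is left out here.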


> After writing to a sequence file and running your matrix transposition
> and multiplication, I get an output called part-00000.  If I read it using
> $ mahout seqdumper --seqFile part-00000 then it outputs:
>

I would use "mahout vectordump" instead of "mahout seqdumper" and
you'll get nicer output.

  -jake
