mahout-user mailing list archives

From Ryan Josal <r...@josal.com>
Subject SequenceFileVectorWriter key class
Date Mon, 25 Mar 2013 19:55:11 GMT
Hi all,

  While looking for a fix for the type mismatch between the output of
lucene.vector and the input of cvb (LDA), I found that
org.apache.mahout.utils.vectors.io.SequenceFileVectorWriter, in the
mahout integration source, assumes that the key class of the
SequenceFile.Writer passed to its constructor is always LongWritable.
I've changed it to use reflection to create a key of the class the
writer expects and cast to it.  With that in place, switching
vectors.lucene.Driver to IntWritable keys is a one-line change, and the
key class could also be set with a parameter if needed.  As far as I
can tell, this class is only used for creating vectors from Lucene and
ARFF.
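
The idea can be sketched roughly as follows. This is a minimal,
self-contained illustration and not the actual patch: LongKey and
IntKey are hypothetical stand-ins for Hadoop's LongWritable and
IntWritable so the example runs without Hadoop on the classpath, and
newKey is an illustrative helper name. In the real class, the key class
would presumably come from the writer's getKeyClass() rather than being
passed in directly.

```java
public class KeyClassSketch {

    // Hypothetical common interface; stands in for the shared behaviour
    // of Hadoop's LongWritable and IntWritable.
    public interface Key {
        void set(long value);
        long get();
    }

    // Stand-in for LongWritable.
    public static class LongKey implements Key {
        private long value;
        public LongKey() { }
        @Override public void set(long value) { this.value = value; }
        @Override public long get() { return value; }
    }

    // Stand-in for IntWritable; rejects values that do not fit in an int.
    public static class IntKey implements Key {
        private int value;
        public IntKey() { }
        @Override public void set(long value) { this.value = Math.toIntExact(value); }
        @Override public long get() { return value; }
    }

    // Create a key of whatever class the output expects, instead of
    // hard-coding one key type, then fill in the record number.
    public static Key newKey(Class<? extends Key> keyClass, long recordNum)
            throws ReflectiveOperationException {
        Key key = keyClass.getDeclaredConstructor().newInstance();
        key.set(recordNum);
        return key;
    }

    public static void main(String[] args) throws Exception {
        // The same code path serves both key types.
        Key asInt = newKey(IntKey.class, 0L);
        Key asLong = newKey(LongKey.class, 1L);
        System.out.println(asInt.getClass().getSimpleName() + " " + asInt.get());
        System.out.println(asLong.getClass().getSimpleName() + " " + asLong.get());
    }
}
```

With something along these lines, the caller only has to pick the key
class when it builds the writer; the vector writer itself no longer
needs to know which one is in use.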

Is this useful to anyone else?

Ryan
