mahout-user mailing list archives

From Joel Reymont <joe...@gmail.com>
Subject lda on lucene vectors
Date Fri, 11 Mar 2011 19:47:34 GMT
Folks,

How do I run LDA on Lucene vectors?

I extracted the vectors from Lucene with

	mahout lucene.vector --dir /tmp/bar --output /tmp/baz/part-out.vec --field body --idField docId --dictOut /tmp/baz/dict.out --norm 2

but when I ran LDA on the resulting directory, I got an error:

	mahout lda -i /tmp/baz -o /tmp/lda -k 5 -v 20

	java.io.IOException: file:/tmp/baz/dict.out not a SequenceFile

I'm not sure which command I need to run to convert the output to a SequenceFile, so that I
can then run LDA on it.
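
To narrow down which files in the input directory trigger the error, here is a minimal shell sketch. It relies on the fact that Hadoop SequenceFiles begin with the three bytes "SEQ". The function names `is_sequence_file` and `list_non_sequence_files` are hypothetical helpers, not part of Mahout or Hadoop, and the check is only a header heuristic, not a full validation.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file starts with the "SEQ" magic bytes
# that Hadoop writes at the beginning of every SequenceFile.
is_sequence_file() {
    [ "$(head -c 3 "$1" 2>/dev/null)" = "SEQ" ]
}

# Hypothetical helper: print every regular file in the given directory
# that does not carry the SequenceFile header.
list_non_sequence_files() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        is_sequence_file "$f" || printf '%s\n' "$f"
    done
}
```

Running `list_non_sequence_files /tmp/baz` should list /tmp/baz/dict.out, given the exception above. If that is the only file listed, one thing to try (an assumption, not a confirmed fix) is pointing --dictOut at a path outside the directory passed to -i.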

	Thanks in advance, Joel

--------------------------------------------------------------------------
- for hire: mac osx device driver ninja, kernel extensions and usb drivers
---------------------+------------+---------------------------------------
http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont
---------------------+------------+---------------------------------------
