mahout-user mailing list archives

From Jake Mannix <jake.man...@gmail.com>
Subject Re: LDA from Lucene Indexes
Date Mon, 02 May 2011 22:36:59 GMT
Were your Lucene indexes created with term vectors enabled?

On May 2, 2011 3:05 PM, "Chris McConnell" <c.t.mcconnell.ge@gmail.com>
wrote:

Hello all,

We are looking at using LDA for topic trending on some pre-built
Lucene indexes. I've put the command and its output below. From
searching, it seems many people are unable to get this to work
properly. Most answers tell the user to review the "build-reuters.sh"
example, but that doesn't use a Lucene index as its input.

The dictionary is created (on local disk) and an attempt at vector
creation is made on HDFS; however, no vectors are written out. I'm
interested to know if anyone has actually gotten this to work on
Mahout 0.4. Just for testing purposes, I then tried to run the actual
LDA on the created directories, but I wouldn't expect it to work since
no vectors were created.

Thanks,
Chris

bin/mahout lucene.vector --dir /home/index_for_mahout/ \
  --output /user/vectored_lucene_index \
  --dictOut /home/vectored_lucene_index/dict.out \
  --weight TF --field content
11/05/02 17:23:57 INFO lucene.Driver: Output File: /user/vectored_lucene_index
11/05/02 17:23:57 INFO util.NativeCodeLoader: Loaded the native-hadoop library
11/05/02 17:23:57 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
11/05/02 17:23:57 INFO compress.CodecPool: Got brand-new compressor
11/05/02 17:23:58 INFO lucene.Driver: Wrote: 0 vectors
11/05/02 17:23:58 INFO lucene.Driver: Dictionary Output file: /home/vectored_lucene_index/dict.out
11/05/02 17:23:58 INFO driver.MahoutDriver: Program took 578 ms
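
Since the run above finishes without an error even though it reports
"Wrote: 0 vectors", a small guard between the vector step and the LDA
step can catch the empty output early. The following is only a sketch:
the function names are hypothetical, and it assumes the log line keeps
the "Wrote: N vectors" wording seen in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read a log on stdin and print the N from the last
# "Wrote: N vectors" line. Prints nothing if no such line is present.
vectors_written() {
  sed -n 's/.*Wrote: \([0-9][0-9]*\) vectors.*/\1/p' | tail -n 1
}

# Hypothetical guard: succeed only if the log on stdin reports at least
# one written vector; otherwise print a message to stderr and fail.
require_vectors() {
  count=$(vectors_written)
  if [ -z "$count" ] || [ "$count" -eq 0 ]; then
    echo "no vectors written (count: ${count:-unknown})" >&2
    return 1
  fi
  echo "$count vectors written"
}

# Possible usage (assumes the command's output was saved to a file
# named lucene-vector.log, a name chosen here for illustration):
#   require_vectors < lucene-vector.log || exit 1
```

With the log shown above, the guard would fail, so the LDA step would
not be started on an empty vector directory.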
