mahout-user mailing list archives

From Julian Limon <julian.li...@tukipa.com>
Subject Re: LDA from Lucene Indexes
Date Wed, 04 May 2011 15:53:51 GMT
This sounds really interesting. Is there a way to dump certain fields from a
Lucene index to text files?

If so, I could use Lucene to do the parsing, and then seqdirectory and
seq2sparse to generate Mahout vectors out of these files.
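
To make the idea concrete, here is a rough sketch of the dump step only, not a tested tool. It assumes the stored field values have already been read out of the index into a map of document id to text (the Lucene read itself is left out), and it writes one UTF-8 text file per document. The class name StoredFieldDumper and the method dump are hypothetical names for illustration, not part of Lucene or Mahout.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical helper: not part of Lucene or Mahout.
final class StoredFieldDumper {

    /**
     * Writes one UTF-8 text file per document into outDir and returns
     * the number of files written.
     *
     * Assumption: fieldTextById maps a document id to the value of one
     * stored field, already read from the index by the caller.
     */
    static int dump(Map<String, String> fieldTextById, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        int written = 0;
        for (Map.Entry<String, String> entry : fieldTextById.entrySet()) {
            String text = entry.getValue();
            // Skip documents where the field was not stored or is empty.
            if (text == null || text.isEmpty()) {
                continue;
            }
            // Replace characters that are unsafe in file names. Ids are
            // assumed to stay unique after this step; otherwise a later
            // document overwrites an earlier one.
            String fileName = entry.getKey().replaceAll("[^A-Za-z0-9._-]", "_") + ".txt";
            Files.write(outDir.resolve(fileName), text.getBytes(StandardCharsets.UTF_8));
            written++;
        }
        return written;
    }
}
```

The output directory would then be the input to seqdirectory, with seq2sparse run on the resulting sequence files.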

Thanks,

Julian

2011/5/3 Jake Mannix <jake.mannix@gmail.com>

> On Tue, May 3, 2011 at 6:17 PM, Grant Ingersoll <gsingers@apache.org> wrote:
>
> >
> > > Although technically, we could add the capability to take a Store.YES
> > > field and re-tokenize and build vectors from this as well.
> >
> > True, or we could just dump stored fields out to text and use the
> > existing text converter
>
>
> That would probably be the right way to do that, actually.
>
>  -jake
>
