lucene-java-user mailing list archives

From Soeren Pekrul <soeren.pek...@gmx.de>
Subject Re: Lucene & LSA
Date Thu, 14 Dec 2006 09:11:25 GMT
Hello Mario,

I had a similar problem a few weeks ago (thread "How to get Term Weights 
(document term matrix)?", 2006-11-02, 
http://www.gossamer-threads.com/lists/lucene/java-user/41726).

As far as I know, Lucene has no built-in function that creates a 
document-term matrix or gives direct access to one. I extracted the 
matrix from my index and stored it in a database.

To create the matrix I iterated over the terms and, for each term, over 
the documents that contain it:
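// "reader" is an open IndexReader instance (terms() and termDocs()
// are instance methods, not static ones)
TermEnum terms = reader.terms();
while (terms.next()) {
    Term term = terms.term();
    int df = reader.docFreq(term);      // document frequency of the term
    TermDocs docs = reader.termDocs(term);
    while (docs.next()) {
        int doc = docs.doc();           // document number
        int tf = docs.freq();           // term frequency in that document
        // store the term, the document and the weight (from tf and df)
    }
    docs.close();
}
terms.close();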
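
The "store" step inside the loop above could be sketched as follows if the matrix is kept in memory rather than in a database. This is only an illustration: TermDocMatrix and its methods are hypothetical names, not part of Lucene, the tf-idf formula is one common variant chosen as an assumption, and the code uses Java 8 collections.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not part of Lucene: a sparse term-document matrix. */
final class TermDocMatrix {
    // term -> (document number -> term frequency); absent entries count as zero
    private final Map<String, Map<Integer, Integer>> rows = new TreeMap<>();

    /** Records that the term occurs tf times in document doc. */
    void add(String term, int doc, int tf) {
        rows.computeIfAbsent(term, t -> new HashMap<>()).put(doc, tf);
    }

    /** Term frequency of the term in doc, or 0 if it does not occur there. */
    int tf(String term, int doc) {
        Map<Integer, Integer> row = rows.get(term);
        return row == null ? 0 : row.getOrDefault(doc, 0);
    }

    /** Document frequency: number of documents that contain the term. */
    int df(String term) {
        Map<Integer, Integer> row = rows.get(term);
        return row == null ? 0 : row.size();
    }

    /** Assumed weighting (one tf-idf variant): tf * ln(numDocs / df). */
    double weight(String term, int doc, int numDocs) {
        int df = df(term);
        if (df == 0) {
            return 0.0; // unknown term: no weight
        }
        return tf(term, doc) * Math.log((double) numDocs / df);
    }
}
```

Inside the inner loop this would be called as matrix.add(terms.term().text(), docs.doc(), docs.freq()). Note that text() ignores the field name, so terms with the same text in different fields would share a row; keying by terms.term().toString() instead is one way to keep them apart.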

Sören

mariolone wrote:
> Hi!!!!
> I have a problem:
> I need to create a term-document matrix in which every element
> represents the number of occurrences of that term in the document.
> How can I do this? 
> Can someone help me?
> Thanks to all....
> 
> P.S. I need to apply LSA to this matrix.

