lucene-java-user mailing list archives

From ext <>
Subject Quickest way to build a Document - (Keyword, Freq)* map
Date Fri, 14 Feb 2003 14:10:44 GMT

I am using Lucene to index several semi-structured documents. I recently
had to implement a method, 'getFrequencyVector()', that returns a
keyword -> frequency mapping for a single document, using only the
information already in the Lucene index.

I currently maintain the Lucene index as a keyword -> (document, freq)*
mapping. The best solution I could come up with is to iterate over all the
keywords ( :( ), match each entry against my own document identifier, and
build the vector from the matches. Any ideas/suggestions?

Is there a way to speed up the vector computation? It currently takes time
proportional to |k|*|d|, where |k| is the total number of keywords indexed
and |d| is the average number of documents a keyword occurs in.
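
To make the cost concrete, here is a minimal sketch of the scan described
above. It does not use the Lucene API: the index is modelled as a plain
in-memory map from keyword to a list of (docId, freq) pairs, and the names
PostingsScan and frequencyVector are made up for illustration. It is also
written in modern Java for brevity.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for an inverted index, NOT the Lucene API:
// keyword -> list of {docId, freq} pairs.
final class PostingsScan {

    // Builds the keyword -> frequency vector for one document by visiting
    // every keyword and scanning its posting list for the document id.
    // Work done is roughly |k| * |d|: all keywords times the average
    // posting-list length.
    static Map<String, Integer> frequencyVector(
            Map<String, List<int[]>> postings, int docId) {
        Map<String, Integer> vector = new TreeMap<>();
        for (Map.Entry<String, List<int[]>> entry : postings.entrySet()) {
            for (int[] posting : entry.getValue()) {
                if (posting[0] == docId) {          // posting[0] is the docId
                    vector.put(entry.getKey(), posting[1]); // posting[1] is freq
                    break; // assumes at most one posting per (keyword, doc)
                }
            }
        }
        return vector;
    }
}
```

Every call repeats the full pass over all keywords, which is why this gets
expensive when vectors are needed for many documents.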

Ideally, I would like a forward index for this application, mapping each
document to its (keyword, frequency) pairs. Thank you in advance for your
expertise and your time.
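
As a rough illustration of the forward index meant here, the sketch below
inverts a keyword -> (docId, freq)* mapping into docId -> (keyword, freq)*
in a single pass. Again this is not the Lucene API: the in-memory maps and
the names ForwardIndex and build are assumptions made for the example, and
it uses modern Java.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, NOT the Lucene API: turns an inverted index
// (keyword -> list of {docId, freq}) into a forward index
// (docId -> keyword -> freq).
final class ForwardIndex {

    static Map<Integer, Map<String, Integer>> build(
            Map<String, List<int[]>> postings) {
        Map<Integer, Map<String, Integer>> forward = new HashMap<>();
        for (Map.Entry<String, List<int[]>> entry : postings.entrySet()) {
            String keyword = entry.getKey();
            for (int[] posting : entry.getValue()) {
                int docId = posting[0];
                int freq = posting[1];
                // Create the per-document vector on first use, then record
                // this keyword's frequency in it.
                forward.computeIfAbsent(docId, id -> new TreeMap<>())
                       .put(keyword, freq);
            }
        }
        return forward;
    }
}
```

The one pass still touches every posting once, but after that each
document's vector is a single map lookup, at the price of storing the
postings a second time.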

Santosh Dawara 
Graduate Student 
Rochester Instt of Tech
