lucene-java-user mailing list archives

From ext <...@prodigy.teamon.com>
Subject Quickest way to build a Document - (Keyword, Freq)* map
Date Fri, 14 Feb 2003 14:10:44 GMT
Hi, 

I am using Lucene right now to index several semi-structured documents. I
recently had to implement a method 'getFrequencyVector()' that returns, for
a given document, a mapping of keyword -> frequency built from the
information already in the Lucene index.

The Lucene index currently stores the mapping keyword -> (document, freq)*.
The best solution I could come up with is to iterate over all the keywords
( :( ), match each posting against my own document identifier, and build the
vector from the matches. Any ideas/suggestions?

Is there a way to speed up the vector computation? It currently takes
O(|k|*|d|) time, where |k| is the total number of keywords indexed and |d|
is the average number of documents a keyword occurs in.
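
To make the cost concrete, the scan described above can be sketched roughly
as follows. This is only an illustration with plain Java collections, not
Lucene's API: the nested map standing in for the index, the class name
ScanFrequencyVector, and the sample data are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class ScanFrequencyVector {

    // Hypothetical stand-in for the inverted index:
    // keyword -> (document id -> frequency).
    static Map<String, Map<String, Integer>> sampleIndex() {
        Map<String, Map<String, Integer>> index = new HashMap<>();
        index.computeIfAbsent("lucene", k -> new HashMap<>()).put("doc1", 2);
        index.computeIfAbsent("lucene", k -> new HashMap<>()).put("doc2", 1);
        index.computeIfAbsent("index", k -> new HashMap<>()).put("doc1", 1);
        index.computeIfAbsent("vector", k -> new HashMap<>()).put("doc2", 3);
        return index;
    }

    // Visits every keyword and walks its whole posting list, keeping only
    // the postings that belong to docId. The work is proportional to the
    // total number of postings, i.e. roughly |k| * |d|, for each call.
    static Map<String, Integer> getFrequencyVector(
            Map<String, Map<String, Integer>> invertedIndex, String docId) {
        Map<String, Integer> vector = new TreeMap<>();
        for (Map.Entry<String, Map<String, Integer>> term : invertedIndex.entrySet()) {
            for (Map.Entry<String, Integer> posting : term.getValue().entrySet()) {
                if (posting.getKey().equals(docId)) {
                    vector.put(term.getKey(), posting.getValue());
                }
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        System.out.println(getFrequencyVector(sampleIndex(), "doc1"));
    }
}
```

The inner loop deliberately walks each posting list in sequence rather than
doing a hash lookup, to mirror sequential access to postings.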

Ideally, I would like to have a forward index for this application, mapping
each document to its (keyword, frequency) pairs. Thank you in advance for
your expertise and your time.
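
As a rough sketch of what such a forward index could look like, the inverted
mapping can be turned around in a single pass, after which each document's
vector is one lookup. Again this uses plain Java collections rather than
Lucene itself; the class name ForwardIndex, the nested-map representation,
and the sample data are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class ForwardIndex {

    // Hypothetical stand-in for the inverted index:
    // keyword -> (document id -> frequency).
    static Map<String, Map<String, Integer>> sampleIndex() {
        Map<String, Map<String, Integer>> index = new HashMap<>();
        index.computeIfAbsent("lucene", k -> new HashMap<>()).put("doc1", 2);
        index.computeIfAbsent("lucene", k -> new HashMap<>()).put("doc2", 1);
        index.computeIfAbsent("index", k -> new HashMap<>()).put("doc1", 1);
        index.computeIfAbsent("vector", k -> new HashMap<>()).put("doc2", 3);
        return index;
    }

    // Inverts keyword -> (document, freq)* into document -> (keyword, freq)*.
    // Every posting is visited exactly once, so the one-time cost matches a
    // single full scan; afterwards no per-document scan is needed.
    static Map<String, Map<String, Integer>> build(
            Map<String, Map<String, Integer>> invertedIndex) {
        Map<String, Map<String, Integer>> forward = new HashMap<>();
        for (Map.Entry<String, Map<String, Integer>> term : invertedIndex.entrySet()) {
            for (Map.Entry<String, Integer> posting : term.getValue().entrySet()) {
                forward.computeIfAbsent(posting.getKey(), d -> new TreeMap<>())
                       .put(term.getKey(), posting.getValue());
            }
        }
        return forward;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> forward = build(sampleIndex());
        // The frequency vector for a document is now a single map lookup.
        System.out.println(forward.get("doc1"));
    }
}
```

The trade-off is extra storage for the second mapping, and the forward index
has to be kept in step with the inverted one whenever documents are added or
removed.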

Cheers, 
Santosh Dawara 
Graduate Student 
Rochester Instt of Tech


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

