lucene-dev mailing list archives

From Anton Leuski
Subject Adding information to an index
Date Sat, 08 Oct 2005 20:31:15 GMT

I'm looking to store some additional information in a Lucene index
and would like advice on how to implement it. Specifically, I'm
planning to store: 1) the collection frequency count for each term;
2) the actual document length for each document (yes, I looked at the
norm factor, and I'm still considering how to adapt it); 3) the
collection size (total number of terms) for each field; and 4) the
vocabulary size (number of unique terms) for each field. All of this
can be computed on the fly, but I would prefer to generate it at
indexing time and store it somewhere.

I think I've figured out how to handle 1): I found a post by Doug
Cutting about it that pointed me in the right direction. What should
I do about the rest of the info? I'd like the implementation to
automatically update the counts as documents are added to and deleted
from the index.
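
To make the four quantities concrete, here is a minimal sketch in
plain Java of the bookkeeping for a single field. It is only an
illustration of the counts and of how they would change on add and
delete. It does not use any Lucene API, and the class name FieldStats
and all of its methods are hypothetical. It also assumes each document
arrives as an already-tokenized list of terms, and it keeps those
terms in memory so that a delete can subtract them again.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Lucene): statistics for ONE field.
class FieldStats {
    // term -> occurrences across all live documents (collection frequency)
    private final Map<String, Long> collectionFreq = new HashMap<>();
    // docId -> that document's terms, kept so a delete can undo its counts
    private final Map<Integer, List<String>> docTerms = new HashMap<>();
    // total number of term occurrences in the field (collection size)
    private long collectionSize = 0;

    void addDocument(int docId, List<String> terms) {
        if (docTerms.containsKey(docId)) {
            throw new IllegalArgumentException("duplicate docId " + docId);
        }
        docTerms.put(docId, new ArrayList<>(terms));
        for (String term : terms) {
            collectionFreq.merge(term, 1L, Long::sum);
        }
        collectionSize += terms.size();
    }

    void deleteDocument(int docId) {
        List<String> terms = docTerms.remove(docId);
        if (terms == null) {
            return; // unknown document: nothing to subtract
        }
        for (String term : terms) {
            // Returning null removes the entry, so a term whose count
            // drops to zero no longer contributes to the vocabulary size.
            collectionFreq.computeIfPresent(term, (k, v) -> v > 1 ? v - 1 : null);
        }
        collectionSize -= terms.size();
    }

    // 1) collection frequency of a term
    long collectionFrequency(String term) {
        return collectionFreq.getOrDefault(term, 0L);
    }

    // 2) actual length of a document, in terms (0 if unknown or deleted)
    int documentLength(int docId) {
        List<String> terms = docTerms.get(docId);
        return terms == null ? 0 : terms.size();
    }

    // 3) collection size: total number of terms in the field
    long collectionSize() {
        return collectionSize;
    }

    // 4) vocabulary size: number of unique terms in the field
    int vocabularySize() {
        return collectionFreq.size();
    }
}
```

Keeping every document's terms in memory is only a convenience of the
sketch. What I'm really after is a way to get the same effect inside
the index itself, where the counts would be written at indexing time
and adjusted when documents are deleted.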

Thank you.

-- Anton
