lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stephane Vaucher <vauch...@cirano.qc.ca>
Subject How to store document meta information
Date Mon, 28 Apr 2003 18:22:34 GMT
Hello everyone,

I've got a document that I run through an information extraction engine 
that returns a list of concepts associated to a document with an 
appropriate relevancy factor  (for example, with a news article, it might 
return sport=100%, litterature=84% and politics=10%). 

I would like to index these concepts with an indication of their relevancy 
levels. Is there a recommended way of doing this? Searching the FAQs, I 
found none, but from my knowledge of lucene, I gather I could do it the 
following ways:

1) If all concepts were to be stored in a single field (as I would 
prefer), I don't think I can use field boosting, so I would have to 
probably hold multiple instances of my concept (e.g. I could have 100 
"sport", 84 "litterature" and 10 "politics") in my field.

2) I could use multiple fields with varying boost factors. But I would be 
forced to determine ahead of time how many concepts I'll have to perform 
searches on all of the appropriate fields. This could probably affect the 
performance of the app (I say this with no numbers, simple intuition, so 
correct me if I'm wrong).

Any ideas, pointers or links are appreciated,
sv



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message