lucene-java-user mailing list archives

From "Zhao, Xin" <xzh...@jhmi.edu>
Subject controlled library
Date Thu, 24 Aug 2006 15:15:25 GMT
Hi,
I have a design question. Here is what we are trying to do for indexing:
We designed an indexing tool that generates standard MeSH terms from medical citations, and we
then use Lucene to store the terms and citations for later searching. The information we need
to save is:
a) the exact MeSH terms (top 10)
b) the score for each term
so the code looks like this:
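------------------------------------
for each of the top 10 MeSH terms
 // one untokenized "mesh" field per term, lowercased
 myField=Field.Keyword("mesh", mesh.toLowerCase());
 // use the term's score as the field boost
 myField.setBoost(score);
 doc.add(myField);
end for
------------------------------------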
As you can see, we add all the terms under the same field name, "mesh". If I understand
correctly, all the fields with the same name are eventually stored as one field, with all the
scores normalized into a single field boost. In that case we would not be able to keep a
separate score for each term, so that information is lost. Am I correct? Is there any way we
could change this? I understand Lucene is meant for keyword search, and what we are trying to
do is controlled vocabulary search. Is there any other tool we could use?
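
To illustrate the concern, here is a minimal sketch in plain Java (no Lucene calls). It assumes, as I understand the documentation, that the boosts of same-named fields in one document are multiplied together into a single value; the real norm is also encoded lossily, which this ignores. The class name, the helper `combinedBoost`, and the terms and scores are all made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BoostCollapseSketch {

    // Hypothetical helper: models the assumed behavior that per-instance
    // boosts of one field name collapse into a single product per document.
    static float combinedBoost(Map<String, Float> termScores) {
        float product = 1.0f;
        for (float score : termScores.values()) {
            product *= score;
        }
        return product;
    }

    public static void main(String[] args) {
        // Made-up scores: citation A is mostly about "neoplasms".
        Map<String, Float> citationA = new LinkedHashMap<>();
        citationA.put("neoplasms", 4.0f);
        citationA.put("humans", 1.0f);

        // Citation B has the same terms with the scores swapped.
        Map<String, Float> citationB = new LinkedHashMap<>();
        citationB.put("neoplasms", 1.0f);
        citationB.put("humans", 4.0f);

        // Both citations end up with the same combined value, so the
        // per-term scores cannot be recovered from it.
        System.out.println(combinedBoost(citationA));
        System.out.println(combinedBoost(citationB));
    }
}
```

Under that assumption the two citations are indistinguishable by boost, even though their per-term scores differ.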

Thank you,
Xin
  
