lucenenet-user mailing list archives

From Shashi Kant <sk...@sloan.mit.edu>
Subject Re: Lucene search relevancy: Optimization
Date Fri, 07 May 2010 06:32:38 GMT
On Thu, May 6, 2010 at 1:30 PM, Nitin Shiralkar <nitins@coreobjects.com> wrote:
> Hi,
>
> We are using Lucene in a knowledge management system for the legal domain. We index a lot of
> legal documents into Lucene as free text and also index some metadata (extracted
> from each document) into separate fields. Now we want to optimize relevancy for the following cases:
>
>
> 1. Searching with a free-text query:
> We want to provide a Google-like simple search interface accepting a free-text query.
> However, in order to achieve better relevancy, we want to map that query against metadata fields.
> For example, if the user searches for the "California Merger Agreement for telecom" document, then
> we want to internally search for "California" against the State metadata field, "Merger agreement"
> against the document type field, and the complete text against the full-text index. What would be the best
> way to do that?
>

Look at MultiFieldQueryParser, which parses one query string against
several fields. If you have a separate text box for the state field,
you can instead use a BooleanQuery to combine the free-text query with
a clause on the state field.
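
As a rough sketch of the BooleanQuery approach, assuming the Lucene.Net
2.9-style API (newer releases rename some of these members) and assuming
hypothetical field names "state", "doctype" and "text" that you would
replace with your own schema:

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Version = Lucene.Net.Util.Version;

public static class QuerySketch
{
    // Field names here are placeholders, not taken from the original post.
    public static BooleanQuery Build(string freeText, string state)
    {
        var analyzer = new StandardAnalyzer(Version.LUCENE_29);

        // Parse the free-text box against several fields at once.
        // Parse() throws ParseException on malformed query syntax.
        var parser = new MultiFieldQueryParser(
            Version.LUCENE_29, new[] { "doctype", "text" }, analyzer);
        Query textQuery = parser.Parse(freeText);

        // Match the state box as a single term. This assumes the state
        // field was indexed as one lowercased token.
        Query stateQuery = new TermQuery(
            new Term("state", state.ToLowerInvariant()));

        // Require both parts to match.
        var combined = new BooleanQuery();
        combined.Add(textQuery, BooleanClause.Occur.MUST);
        combined.Add(stateQuery, BooleanClause.Occur.MUST);
        return combined;
    }
}
```

Changing the second clause to BooleanClause.Occur.SHOULD would make a
state match raise the score without being mandatory.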

>
> 2. Returning high-rated documents on top:
>
> We have some high-rated documents in the system, and we store this high-value field
> in the index. For any type of search, we want high-value documents to appear on top if they
> satisfy the search criteria. One approach we are considering is to sort on the high-value field
> to get those on top. Is there any other way, such as boosting?
>
>

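While indexing, you can call SetBoost on a document to make it rank
higher. The boost is a multiplier on the relevance score, so matching
high-rated documents are pushed toward the top.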

Document doc = new Document();
doc.Add(new Field(blah...));
doc.Add(new Field(blah...));

// foo is a flag that says whether your document is "high rated"
if (foo)
    doc.SetBoost(1000f);

writer.AddDocument(doc);
