lucene-java-user mailing list archives

From Leos Literak <liter...@seznam.cz>
Subject Re: boost keywords
Date Thu, 12 Aug 2004 14:43:21 GMT
Don Vaillancourt wrote:
> It seems like you know very little about Lucene.  Is this the case or do 
> you have a more specific problem that should be looked at.

Well, I don't consider myself a Lucene newbie. ;-)
I am just confused by the boosting feature and how to use
it.

Usually when I index an article, I set several Fields
to be indexed or stored (URL, type of object, etc.). Then
I extract all the text from the HTML and store it in an indexed
field called "content". Finally I add this Document to the
IndexWriter.
During the search phase I construct a new Query, and by default
I search the "content" field. The user may create a more
advanced query and limit the search to specific object types only
(articles, news, hardware ...).
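To make the search side concrete, here is a minimal sketch of how such a query string could be assembled in the standard Lucene query-parser syntax (`+` marks a required clause, `field:term` selects a field). The field name "type" and the class and method names are assumptions for illustration only, and the words are assumed to be already free of query-syntax special characters.

```java
import java.util.Arrays;
import java.util.List;

public class TypeFilteredQuery {
    // Builds one required group such as "+(content:lucene content:boost)".
    static String group(String field, List<String> values) {
        StringBuilder g = new StringBuilder("+(");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) g.append(' ');
            g.append(field).append(':').append(values.get(i));
        }
        return g.append(')').toString();
    }

    // Searches the words in "content"; if any types are given, at least one
    // of them must match the hypothetical "type" field as well.
    static String build(List<String> words, List<String> types) {
        String q = group("content", words);
        return types.isEmpty() ? q : q + " " + group("type", types);
    }

    public static void main(String[] args) {
        // prints: +(content:lucene content:boost) +(type:article type:news)
        System.out.println(build(Arrays.asList("lucene", "boost"),
                                 Arrays.asList("article", "news")));
    }
}
```

The resulting string would then be handed to the query parser together with the same analyzer that was used at indexing time.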

That's a primitive use case, I know, but it works well.
Now I'd like to make it more powerful (and precise):
for example, to boost the content of the <h1> tag or, as in
my previous post, to boost extra information entered
by the article author in the keywords section.
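Whatever the boosting mechanism turns out to be, the <h1> text first has to be separated from the rest of the page. A rough sketch of that extraction step, using plain regular expressions rather than a real HTML parser (an assumption that only holds for simple, well-formed markup; the class and method names are made up for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HeadingExtractor {
    private static final Pattern H1 = Pattern.compile(
            "<h1[^>]*>(.*?)</h1>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Removes tags and collapses runs of whitespace into single spaces.
    static String stripTags(String html) {
        return TAG.matcher(html).replaceAll(" ").replaceAll("\\s+", " ").trim();
    }

    // Returns the text of all <h1> elements, joined by single spaces.
    static String headings(String html) {
        StringBuilder out = new StringBuilder();
        Matcher m = H1.matcher(html);
        while (m.find()) {
            if (out.length() > 0) out.append(' ');
            out.append(stripTags(m.group(1)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<html><h1>Lucene <b>boost</b></h1><p>Body text</p></html>";
        System.out.println(headings(html));   // prints: Lucene boost
        System.out.println(stripTags(html));  // prints: Lucene boost Body text
    }
}
```

The two strings could then be treated differently at indexing time, while the full text still goes into "content" as before.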

But how can I do that? There is no support in Document.Field
for marking part of the text with a different boost factor, is there?
As far as I know, I can boost a whole Field only. What is the
trick for this?

(I was wondering whether a solution might be to create a new
indexed field with the boosted words and include it in the
search alongside "content". But the results were wild:
matches in the boosted field had very high scores, while
other matches had very low scores, and there was a big
gap between these two classes, e.g. 95%, 94%, 15%, 12%.
Was that the correct way?)
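For reference, the two-field search described above can be written with the query parser's term-boost syntax (`term^factor`), which makes the boost an explicit, tunable number at query time. This is only a sketch: the field name "keywords", the class name, and the factor 2.0 are assumptions, and whether a smaller factor narrows the score gap would have to be tried on real data.

```java
import java.util.Locale;

public class BoostedFieldQuery {
    // For every word, emits an unboosted "content" clause and a boosted
    // clause on the extra field, e.g. "content:lucene keywords:lucene^2.0".
    // Assumes a non-empty, whitespace-separated list of plain words.
    static String build(String words, String boostedField, float boost) {
        String factor = String.format(Locale.ROOT, "%.1f", boost);
        StringBuilder q = new StringBuilder();
        for (String w : words.trim().split("\\s+")) {
            if (q.length() > 0) q.append(' ');
            q.append("content:").append(w).append(' ')
             .append(boostedField).append(':').append(w)
             .append('^').append(factor);
        }
        return q.toString();
    }

    public static void main(String[] args) {
        // prints: content:lucene keywords:lucene^2.0 content:boost keywords:boost^2.0
        System.out.println(build("lucene boost", "keywords", 2.0f));
    }
}
```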

Can you please help me find the best approach? I don't want
to reinvent the wheel; I'd like to reuse the experience of more
experienced users :-)

Thanks

Leos

