lucene-general mailing list archives

From Doron Cohen <cdor...@gmail.com>
Subject Re: Index time boosting seem to have no effect
Date Tue, 01 Feb 2011 12:44:53 GMT
Hi, could you post the Explanation of both results here, e.g. using
http://lucene.apache.org/java/3_0_2/api/core/org/apache/lucene/search/Searcher.html#explain(org.apache.lucene.search.Query,%20int) ?
That should help show what was done with the boosts, for a start.
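
Something along these lines should be enough to produce that output. It is only a minimal sketch against the Lucene 3.0.x API, not your actual code: the class name ExplainHits, the helpers explainTopHits and demo, the field name "ticker" and the two sample documents are all made up for illustration, and it needs the Lucene 3.0.x core jar on the classpath. In Lucene 3.x the index-time boost is folded into the field norm, so the fieldNorm line of each explanation is probably the one to compare.

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Explanation;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public final class ExplainHits {

    /** Returns the score explanation for each of the top n hits of the query. */
    static String explainTopHits(Searcher searcher, Query query, int n) throws IOException {
        StringBuilder out = new StringBuilder();
        TopDocs hits = searcher.search(query, n);
        for (ScoreDoc hit : hits.scoreDocs) {
            Explanation explanation = searcher.explain(query, hit.doc);
            out.append("doc=").append(hit.doc)
               .append(" score=").append(hit.score).append('\n');
            out.append(explanation.toString()).append('\n');
        }
        return out.toString();
    }

    /** Builds a throwaway two-document index (hypothetical data) and explains a query on it. */
    static String demo() throws IOException {
        Directory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30),
                IndexWriter.MaxFieldLength.UNLIMITED);
        writer.addDocument(doc("STL", 1.5f));
        writer.addDocument(doc("STL", 0.5f));
        writer.close();

        IndexSearcher searcher = new IndexSearcher(dir, true); // true = read-only
        try {
            // StandardAnalyzer lowercases tokens, so the indexed term is "stl".
            return explainTopHits(searcher, new TermQuery(new Term("ticker", "stl")), 10);
        } finally {
            searcher.close();
        }
    }

    /** Hypothetical sample document: one analyzed "ticker" field plus a document boost. */
    private static Document doc(String ticker, float boost) {
        Document d = new Document();
        d.add(new Field("ticker", ticker, Field.Store.YES, Field.Index.ANALYZED));
        d.setBoost(boost);
        return d;
    }

    public static void main(String[] args) throws IOException {
        System.out.print(demo());
    }
}
```

With your own index you would only need explainTopHits: pass it your IndexSearcher and the Query built from the user's "STL" input, then paste the output for the Statoil and Sterling hits here.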

On Sat, Jan 29, 2011 at 2:55 PM, brsseb <brsseb@me.com> wrote:

>
> I'm using Lucene to search through a collection of financial instruments
> (stocks, basically). Users should be able to search for a financial
> instrument based on its name. The financial instruments belong to
> different stock exchanges, and we would like to prioritize them so that
> some exchanges have a higher weight in the search results than others.
> So I thought this would be a perfect situation to use document boosting
> at index time.
>
> But using doc.setBoost() with, for example, 1.5f for important documents
> and 0.5f for less important ones doesn't seem to have any effect. I've
> inspected the index in the Luke utility and experimented with different
> queries, but I still get the wrong ranking in the query results.
>
> For example, I index two financial instruments, Statoil on the Oslo Stock
> Exchange and Sterling on the New York Stock Exchange, both of which have
> a ticker code field "STL". When the user types "STL" into the user
> interface, we would like Statoil to be weighted higher than Sterling
> (since the Oslo Stock Exchange is more important to us than NYSE at the
> moment). But in the result I get, Sterling is ranked higher for some
> reason. Even worse, further inspection in Luke reveals that they have
> basically the same score... it seems that boosting has no effect at all.
>
> Here are (what I believe to be) the important parts of my indexing code:
>
> ...
> IndexWriter indexWriter = new IndexWriter(index,
>     new StandardAnalyzer(Version.LUCENE_30),
>     new IndexWriter.MaxFieldLength(MAX_FIELD_LENGTH));
> ...
> doc.add(new Field(attribute, value.toString(), Field.Store.NO,
>     Field.Index.ANALYZED_NO_NORMS));
> ...
> float boost = 1.0f;
> if (item.getMic().equals("XOSL")) boost = 1.5F;  // OSLO
> if (item.getMic().equals("XNYS")) boost = 0.5F;  // NYSE
> doc.setBoost(boost);
> ...
>
> I've written a unit test to verify the ordering, and it currently fails
> because the boost has no effect:
>
> java.lang.AssertionError:
> Wrong ordering detected. Expected ordering was:
>
>        =>NO0010096985.NOK.XOSL Statoil
>        =>US8591581074.USD.XNYS Sterling Bancorp (NY)
>
> Actual ordering was:
>
>        =>US8591581074.USD.XNYS Sterling Bancorp (NY)
>        =>NO0010096985.NOK.XOSL Statoil
>
>
> I'm using Lucene 3.0.2.
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Index-time-boosting-seem-to-have-no-effect-tp2369767p2369767.html
> Sent from the Lucene - General mailing list archive at Nabble.com.
>
