lucene-java-user mailing list archives

From Jack Krupansky <jack.krupan...@gmail.com>
Subject Re: Query regarding Lucene
Date Thu, 10 Mar 2016 22:38:55 GMT
Are you calling the IndexSearcher#explain method to get the details of the
score calculation?

How exactly are your results not what you expect?

What Similarity are you using? Scores will be the product of the underlying
calculated scores and your term boost values.
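
As a rough illustration of that product, here is a minimal plain-Java sketch
(no Lucene calls). It assumes a simplified model in which a document's score is
the sum over query terms of boost times per-term score; the class name, method
names, and per-term scores are hypothetical and made up for illustration. Real
Similarity implementations apply further factors (for example queryNorm and
coord in the default TF-IDF similarity) that this sketch ignores.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: models score = sum(boost_i * termScore_i).
class BoostedScoreSketch {

    /** Scales raw weights so they sum to 1 (assumes the total is positive). */
    static double[] normalize(double[] raw) {
        double total = Arrays.stream(raw).sum();
        return Arrays.stream(raw).map(w -> w / total).toArray();
    }

    /** Simplified combined score: each per-term score multiplied by its boost, then summed. */
    static double combinedScore(double[] termScores, double[] boosts) {
        double score = 0.0;
        for (int i = 0; i < termScores.length; i++) {
            score += boosts[i] * termScores[i];
        }
        return score;
    }

    public static void main(String[] args) {
        // Made-up expansion-term weights, normalized so that they sum to one.
        double[] boosts = normalize(new double[] {2.0, 1.0, 1.0});

        // Made-up per-term scores for two documents.
        double[] docA = {1.0, 3.0, 0.0};
        double[] docB = {2.0, 0.0, 2.0};

        System.out.println("docA: " + combinedScore(docA, boosts));
        System.out.println("docB: " + combinedScore(docB, boosts));
    }
}
```

Under this simplified model, multiplying every boost by the same constant
multiplies every document's score by that constant, so small normalized
weights shrink the absolute scores without reordering the documents. If the
ranking itself looks wrong, the explain output is the place to see which
factor is responsible.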

-- Jack Krupansky

On Thu, Mar 10, 2016 at 12:54 AM, Dwaipayan Roy <dwaipayan.roy@gmail.com>
wrote:

> Hello everyone,
>
> I am Dwaipayan, a research scholar from Indian Statistical Institute,
> Kolkata working in the field of Information Retrieval.
> For my research purpose, I use Lucene (4.10.4).
>
> Recently, I have run into a question about how to boost query terms at
> search time in Lucene. Precisely, I am implementing a paper on
> query expansion (Relevance Based Language Model - Victor Lavrenko, Bruce
> Croft, SIGIR-2001). In the paper, the expanded query is formed with terms
> taken from the initially retrieved documents. The expansion terms are
> selected and weighted according to a probability. Thus, the weights of the
> expansion terms are probability values that are normalized to sum to
> one. This makes the term weights small fractional values; e.g. in most
> cases a weight is somewhere near 0.1 if 10 expansion terms are added, and
> the weights keep shrinking as more expansion terms are considered.
> When I use these fractional values as the expansion term weights in a
> Lucene BooleanQuery, I do not get the expected results. I think the
> problem is with the weight that is applied with setBoost() of the Lucene
> BooleanQuery. Following the paper exactly, I am setting these weights to
> those normalized probability values.
>
> Could anyone please help me with this problem?
>
> Thanks,
> Dwaipayan Roy.
> Research Scholar
> Indian Statistical Institute
> Kolkata, India
>
