lucene-java-user mailing list archives

From Sheng <sheng...@gmail.com>
Subject Re: ConjunctionScorer access
Date Thu, 22 Oct 2015 15:18:31 GMT
Péteri,

The problem is that whether A or B should be in "excluded_field" is a
posteriori rather than a priori knowledge.
I want the search A `and` B not to return the document as long as one of
the terms has score 0, but before the search happens, I don't know whether
any of them should be "excluded" at all.
This is, by the way, in direct conflict with the contract of BooleanQuery,
which returns the document as long as both A and B exist, no matter whether
either of them actually has score 0.

The problem really boils down to this: how BooleanQuery combines the scores
coming from its subqueries is out of the user's control. Unless I am
misreading the code, BooleanQuery always sums up the scores,
which is not appropriate in all scenarios.
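
The difference between the two combination rules can be sketched in plain
Java. This is only an illustration and does not use the Lucene API: the class
`ZeroAwareConjunction` and its methods `plainSum` and `combine` are
hypothetical names, and treating any sub-score of 0 or less as "excluded" is
an assumption about the intended behaviour.

```java
public class ZeroAwareConjunction {

    // Mimics the summing behaviour described above: every matching
    // clause contributes its score, even when that score is 0.
    public static float plainSum(float[] subScores) {
        float sum = 0f;
        for (float s : subScores) {
            sum += s;
        }
        return sum;
    }

    // Hypothetical alternative rule: if any clause scores 0 (or less),
    // the combined score is 0, so a collector that keeps only positive
    // scores would drop the document. Otherwise the scores are summed.
    public static float combine(float[] subScores) {
        float sum = 0f;
        for (float s : subScores) {
            if (s <= 0f) {
                return 0f;
            }
            sum += s;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Term A scored 0 (e.g. because of its payload), term B scored 2.5.
        float[] doc = {0.0f, 2.5f};
        System.out.println("plain sum:  " + plainSum(doc));
        System.out.println("zero-aware: " + combine(doc));
    }
}
```

With the plain sum, the document keeps a positive score and survives a
positive-score-only filter; with the zero-aware rule it does not.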

From the perspective of the library, I think this is also a useful pattern to
support out of the box.

On Thu, Oct 22, 2015 at 10:50 AM, András Péteri <apeteri@b2international.com> wrote:

> Going by the example, it looks like you could do something like this:
>
> 1) Use the existing field for adding terms with payloads as before
> ("payload_field");
> 2) Introduce another field ("excluded_field"), adding only those terms
> where you expect a score of zero to be returned (based on the payload);
> 3) Add MUST clauses for term A and term B in "payload_field", and also
> include a MUST_NOT clause matching term A in "excluded_field" to the
> BooleanQuery.
>
> The idea is that documents that have both term A and term B in
> "payload_field" will not necessarily have term A in "excluded_field" --
> only the ones that you don't want to see in the result set.
>
> Regards,
> András
>
> On Thu, Oct 22, 2015 at 4:06 PM, Sheng <shengcer@gmail.com> wrote:
>
> > That's the problem right - none of them are public, and even neither is the
> > constructor of `ConjunctionScorer`. Moreover, `ConjunctionScorer` needs
> > access to list of sub-scorers to emit the doc and score. Information like
> > this has to come from the `BooleanWeight`, which is another hack if I want
> > to leverage this.
> >
> > On Thu, Oct 22, 2015 at 9:22 AM, Alan Woodward <alan@flax.co.uk> wrote:
> >
> > > You should be able to use a FilterScorer that wraps a ConjunctionScorer
> > > and overrides score().
> > >
> > > Alan Woodward
> > > www.flax.co.uk
> > >
> > >
> > > On 22 Oct 2015, at 13:43, Sheng wrote:
> > >
> > > > Thanks for the reply and suggestion. If I search for term A and term B with
> > > > a BooleanQuery in Lucene, normally Lucene returns documents that have a
> > > > match of both A and B. Now I am using payload to vary the scores w.r.t
> > > > search of term A and search of term B, so it is possible for example a
> > > > document has both match of term A and term B, but only the score for term A
> > > > is 0. In this case, I want Lucene does not return this document at all.
> > > > However the internal ConjunctionScorer will just sum up the scores returned
> > > > by both subquery of A and B, thus the document has a score > 0 returned by
> > > > the BooleanQuery, and therefore it cannot be filtered by a
> > > > PositiveScoreOnlyCollector. I know hacking into ConjunctionScorer probably
> > > > is too intrusive, but wondering if there is a better way to achieve the
> > > > same effect ?
> > > >
> > > > On Thu, Oct 22, 2015 at 4:13 AM, Uwe Schindler <uwe@thetaphi.de> wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> Those are internal classes and not to be extended (not only the
> > > >> constructor is pkg-private, the whole class is: https://goo.gl/5WyLYz )!
> > > >> Scorers follow the delegator pattern. If you want to modify the behaviour
> > > >> of a Scorer, create a delegator scorer (e.g. some Filtering Scorer) and
> > > >> change its behaviour (e.g. filter additional documents,...). This can be
> > > >> done by a query that filters other querys. E.g. look at ConstantScoreQuery
> > > >> or similar queries that wrap other scorers.
> > > >>
> > > >> Subclassing ConjunctionScorer would bring you nothing because internals
> > > >> are still private - and that's good.
> > > >>
> > > >> Uwe
> > > >>
> > > >> -----
> > > >> Uwe Schindler
> > > >> H.-H.-Meier-Allee 63, D-28213 Bremen
> > > >> http://www.thetaphi.de
> > > >> eMail: uwe@thetaphi.de
> > > >>
> > > >>
> > > >>> -----Original Message-----
> > > >>> From: Sheng [mailto:shengcer@gmail.com]
> > > >>> Sent: Wednesday, October 21, 2015 7:03 PM
> > > >>> To: java-user@lucene.apache.org
> > > >>> Subject: ConjunctionScorer access
> > > >>>
> > > >>> It's a bummer Lucene makes the constructor of ConjunctionScorer
> > > >>> non-public. I wanted to extend from this class in order to tweak its
> > > >>> behavior for my use case. Is it possible to change it to protected in
> > > >>> future releases ?
> > > >>
> > > >>
> > > >>
> > > >> ---------------------------------------------------------------------
> > > >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > >> For additional commands, e-mail: java-user-help@lucene.apache.org
> > > >>
> > > >>
> > >
> > >
> >
>
> --
> András Péteri
>
