lucene-java-user mailing list archives

From Vadim Gindin <vgin...@detectum.com>
Subject Re: Tracking that all query terms are matched in one document
Date Wed, 13 Dec 2017 08:55:04 GMT
Hi Michael,

I tried to implement this but ran into the following problem. To recap,
my query is a BooleanQuery that combines several ConstantScoreQuery
clauses. I wrote a custom Collector as follows:

@Override
public void setScorer(Scorer scorer) throws IOException {
    this.scorer = scorer;

}

@Override
public void collect(int doc) throws IOException {
    System.out.println("doc=" + doc);
    diveIntoScorers(this.scorer);
}

When I descend recursively into the child scorers, I get an
UnsupportedOperationException. It happens because of the following
code in BooleanScorer:

@Override
public int score(LeafCollector collector, Bits acceptDocs, int min,
    int max) throws IOException {
  fakeScorer.doc = -1;
  collector.setScorer(fakeScorer);

Later, fakeScorer throws the exception.
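
To illustrate the failure mode, here is a minimal, self-contained sketch. It does not use Lucene at all: Node, TreeNode, FakeNode and walk are hypothetical stand-ins (walk plays the role of diveIntoScorers, whose body is not shown above). It assumes that the placeholder scorer rejects the children lookup with UnsupportedOperationException, which is how the problem appears from the collector's side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Simplified stand-in model; none of these types are Lucene classes.
class ScorerWalkSketch {

    /** Hypothetical stand-in for a scorer node that may expose children. */
    interface Node {
        String name();
        Collection<Node> children();
    }

    /** A node that reports its children normally. */
    static final class TreeNode implements Node {
        private final String name;
        private final List<Node> children;

        TreeNode(String name, Node... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }

        public String name() { return name; }
        public Collection<Node> children() { return children; }
    }

    /** Mimics a placeholder scorer: it exists, but exposes no scorer tree. */
    static final class FakeNode implements Node {
        public String name() { return "fake"; }
        public Collection<Node> children() {
            throw new UnsupportedOperationException();
        }
    }

    /** Recursively records node names; stops at nodes that cannot be traversed. */
    static void walk(Node node, List<String> visited) {
        visited.add(node.name());
        Collection<Node> children;
        try {
            children = node.children();
        } catch (UnsupportedOperationException e) {
            // Assumption: the placeholder signals "no tree available" this way.
            visited.add("<untraversable>");
            return;
        }
        for (Node child : children) {
            walk(child, visited);
        }
    }

    public static void main(String[] args) {
        Node tree = new TreeNode("boolean",
                new TreeNode("param_vendor"), new TreeNode("param_model"));
        List<String> ok = new ArrayList<>();
        walk(tree, ok);
        System.out.println(ok);

        List<String> stopped = new ArrayList<>();
        walk(new FakeNode(), stopped);
        System.out.println(stopped);
    }
}
```

Catching the exception only keeps the collector from crashing. It does not recover which clauses matched, because the placeholder carries no child scorers to inspect.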

How did you implement similar functionality?
How can I avoid this?

Thanks,
Vadim Gindin

On Fri, Dec 8, 2017 at 2:01 PM, Vadim Gindin <vgindin@detectum.com> wrote:

> Thanks for your help. I'll try that.
>
> On Tue, Dec 5, 2017 at 4:18 PM, Mikhail Khludnev <mkhl@apache.org> wrote:
>
>> Vadim,
>> You can create a collector that checks Scorer.getChildren()
>> https://issues.apache.org/jira/browse/LUCENE-7628 but it's quite
>> cumbersome.
>> I'd suggest avoiding this if possible. However, Elastic does
>> something like this with named queries or similar.
>> I talked about this a few years ago:
>> https://www.youtube.com/watch?v=sGVyUdNGBgw
>>
>> On Tue, Dec 5, 2017 at 12:36 PM, Vadim Gindin <vgindin@detectum.com>
>> wrote:
>>
>> > I'm not sure that this would let me track that different terms
>> > were matched in the same document...
>> >
>> > I'm thinking of a slightly different approach: when a query scores a
>> > document, save the query term for that document somewhere. It could be
>> > a map in some SearchContext class. I could write something like this:
>> >
>> > // Does such a search context exist in Lucene? Maybe QueryContext?
>> > SearchContext sc = getSearchContext();
>> > // docTerms is a Map<Integer, List<String>>: the key is a document ID and
>> > // the value is the list of terms that matched this document.
>> > sc.getDocTerms().get(docID).add(query.getTerm());
>> >
>> > I need to store the document ID and the term that matched that
>> > document somewhere. Could somebody suggest an appropriate place?
>> >
>> > Regards,
>> > Vadim Gindin
>> >
>> >
>> > On Tue, Dec 5, 2017 at 12:04 PM, Vadim Gindin <vgindin@detectum.com>
>> > wrote:
>> >
>> > > For example like this:
>> > >
>> > > BooleanQuery.Builder expected = new BooleanQuery.Builder();
>> > >
>> > > Query param_vendor = new BoostQuery(new ConstantScoreQuery(
>> > >         new TermQuery(new Term("param_vendor", queryStr))), 5f);
>> > > Query param_model = new BoostQuery(new ConstantScoreQuery(
>> > >         new TermQuery(new Term("param_model", queryStr))), 5f);
>> > > Query param_value = new BoostQuery(new ConstantScoreQuery(
>> > >         new TermQuery(new Term("param_value", queryStr))), 3f);
>> > > Query param_name = new BoostQuery(new ConstantScoreQuery(
>> > >         new TermQuery(new Term("param_name", queryStr))), 4f);
>> > >
>> > > BooleanQuery bq = expected
>> > >         .add(param_vendor, BooleanClause.Occur.SHOULD)
>> > >         .add(param_model, BooleanClause.Occur.SHOULD)
>> > >         .add(param_value, BooleanClause.Occur.SHOULD)
>> > >         .add(param_name, BooleanClause.Occur.SHOULD)
>> > >         .setMinimumNumberShouldMatch(1)
>> > >         .build();
>> > >
>> > > return new BoostQuery(bq, queryBoost);
>> > >
>> > >
>> > > Vadim
>> > >
>> > > On Tue, Dec 5, 2017 at 9:24 AM, Michael Sokolov <msokolov@gmail.com>
>> > > wrote:
>> > >
>> > >> Well how did you make the original query?
>> > >>
>> > >> On Dec 4, 2017 12:05 PM, "Vadim Gindin" <vgindin@detectum.com>
>> wrote:
>> > >>
>> > >> > Yes, thanks. My question is exactly about how to create "another
>> > >> > extra query that requires all the terms in the original query"
>> > >> >
>> > >> > On Mon, Dec 4, 2017 at 6:50 PM, Michael Sokolov <
>> msokolov@gmail.com>
>> > >> > wrote:
>> > >> >
>> > >> > > I'm just saying that when you form your query, you could also
>> > >> > > create another, extra query that requires all the terms in the
>> > >> > > original query, and then combine it with the original query in a
>> > >> > > boolean query where the original query is required and the extra
>> > >> > > query is optional. That will give a boost when all the terms are
>> > >> > > found, although I think the scores will be added, not multiplied.
>> > >> > >
>> > >> > > On Dec 4, 2017 5:22 AM, "Vadim Gindin" <vgindin@detectum.com>
>> > wrote:
>> > >> > >
>> > >> > > > Thanks, Michael!
>> > >> > > >
>> > >> > > > Yes, I'm sure. Could you explain your proposal in more detail?
>> > >> > > >
>> > >> > > > Regards,
>> > >> > > > Vadim Gindin
>> > >> > > >
>> > >> > > > On Mon, Dec 4, 2017 at 3:18 PM, Michael Sokolov <
>> > msokolov@gmail.com
>> > >> >
>> > >> > > > wrote:
>> > >> > > >
>> > >> > > > > You could combine a Boolean AND query with the same terms, as
>> > >> > > > > an optional clause. Are you sure about the requirement to
>> > >> > > > > multiply the score in that case?
>> > >> > > > >
>> > >> > > > > On Dec 4, 2017 5:13 AM, "Vadim Gindin" <vgindin@detectum.com
>> >
>> > >> wrote:
>> > >> > > > >
>> > >> > > > > > Hi all.
>> > >> > > > > >
>> > >> > > > > > I need to track that all query terms are matched in one
>> > >> > > > > > document. When all terms are matched, I need to multiply the
>> > >> > > > > > score of that document by some constant coefficient.
>> > >> > > > > >
>> > >> > > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> > >
>> > >
>> >
>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>>
>
>
