lucene-java-user mailing list archives

From Vadim Gindin <vgin...@detectum.com>
Subject Re: Query in a doc context
Date Sun, 31 Dec 2017 12:06:11 GMT
Thanks Mikhail!

I'll look there.

Happy new year )

Regards
Vadim Gindin

On 31 Dec 2017 at 2:21, "Mikhail Khludnev" <mkhl@apache.org> wrote:

> Literally, it's done in Solr (excusez-moi) via
> q=field1:(foo bar baz)^=3 field2:(foo bar baz)^=4 field3:(foo bar baz)^=5
> but that's absolutely the wrong way to approach the problem. You can read about
> dismax and the white elephant problem in Relevant Search by Mr Turnbull.
>
> On Tue, Dec 26, 2017 at 10:01 PM, Vadim Gindin <vgindin@detectum.com>
> wrote:
>
> > Mike,
> >
> > I need the following. I want to create a query using the following
> > information: query string "blah blah blah" and constant scores map:
> >
> > "field1" -> 3.0
> > "field2" -> 4.0
> > "field3" -> 5.0
> >
> > // field1, field2, field3  - fields in the index.
> >
> > The created query should search "blah blah blah" in each specified field.
> > If the search string is found in field1 then the query score would be 3.0,
> > field2 -> 4.0 and so on. The final score would be the sum over the fields
> > where the search string is found.
> >
> > I've implemented that, plus some additional things such as extending the
> > explanation and composing summed scores.
> >
> > Regards,
> > Vadim Gindin
> >
> >
> > On Fri, Dec 15, 2017 at 10:33 PM, Mike Dinescu (DNQ) <mdinescu@donaq.com>
> > wrote:
> >
> > > Got it. I misunderstood the question (actually I'm still not convinced I
> > > fully understand what you're looking for). It might be good to give an
> > > example in case others on the mailing list are confused.
> > >
> > > *Mike*
> > >
> > >
> > >
> > > On Thu, Dec 14, 2017 at 8:54 AM, Vadim Gindin <vgindin@detectum.com>
> > > wrote:
> > >
> > > > Mike,
> > > >
> > > > I don't need a full doc match. I need a multi-field match, and later I
> > > > need to know which fields matched for a document, to be able to
> > > > calculate other multi-field-oriented metrics.
> > > >
> > > > Regards,
> > > > Vadim Gindin
> > > >
> > > > On Thu, Dec 14, 2017 at 8:46 PM, Mike Dinescu (DNQ) <mdinescu@donaq.com>
> > > > wrote:
> > > >
> > > > > > Apologies if I completely misunderstood, but if you are looking to
> > > > > > do a full doc match, you could duplicate the doc into another field
> > > > > > that is a true full-text index of the document.
> > > > >
> > > > > And search on that. Wouldn't that be exactly what you want?
> > > > >
> > > > > > On Thu, Dec 14, 2017 at 6:53 AM Vadim Gindin <vgindin@detectum.com>
> > > > > > wrote:
> > > > >
> > > > > > Thanks Mikhail
> > > > > >
> > > > > > Could you describe your sentences in more detail?
> > > > > >
> > > > > > Vadim
> > > > > >
> > > > > > > On Thu, Dec 14, 2017 at 7:08 PM, Mikhail Khludnev <mkhl@apache.org>
> > > > > > > wrote:
> > > > > >
> > > > > > > Hello, Vadim.
> > > > > > >
> > > > > > > Please find inline.
> > > > > > >
> > > > > > > > On Thu, Dec 14, 2017 at 11:43 AM, Vadim Gindin <vgindin@detectum.com>
> > > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi all.
> > > > > > > >
> > > > > > > > > As far as I understand, all Queries (or most of them?) are
> > > > > > > > > single-field oriented. They may implement different search/score
> > > > > > > > > logic, but they are intended for a single field, for example a
> > > > > > > > > simple TermQuery or PhraseQuery. If I need to search across
> > > > > > > > > different fields, I should use BooleanQuery to combine several
> > > > > > > > > single-field queries.
> > > > > > > >
> > > > > > > > Did I understand that right?
> > > > > > > >
> > > > > > >
> > > > > > > Absolutely
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > What is an appropriate way to implement a document-wise Query?
> > > > > > > > >
> > > > > > > > 1. DisjunctionScorer.getChildren(): painful doc-at-a-time handling.
> > > > > > > > 2. A quite promising idea is to amend the buffer in the
> > > > > > > > term-at-a-time BooleanScorer to track every doc-term hit.
> > > > > > > > 3. It can probably be done by copying all terms into a single field
> > > > > > > > and storing the original field in payloads, but it's reaalllly
> > > > > > > > slooooww.
> > > > > > >
> > > > > > > > > I need to be able to combine the field matches of one document
> > > > > > > > > and analyze them, in particular to check whether all query terms
> > > > > > > > > are matched (in one field or across different fields). I need to
> > > > > > > > > be able to fetch the corresponding information: which terms
> > > > > > > > > matched which fields, and so on.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > It seems that BooleanQuery/BooleanScorer is not a good place to
> > > > > > > > > accumulate information from child Queries/Scorers.
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Sincerely yours
> > > > > > > Mikhail Khludnev
> > > > > > >
> > > > > >
> > > > > --
> > > > > *Mike Dinescu*
> > > > > Donaq LLC, Founder
> > > > > +1 (312) 924 0600
> > > > > www.donaq.com
> > > > > http://linkedin.com/company/donaq-llc
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>
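
The scoring rule Vadim describes in the thread (a constant score per field, summed over the fields where the query is found, plus a record of which terms matched which fields) can be modelled with a small plain-Java sketch. This is not Lucene code and not Vadim's implementation: the class name FieldMatchSketch, its methods, and the whitespace tokenization are hypothetical, and it assumes a field "matches" when at least one query term occurs in it. In Lucene itself, a roughly comparable query could be built from SHOULD clauses that wrap each per-field query in a ConstantScoreQuery inside a BoostQuery, which is the effect of the Solr `^=` syntax quoted above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the per-field constant-score rule; not Lucene API.
class FieldMatchSketch {

    // For each field of the document, collect the query terms that occur in it.
    // Fields with no matching term are left out of the result.
    static Map<String, Set<String>> matchedTerms(Map<String, String> doc, List<String> queryTerms) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : doc.entrySet()) {
            // Assumption: naive lowercase + whitespace tokenization stands in for an analyzer.
            Set<String> tokens = new HashSet<>(
                    Arrays.asList(field.getValue().toLowerCase(Locale.ROOT).split("\\s+")));
            Set<String> hits = new LinkedHashSet<>();
            for (String term : queryTerms) {
                if (tokens.contains(term.toLowerCase(Locale.ROOT))) {
                    hits.add(term);
                }
            }
            if (!hits.isEmpty()) {
                result.put(field.getKey(), hits);
            }
        }
        return result;
    }

    // Sum the constant weight of every field that matched at least one term.
    static double score(Map<String, Set<String>> matches, Map<String, Double> weights) {
        double sum = 0.0;
        for (String field : matches.keySet()) {
            sum += weights.getOrDefault(field, 0.0);
        }
        return sum;
    }

    // True when every query term was found in at least one field.
    static boolean allTermsCovered(Map<String, Set<String>> matches, List<String> queryTerms) {
        Set<String> covered = new HashSet<>();
        for (Set<String> hits : matches.values()) {
            covered.addAll(hits);
        }
        return covered.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("field1", "foo qux");
        doc.put("field2", "bar baz");
        doc.put("field3", "nothing here");

        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("field1", 3.0);
        weights.put("field2", 4.0);
        weights.put("field3", 5.0);

        List<String> query = Arrays.asList("foo", "bar", "baz");
        Map<String, Set<String>> matches = matchedTerms(doc, query);

        System.out.println("matches = " + matches);
        System.out.println("score = " + score(matches, weights));
        System.out.println("all terms covered = " + allTermsCovered(matches, query));
    }
}
```

In this example field1 and field2 match, so the score is 3.0 + 4.0, and the three terms are covered only across the two fields together. Keeping the per-field hit sets separate from the score is what makes the later "which terms matched which fields" analysis possible without re-running the query.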
