lucene-java-user mailing list archives

From michal samek <samek.mic...@gmail.com>
Subject Re: Payload Matching Query
Date Fri, 21 Jun 2013 13:58:50 GMT
Eventually, I chose yet another solution. I treat those "payloads" like
synonyms. In my TokenFilter, for every occurrence of a token with a
"payload", I inject a new term containing this "payload", with its
PositionIncrementAttribute set to zero. It solves nearly all my issues =)
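
A rough sketch of why this works, in plain Java with no Lucene dependency. It is a toy positional index, not Lucene's API: the class name `TagInjectionSketch`, its methods, and the `word[tag]` input syntax (borrowed from the example earlier in the thread) are all assumptions made for illustration. The point is only that a tag stored at the same position as its word becomes an ordinary term, so it can be looked up directly and used in proximity checks. In Lucene itself the equivalent step would be a TokenFilter emitting the extra token with a position increment of 0.

```java
import java.util.*;

/** Toy positional index (term -> docId -> positions). Illustration only, not Lucene. */
public class TagInjectionSketch {
    private final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    /** Indexes tokens such as "Mozart[artist]"; the tag shares the word's position. */
    public void addDocument(int docId, String text) {
        int position = -1;
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) continue;
            position += 1;                        // ordinary token: increment of 1
            String word = raw;
            int open = raw.indexOf('[');
            if (open >= 0 && raw.endsWith("]")) {
                word = raw.substring(0, open);
                add(raw.substring(open), docId, position); // injected tag: increment of 0
            }
            add(word.toLowerCase(Locale.ROOT), docId, position);
        }
    }

    private void add(String term, int docId, int position) {
        postings.computeIfAbsent(term, t -> new TreeMap<>())
                .computeIfAbsent(docId, d -> new ArrayList<>())
                .add(position);
    }

    /** Direct lookup: the tag drives the search like any other term. */
    public Set<Integer> docsWith(String term) {
        return postings.getOrDefault(term, Collections.emptyMap()).keySet();
    }

    /** Documents where terms a and b occur at most maxDistance positions apart. */
    public Set<Integer> docsWithNear(String a, String b, int maxDistance) {
        Set<Integer> hits = new TreeSet<>();
        Map<Integer, List<Integer>> pa = postings.getOrDefault(a, Collections.emptyMap());
        Map<Integer, List<Integer>> pb = postings.getOrDefault(b, Collections.emptyMap());
        for (Map.Entry<Integer, List<Integer>> e : pa.entrySet()) {
            List<Integer> other = pb.get(e.getKey());
            if (other == null) continue;
            for (int x : e.getValue()) {
                for (int y : other) {
                    if (Math.abs(x - y) <= maxDistance) hits.add(e.getKey());
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        TagInjectionSketch index = new TagInjectionSketch();
        index.addDocument(1, "W. A. Mozart[artist] was born in Salzburg[city]");
        index.addDocument(2, "Salzburg[city] hosts a festival every summer");
        System.out.println(index.docsWith("[artist]"));
        System.out.println(index.docsWithNear("[artist]", "[city]", 5));
    }
}
```

Because the tag and the word share a position, a proximity check between `[artist]` and `mozart` with distance 0 also matches, which is the synonym-like behaviour described above.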

Thanks everyone for assistance!

*Michal*


2013/6/20 Shai Erera <serera@gmail.com>

> There are several ways to implement it:
>
> A Query, as you mentioned. You'd need to implement a Scorer which traverses
> the posting list where the payload exists. The methods you should implement
> are nextDoc() and advance(). You'll also need to traverse
> DocsAndPositionsEnum.
>
> A Filter. That's somewhat easier than a Query, I think. In principle it
> works the same way as the Scorer. One benefit is that you can cache filters.
> So if it's common to search for a certain artist, you can cache the
> documents this artist belongs to.
>
> A third option is to implement a Collector which filters documents before
> they are sent to whatever other collector aggregates the documents, e.g.
> TopScoreDocCollector. Again, you'll need to traverse the posting list of
> the term that holds the payload, only this time you'll need to implement
> collect() and setNextReader().
>
> Each has pros and cons. They all share the same con though: the payload
> doesn't help to drive the query. I.e., unlike what inverted indexes are
> meant for, this approach cannot quickly tell which documents have
> artist:foo; instead you need to traverse all documents until you find one.
>
> I would perhaps give another thought to what was proposed before: adding
> artist and city fields to each document. Perhaps if you tell us more about
> what you're trying to achieve in the end, we can help you structure your
> index better.
>
> Shai
> On Jun 20, 2013 8:12 PM, "michal samek" <samek.michal@gmail.com> wrote:
>
> > Well, with this solution you won't be able to search for near occurrences
> > of payloads, as with SpanNearQuery :-/ I just need to store some
> > searchable data with terms, not with documents.
> >
> > But why not implement a totally new Query? I'm very new to Lucene, so I've
> > got no idea what it involves, how indices are structured, how searching
> > is implemented... Would it be possible?
> >
> > *Michal*
> >
> >
> > 2013/6/20 Brendan Grainger <brendan.grainger@gmail.com>
> >
> > > Any reason not to have separate artist and city fields? So you would
> > > search for:
> > >
> > > artist:(W. A. Mozart) city:Salzburg
> > >
> > > HTH
> > > Brendan
> > >
> > >
> > > On Thu, Jun 20, 2013 at 12:27 PM, michal samek <samek.michal@gmail.com> wrote:
> > >
> > > > Hi Adrien,
> > > >
> > > > thanks for your reply. If payloads cannot be used for searching, is there
> > > > any workaround to achieve similar functionality?
> > > >
> > > > What I'd like to accomplish is to be able to search documents with contents
> > > > such as
> > > > "W. A. Mozart[artist] was born in Salzburg[city]"
> > > > just by specifying the *payload*s [artist] [city].
> > > >
> > > > Thanks
> > > >
> > > > *Michal*
> > > >
> > > >
> > > > 2013/6/20 Adrien Grand <jpountz@gmail.com>
> > > >
> > > > > Hi Michal,
> > > > >
> > > > > Although payloads can be used at query time to customize scoring,
> > > > > they can't be used for searching. Lucene only allows searching on
> > > > > terms.
> > > > >
> > > > > --
> > > > > Adrien
> > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Brendan Grainger
> > > www.kuripai.com
> > >
> >
>
