lucene-dev mailing list archives

From "Uwe Schindler" <>
Subject RE: Payloads and TrieRangeQuery
Date Wed, 10 Jun 2009 18:28:45 GMT
Hi, sorry I missed the first mail.


The idea we discussed in Amsterdam during ApacheCon was:


Instead of indexing all trie precisions, e.g. from the leftmost 8 bits down
to all 64 bits, the TrieTokenStream creates terms only for e.g. precisions 8
to 56. The last (full) precision is left out. Instead, the last term
(precision 56) carries the full-precision value as payload.
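
A rough sketch of the indexing side, in plain Python rather than against the
Lucene API. The function name trie_terms, the tuple layout and the choice to
store the full value as the payload are assumptions made for illustration;
they are not the TrieTokenStream implementation.

```python
def trie_terms(value, precision_step=8, bits=64):
    """Return (shift, prefix, payload) tuples for one numeric value.

    Hypothetical helper: emits one term per precision from the leftmost
    `precision_step` bits down to `bits - precision_step` bits. The
    full-precision term (shift 0) is never emitted; the finest emitted
    term carries the full value as its payload instead.
    """
    terms = []
    for shift in range(bits - precision_step, 0, -precision_step):
        prefix = value >> shift  # keep only the leading (bits - shift) bits
        is_finest = shift == precision_step
        payload = value if is_finest else None
        terms.append((shift, prefix, payload))
    return terms


terms = trie_terms(0x0123456789ABCDEF)
```

With the defaults this yields seven terms (precisions 8, 16, ..., 56 bits),
and only the 56-bit term has a payload.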

On the query side, TrieRangeQuery would build the filter bitmap as before
until it reaches the lowest available precision, the one carrying the
payloads. Instead of splitting this precision further into terms, it
enumerates TermPositions rather than just TermDocs, and sets only those docs
in the result BitSet whose payload lies inside the range bounds. This way the
trie query still selects the large ranges in the middle as before, but at the
ends it uses the highest available (not full) precision term, which matches
more docids than needed, and filters them with the payloads.
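
The query side described above could look roughly like this. It is a
simplified simulation with a single coarse precision level and an in-memory
dict instead of a Lucene index; SHIFT, build_index and range_query are
made-up names, and a Python set stands in for the result BitSet.

```python
from collections import defaultdict

SHIFT = 8  # assumed: the finest indexed term drops the lowest 8 bits


def build_index(values):
    """Map each coarse prefix to a list of (doc_id, payload) postings."""
    index = defaultdict(list)
    for doc_id, value in enumerate(values):
        index[value >> SHIFT].append((doc_id, value))  # payload = full value
    return index


def range_query(index, lower, upper):
    """Return the doc ids whose value lies in [lower, upper]."""
    hits = set()
    lo_prefix, hi_prefix = lower >> SHIFT, upper >> SHIFT
    for prefix, postings in index.items():
        if prefix < lo_prefix or prefix > hi_prefix:
            continue
        block_min = prefix << SHIFT
        block_max = block_min | ((1 << SHIFT) - 1)
        if lower <= block_min and block_max <= upper:
            # Term lies fully inside the range: take all docs, no payload read.
            hits.update(doc for doc, _ in postings)
        else:
            # Edge term: matches too many docs, so filter by payload.
            hits.update(doc for doc, payload in postings
                        if lower <= payload <= upper)
    return hits


values = [5, 250, 256, 300, 511, 512, 700, 1000]
result = range_query(build_index(values), 250, 700)
```

Only the two edge terms need their payloads read; every term fully inside the
range is handled like a plain TermDocs enumeration.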


With String Dates (the simplified example Michael Busch shows in his talk):

Searching all docs from 2005-11-10 to 2008-03-11 with the current trierange
variant would select the day terms 2005-11-10 to 2005-11-30, then the whole
of December, the whole years 2006 and 2007, and so on. With payloads,
trierange would select only whole months (November, December, 2006, 2007,
Jan, Feb, Mar). At both ends the payloads are used to filter out the
non-matching days in Nov 2005 and Mar 2008.
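
To make the date example concrete, here is a small sketch that lists which
terms the payload variant would visit and which of them need a payload check.
The function select_terms and the "YYYY" / "YYYY-MM" term strings are
illustrative assumptions, not part of TrieRange.

```python
import calendar
from datetime import date


def select_terms(start, end):
    """Return (term, needs_payload_check) pairs covering [start, end].

    Whole years and whole months are selected as plain terms; a month that
    is only partly covered is selected too, but flagged for payload checks.
    """
    terms = []
    for year in range(start.year, end.year + 1):
        if start <= date(year, 1, 1) and date(year, 12, 31) <= end:
            terms.append((f"{year}", False))  # whole year, no payloads
            continue
        for month in range(1, 13):
            first = date(year, month, 1)
            last = date(year, month, calendar.monthrange(year, month)[1])
            if last < start or first > end:
                continue  # month does not overlap the range at all
            fully_covered = start <= first and last <= end
            terms.append((f"{year}-{month:02d}", not fully_covered))
    return terms


terms = select_terms(date(2005, 11, 10), date(2008, 3, 11))
```

For the range above this gives seven terms, of which only 2005-11 and 2008-03
are flagged for payload filtering.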


With the latest TrieRange impl this would be possible to implement (because
the TrieTokenStreams now used for indexing could create the payloads). Only
the searching side would no longer be as "simple" to implement as it is now.
My biggest problem is how to configure this optimally and keep the API clean.


Was it understandable? (It's complicated, I know)


Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen


From: Jason Rutherglen [] 
Sent: Wednesday, June 10, 2009 7:59 PM
Subject: Re: Payloads and TrieRangeQuery


I think instead of ORing postings (trie range, rangequery, etc), have a
custom Query + Scorer that examines the payload (somehow)?  It could encode
the multiple levels of trie bits in it?  (I'm just guessing here).

On Wed, Jun 10, 2009 at 4:04 AM, Michael McCandless
<> wrote:

Use them how?  (Sounds interesting...).


On Tue, Jun 9, 2009 at 10:32 PM, Jason
Rutherglen<> wrote:
> At the SF Lucene User's group, Michael Busch mentioned using
> payloads with TrieRangeQueries. Is this something that's being
> worked on? I'm interested in what sort of performance benefits
> there would be to this method?
