lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <>
Subject [jira] Updated: (LUCENE-1461) Cached filter for a single term field
Date Fri, 26 Jun 2009 09:29:20 GMT


Uwe Schindler updated LUCENE-1461:

    Attachment: LUCENE-1461.patch

Hey Mike, same time... :-)

I did some research and also found out that a filter's DocIdSet should not list deleted

Because of that, I changed the non-StringIndex iterators (StringIndex itself will never
contain strings of deleted docs because it has an order[]->0 mapping) to use
IndexReader.termDocs(null) to list the docIds. This is no real problem: it is just an
iterator on a bitset, so the additional cost is low (tested with a 10-million-document index).

I also created a superclass for all the iterators working on numbers, so that the termDocs
handling lives in one place. The type-specific iterators only override a matchDoc() method. The
StringIndex iterator stays separate, because it is optimized and has no deleted docs problem as described
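
A minimal sketch of the superclass idea, not the code from the patch: it avoids any Lucene dependency, and the names LiveDocIterator, IntRangeIterator, and collectAll are hypothetical. A plain BitSet of deleted docs stands in for what IndexReader.termDocs(null) provides, namely a walk over the non-deleted docIds only.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in for the shared superclass; this is not Lucene's API.
abstract class LiveDocIterator {
    private final BitSet deleted; // assumption: a set bit marks a deleted document
    private final int maxDoc;
    private int doc = -1;

    LiveDocIterator(BitSet deleted, int maxDoc) {
        this.deleted = deleted;
        this.maxDoc = maxDoc;
    }

    // The only method a type-specific iterator has to override.
    abstract boolean matchDoc(int doc);

    // Advances to the next live, matching doc; returns -1 when exhausted.
    int nextDoc() {
        if (doc >= maxDoc) {
            return -1;
        }
        for (doc = doc + 1; doc < maxDoc; doc++) {
            if (!deleted.get(doc) && matchDoc(doc)) {
                return doc;
            }
        }
        return -1;
    }

    // Convenience helper for the sketch: drains the iterator into a list.
    List<Integer> collectAll() {
        List<Integer> hits = new ArrayList<>();
        for (int d = nextDoc(); d != -1; d = nextDoc()) {
            hits.add(d);
        }
        return hits;
    }
}

// Type-specific iterator over a cached int[]; bounds are inclusive.
class IntRangeIterator extends LiveDocIterator {
    private final int[] values;
    private final int lower;
    private final int upper;

    IntRangeIterator(int[] values, BitSet deleted, int lower, int upper) {
        super(deleted, values.length);
        this.values = values;
        this.lower = lower;
        this.upper = upper;
    }

    @Override
    boolean matchDoc(int doc) {
        return values[doc] >= lower && values[doc] <= upper;
    }
}
```

In this sketch a deleted document whose cached value still falls inside the range is skipped by the superclass, so the subclasses never need to know about deletions.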

This patch also contains tests for all (except byte) types.

I will commit in a day or two.

(Another solution for the future would be to keep an additional bitset for numeric values
alongside the native type array in FieldCache, recording whether each document
had a term available. This would also cover the deleted docs.)
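
The bitset idea might look roughly like the following. This is a sketch under assumptions, not an existing FieldCache feature: the class IntCacheWithPresence and its methods are hypothetical, and it assumes the cache is filled only from terms of live documents, so a deleted or term-less document never gets its bit set.

```java
import java.util.BitSet;

// Hypothetical cache entry: a native int array plus a "has a term" bitset.
class IntCacheWithPresence {
    final int[] values;     // defaults to 0 for documents without a term
    final BitSet hasValue;  // set only for documents that supplied a term

    IntCacheWithPresence(int maxDoc) {
        this.values = new int[maxDoc];
        this.hasValue = new BitSet(maxDoc);
    }

    // Called while filling the cache, once per document that has a term.
    void put(int doc, int value) {
        values[doc] = value;
        hasValue.set(doc);
    }

    // Inclusive range check that ignores documents without a term.
    boolean matches(int doc, int lower, int upper) {
        return hasValue.get(doc) && values[doc] >= lower && values[doc] <= upper;
    }
}
```

Without the bitset, a document with no term would hold the array default of 0 and would wrongly match any range that includes 0.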

> Cached filter for a single term field
> -------------------------------------
>                 Key: LUCENE-1461
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>            Reporter: Tim Sturge
>            Assignee: Uwe Schindler
>             Fix For: 2.9
>         Attachments:, FieldCacheRangeFilter.patch, LUCENE-1461.patch,
LUCENE-1461.patch, LUCENE-1461.patch, LUCENE-1461.patch, LUCENE-1461.patch, LUCENE-1461a.patch,
LUCENE-1461b.patch, LUCENE-1461c.patch,,,,
> These classes implement inexpensive range filtering over a field containing a single
term. They do this by building an integer array of term numbers (storing the term->number
mapping in a TreeMap) and then implementing a fast integer-comparison-based DocIdSetIterator.
> This code is currently being used to do age range filtering, but could also be used to
do other date filtering or in any application where there need to be multiple filters based
on the same single term field. I have an untested implementation of single term filtering
and have considered but not yet implemented term set filtering (useful for location based
searches) as well. 
> The code here is fairly rough; it works but lacks javadocs and toString() and hashCode()
methods etc. I'm posting it here to discover if there is other interest in this feature; I
don't mind fixing it up but would hate to go to the effort if it's not going to make it into
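
The term-number approach from the issue description can be illustrated as follows. This is a minimal sketch, not the attached code: the class TermNumberRangeSketch is hypothetical, it takes one term per document from a plain String array instead of reading an index, and it returns a list where the real code would expose an iterator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of range filtering by term number.
class TermNumberRangeSketch {
    // term -> number, numbered in sorted term order starting at 1
    final TreeMap<String, Integer> termToNumber = new TreeMap<>();
    // per-document term number; 0 means the document has no term
    final int[] termNumber;

    TermNumberRangeSketch(String[] docTerms) {
        for (String t : docTerms) {
            if (t != null) {
                termToNumber.put(t, 0);
            }
        }
        int next = 1;
        for (Map.Entry<String, Integer> e : termToNumber.entrySet()) {
            e.setValue(next++);
        }
        termNumber = new int[docTerms.length];
        for (int d = 0; d < docTerms.length; d++) {
            termNumber[d] = docTerms[d] == null ? 0 : termToNumber.get(docTerms[d]);
        }
    }

    // Returns docIds whose term lies in [lower, upper], both inclusive.
    List<Integer> range(String lower, String upper) {
        List<Integer> hits = new ArrayList<>();
        Map.Entry<String, Integer> lo = termToNumber.ceilingEntry(lower);
        Map.Entry<String, Integer> hi = termToNumber.floorEntry(upper);
        if (lo == null || hi == null) {
            return hits; // no indexed term falls inside the range
        }
        int loNum = lo.getValue();
        int hiNum = hi.getValue();
        // Per-document work is two integer comparisons, no string compares.
        for (int d = 0; d < termNumber.length; d++) {
            if (termNumber[d] >= loNum && termNumber[d] <= hiNum) {
                hits.add(d);
            }
        }
        return hits;
    }
}
```

The two range bounds are resolved to term numbers once per filter; after that, several filters over the same field can share the same integer array, which is the reuse the description has in mind.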

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.