lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1461) Cached filter for a single term field
Date Fri, 26 Jun 2009 09:21:07 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724443#action_12724443 ]

Michael McCandless commented on LUCENE-1461:
--------------------------------------------

bq.  but a ConstantScoreQuery on this filter would return deleted docs?

You are right!  I guess one workaround is to AND it with a MatchAllDocsQuery, if you are using
ConstantScoreQuery w/o already ANDing it with a query that takes deletions into account.

So there are two issues:

  * This filter returns deleted docs

  * This filter "pretends" deleted docs had a value of zero

This is only a problem if nothing else in the query applies deletions.  Other FieldCache-driven
filters have challenges here, too; e.g., we just fixed LUCENE-1571, where local lucene
tripped up on deleted docs (because it uses the Strings FieldCache and hit nulls for deleted
docs).
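
To make the two issues and the workaround concrete, here is a rough plain-Java model. It does
not use any Lucene classes: the class and method names are made up for illustration, the int
array stands in for a FieldCache array, and intersecting with a "live docs" BitSet stands in for
ANDing with a clause (such as MatchAllDocsQuery) that already respects deletions. It assumes the
cache slot of a deleted doc is simply left at the default of zero.

```java
import java.util.BitSet;

// Plain-Java model only; none of these names are Lucene APIs.
class DeletedDocsFilterSketch {

    // Models a cache-backed range filter: accept docs whose cached value is in [lower, upper].
    // It never consults deletions, so a deleted doc's default slot value (0) is treated as real.
    static BitSet rangeFilterBits(int[] cachedValues, int lower, int upper) {
        BitSet bits = new BitSet(cachedValues.length);
        for (int doc = 0; doc < cachedValues.length; doc++) {
            int value = cachedValues[doc];
            if (value >= lower && value <= upper) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // Models the workaround: intersect the filter with something that knows which docs are live.
    static BitSet applyDeletions(BitSet filterBits, BitSet liveDocs) {
        BitSet result = (BitSet) filterBits.clone();
        result.and(liveDocs);
        return result;
    }

    public static void main(String[] args) {
        // Doc 1 is deleted, so its slot was never filled and stays 0; doc 3 really has value 0.
        int[] cached = {5, 0, 12, 0, 3};
        BitSet live = new BitSet(cached.length);
        live.set(0, cached.length);
        live.clear(1);

        BitSet raw = rangeFilterBits(cached, 0, 5);
        System.out.println(raw);                        // {0, 1, 3, 4}: deleted doc 1 leaks in
        System.out.println(applyDeletions(raw, live));  // {0, 3, 4}
    }
}
```

Note that the cached array alone cannot tell doc 1 (deleted) apart from doc 3 (a genuine zero);
only the separate deletions information can.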

I'm not sure how we should fix this (and I think we should open a new issue to do so).
I don't want to force this filter to always take deletions into account, since for many of the
queries the filter is ANDed onto, deletions are already factored in.  More generally, we need to
think about the "right" top-down way to ask a scorer to take deletions and filters
into account.  E.g., LUCENE-1536 is looking at sizable performance improvements for the "relatively
dense and supports random access" type of filters.


> Cached filter for a single term field
> -------------------------------------
>
>                 Key: LUCENE-1461
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1461
>             Project: Lucene - Java
>          Issue Type: New Feature
>            Reporter: Tim Sturge
>            Assignee: Uwe Schindler
>             Fix For: 2.9
>
>         Attachments: DisjointMultiFilter.java, FieldCacheRangeFilter.patch, LUCENE-1461.patch,
LUCENE-1461.patch, LUCENE-1461.patch, LUCENE-1461.patch, LUCENE-1461a.patch, LUCENE-1461b.patch,
LUCENE-1461c.patch, RangeMultiFilter.java, RangeMultiFilter.java, TermMultiFilter.java, TestFieldCacheRangeFilter.patch
>
>
> These classes implement inexpensive range filtering over a field containing a single
term. They do this by building an integer array of term numbers (storing the term->number
mapping in a TreeMap) and then implementing a fast integer comparison based DocIdSetIterator.
> This code is currently being used to do age range filtering, but could also be used to
do other date filtering or in any application where there need to be multiple filters based
on the same single term field. I have an untested implementation of single term filtering
and have considered but not yet implemented term set filtering (useful for location based
searches) as well. 
> The code here is fairly rough; it works but lacks javadocs and toString() and hashCode()
methods etc. I'm posting it here to discover if there is other interest in this feature; I
don't mind fixing it up but would hate to go to the effort if it's not going to make it into
Lucene.
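
For readers skimming the issue description above, the term-number idea can be sketched roughly
as follows. This is not the code from the attached patches: it is a plain-Java illustration with
made-up names, it assumes every doc has exactly one term in the field, and it returns a list of
doc ids instead of implementing a Lucene iterator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; not the classes attached to LUCENE-1461.
class TermNumberRangeSketch {
    // term -> ordinal, assigned in sorted term order
    private final TreeMap<String, Integer> termToNumber = new TreeMap<>();
    // doc id -> ordinal of that doc's single term
    private final int[] docTermNumber;

    TermNumberRangeSketch(String[] termPerDoc) {
        for (String term : termPerDoc) {
            termToNumber.put(term, 0);
        }
        int next = 0;
        for (Map.Entry<String, Integer> entry : termToNumber.entrySet()) {
            entry.setValue(next++);  // number terms in sorted order
        }
        docTermNumber = new int[termPerDoc.length];
        for (int doc = 0; doc < termPerDoc.length; doc++) {
            docTermNumber[doc] = termToNumber.get(termPerDoc[doc]);
        }
    }

    // Inclusive range: translate the term bounds to ordinals once, then compare ints per doc.
    List<Integer> docsInRange(String lowerTerm, String upperTerm) {
        List<Integer> hits = new ArrayList<>();
        Map.Entry<String, Integer> lo = termToNumber.ceilingEntry(lowerTerm);
        Map.Entry<String, Integer> hi = termToNumber.floorEntry(upperTerm);
        if (lo == null || hi == null) {
            return hits;  // no indexed term falls inside the range
        }
        int loNum = lo.getValue();
        int hiNum = hi.getValue();
        for (int doc = 0; doc < docTermNumber.length; doc++) {
            int n = docTermNumber[doc];
            if (n >= loNum && n <= hiNum) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Ages stored as zero-padded strings so that term order matches numeric order.
        String[] ages = {"030", "025", "041", "025", "018"};
        TermNumberRangeSketch sketch = new TermNumberRangeSketch(ages);
        System.out.println(sketch.docsInRange("020", "035"));  // [0, 1, 3]
    }
}
```

The array and map are built once and can then serve many different ranges over the same field,
which is where the "multiple filters based on the same single term field" benefit comes from.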
