lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-584) Decouple Filter from BitSet
Date Wed, 30 Aug 2006 20:49:24 GMT
    [ http://issues.apache.org/jira/browse/LUCENE-584?page=comments#action_12431684 ] 
            
Yonik Seeley commented on LUCENE-584:
-------------------------------------

Thanks Paul,
I like the Matcher/Scorer relation.

It looks like no Filters currently return a matcher, so the current patch just lays the groundwork,
right?

When some filters do start to return a matcher, it looks like support for the 1.4 BooleanScorer
needs to be removed, or a check done in IndexSearcher.search() to disable skipping on the
scorer if it's in use.

I wonder what the performance impact is... for a dense search with a dense bitset filter,
it looks like quite a bit of overhead is added (two calls in order to get the next doc, use
of nextSetBit() instead of get(), checking "exhausted" each time and checking for -1 to set
exhausted).  I suppose one can always drop back to using a HitCollector for special cases
though.
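
To make the overhead concrete, here is a minimal sketch of the two access patterns being compared. The class and method names are hypothetical and are not taken from the attached patch; only java.util.BitSet and its get() and nextSetBit() methods are real.

```java
import java.util.BitSet;

// Hypothetical illustration only; these are not the classes from the patch.
final class BitSetIterationSketch {

    /** Dense style: a single get() call per candidate document. */
    static int countWithGet(BitSet bits, int maxDoc) {
        int count = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            if (bits.get(doc)) {
                count++;
            }
        }
        return count;
    }

    /** Iterator style: nextSetBit() plus an "exhausted" flag and a -1 check. */
    static final class NextSetBitIterator {
        private final BitSet bits;
        private int doc = -1;
        private boolean exhausted = false;

        NextSetBitIterator(BitSet bits) {
            this.bits = bits;
        }

        /** Advances to the next set bit; returns false once none are left. */
        boolean next() {
            if (exhausted) {
                return false;
            }
            doc = bits.nextSetBit(doc + 1);
            if (doc == -1) {
                exhausted = true;
                return false;
            }
            return true;
        }

        /** Current document; only meaningful after next() returned true. */
        int doc() {
            return doc;
        }
    }

    /** Two calls per document: next() to advance, doc() to read. */
    static int countWithIterator(BitSet bits) {
        NextSetBitIterator it = new NextSetBitIterator(bits);
        int count = 0;
        while (it.next()) {
            it.doc();
            count++;
        }
        return count;
    }
}
```

Both methods visit the same documents. The difference is the per-document work, which only matters when most bits are set; for a sparse set the iterator skips the gaps that the get() loop has to walk through.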

> Decouple Filter from BitSet
> ---------------------------
>
>                 Key: LUCENE-584
>                 URL: http://issues.apache.org/jira/browse/LUCENE-584
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.0.1
>            Reporter: Peter Schäfer
>            Priority: Minor
>         Attachments: BitsMatcher.java, Filter-20060628.patch, HitCollector-20060628.patch,
IndexSearcher-20060628.patch, MatchCollector.java, Matcher.java, Matcher20060830.patch, Scorer-20060628.patch,
Searchable-20060628.patch, Searcher-20060628.patch, SortedVIntList.java, TestSortedVIntList.java
>
>
> {code}
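> package org.apache.lucene.search;
>
> import java.io.IOException;
> import org.apache.lucene.index.IndexReader;
>
> public abstract class Filter implements java.io.Serializable 
> {
>   // Returns the set of documents this filter allows for the given reader.
>   public abstract AbstractBitSet bits(IndexReader reader) throws IOException;
> }
>
> // AbstractBitSet.java (a public type needs its own source file)
> public interface AbstractBitSet 
> {
>   // True if the document at this index passes the filter.
>   public boolean get(int index);
> }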
> {code}
> It would be useful if the method =Filter.bits()= returned an abstract interface, instead
of =java.util.BitSet=.
> Use case: there is a very large index, and, depending on the user's privileges, only
a small portion of the index is actually visible.
> Sparsely populated =java.util.BitSet=s are not efficient and waste lots of memory. It
would be desirable to have an alternative BitSet implementation with smaller memory footprint.
> Though it _is_ possible to derive classes from =java.util.BitSet=, it was obviously not
designed for that purpose.
> That's why I propose to use an interface instead. The default implementation could still
delegate to =java.util.BitSet=.
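
As a rough sketch of the proposal quoted above: the AbstractBitSet interface comes from the issue description, while the two implementing classes and their names are assumptions made for illustration and are not among the attachments. One delegates to java.util.BitSet, as the description suggests for the default; the other stores only the visible document numbers in a sorted array.

```java
import java.util.Arrays;
import java.util.BitSet;

// Interface as proposed in the issue description.
interface AbstractBitSet {
    boolean get(int index);
}

// Hypothetical default implementation: delegates to java.util.BitSet.
final class JavaUtilBitSetAdapter implements AbstractBitSet {
    private final BitSet delegate;

    JavaUtilBitSetAdapter(BitSet delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean get(int index) {
        return delegate.get(index);
    }
}

// Hypothetical sparse implementation: a sorted array of visible doc numbers.
final class SortedIntArrayBitSet implements AbstractBitSet {
    private final int[] sortedDocs;

    SortedIntArrayBitSet(int[] docs) {
        // Copy so the caller's array is left untouched, then sort for lookup.
        sortedDocs = docs.clone();
        Arrays.sort(sortedDocs);
    }

    @Override
    public boolean get(int index) {
        // binarySearch returns a non-negative position only when the key is present.
        return Arrays.binarySearch(sortedDocs, index) >= 0;
    }
}
```

The sorted array costs four bytes per visible document, whereas java.util.BitSet needs about one bit for every index up to its highest set bit, so the array is only the smaller choice when a small fraction of the index is visible. Lookup becomes a binary search instead of a constant-time bit test.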


