lucene-dev mailing list archives

From "Hoss Man (JIRA)" <>
Subject [jira] Commented: (LUCENE-584) Decouple Filter from BitSet
Date Tue, 10 Apr 2007 22:24:32 GMT


Hoss Man commented on LUCENE-584:

I'm a little behind on following this issue, but if I can attempt to sum up the recent discussion
about performance...

   "Migrating towards a 'Matcher' API *may* allow some types of Queries to be faster in situations
where clients can use a MatchCollector instead of a HitCollector, but this won't be a silver
bullet performance win for all Query classes -- just those where some of the score calculations
are (or can be) isolated to the score method (as opposed to skipTo or next)."

I think it's important to remember that the motivation of this issue wasn't to improve the speed
of non-scoring searchers; it was to decouple the concept of "Filtering" results
from the need to populate a (potentially large) BitSet when the logic necessary for Filtering
can easily be expressed in terms of a doc iterator (aka: a Matcher) -- opening up the possibility
of memory performance improvements.
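
To make the memory argument concrete, here is a minimal sketch (not the code from the
attached patches). The names DocMatcher, RangeMatcher and MatcherDemo are hypothetical,
chosen only for illustration. It contrasts a filter that fills a BitSet sized to the whole
index with one that expresses the same logic as a doc iterator holding constant state.

```java
import java.util.BitSet;

// Hypothetical names throughout; this is an illustration, not the patch's API.
interface DocMatcher {
    /** Advances to the next matching document; returns false when exhausted. */
    boolean next();

    /** Current document number; only meaningful after next() returned true. */
    int doc();
}

/** Matches every document number in [start, end) while holding only two ints. */
final class RangeMatcher implements DocMatcher {
    private final int end;
    private int current;

    RangeMatcher(int start, int end) {
        this.current = start - 1; // positioned just before the first match
        this.end = end;
    }

    @Override
    public boolean next() {
        if (current + 1 >= end) {
            return false;
        }
        current++;
        return true;
    }

    @Override
    public int doc() {
        return current;
    }
}

final class MatcherDemo {
    /** BitSet approach: storage is reserved for one bit per document in the index. */
    static BitSet rangeAsBitSet(int maxDoc, int start, int end) {
        BitSet bits = new BitSet(maxDoc);
        bits.set(start, end); // sets bits start (inclusive) to end (exclusive)
        return bits;
    }

    /** Iterator approach: consumers just walk the matcher. */
    static int countMatches(DocMatcher matcher) {
        int count = 0;
        while (matcher.next()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int maxDoc = 10_000_000; // assumed index size, for illustration only
        BitSet bits = rangeAsBitSet(maxDoc, 5_000_000, 5_000_100);
        DocMatcher matcher = new RangeMatcher(5_000_000, 5_000_100);
        System.out.println("BitSet matches:  " + bits.cardinality());
        System.out.println("Matcher matches: " + countMatches(matcher));
    }
}
```

Both approaches identify the same documents; the difference is that the iterator never
materializes per-document storage. Whether that matters in practice depends on how sparse
the filter is and how often it is reused.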

A second benefit that has arisen as the issue evolved has been the API generalization of
the "Matcher" concept into a superclass of Scorer, for simpler APIs moving forward.
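
A rough sketch of that generalization follows. The class shapes here are assumptions made
for illustration: the method names next, skipTo and score come from the discussion above,
but the exact signatures, the collector interfaces, and the ArrayScorer and SearchSketch
helpers are hypothetical and are not taken from the attached patches. The point is only
that a client needing matches alone can drive a Scorer through its Matcher supertype and
never pay for score().

```java
// Simplified stand-ins; signatures are assumed, not copied from Lucene or the patches.
abstract class Matcher {
    abstract boolean next();             // advance to the next matching doc
    abstract boolean skipTo(int target); // advance to the first later doc >= target
    abstract int doc();                  // current doc number
}

abstract class Scorer extends Matcher {
    abstract float score();              // score of the current doc
}

interface MatchCollector {
    void collect(int doc);
}

interface HitCollector {
    void collect(int doc, float score);
}

/** Hypothetical Scorer over precomputed docs and scores; counts score() calls. */
final class ArrayScorer extends Scorer {
    private final int[] docs;
    private final float[] scores;
    private int pos = -1;
    int scoreCalls = 0;

    ArrayScorer(int[] sortedDocs, float[] scores) {
        this.docs = sortedDocs;
        this.scores = scores;
    }

    @Override
    boolean next() {
        if (pos < docs.length) {
            pos++;
        }
        return pos < docs.length;
    }

    @Override
    boolean skipTo(int target) {
        while (next()) {
            if (docs[pos] >= target) {
                return true;
            }
        }
        return false;
    }

    @Override
    int doc() {
        return docs[pos];
    }

    @Override
    float score() {
        scoreCalls++;
        return scores[pos];
    }
}

final class SearchSketch {
    /** Non-scoring path: only needs the Matcher half of the API. */
    static void matchAll(Matcher matcher, MatchCollector collector) {
        while (matcher.next()) {
            collector.collect(matcher.doc());
        }
    }

    /** Scoring path: needs the full Scorer. */
    static void scoreAll(Scorer scorer, HitCollector collector) {
        while (scorer.next()) {
            collector.collect(scorer.doc(), scorer.score());
        }
    }

    public static void main(String[] args) {
        ArrayScorer scorer =
            new ArrayScorer(new int[] {1, 4, 9}, new float[] {0.5f, 1.0f, 2.0f});
        matchAll(scorer, doc -> System.out.println("matched doc " + doc));
        System.out.println("score() calls: " + scorer.scoreCalls);
    }
}
```

In this toy version score() is trivially cheap, so nothing is gained by skipping it; any
saving in a real Query would come only from score work that is genuinely isolated in
score() rather than in next or skipTo, as the summary above says.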

> Decouple Filter from BitSet
> ---------------------------
>                 Key: LUCENE-584
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.0.1
>            Reporter: Peter Schäfer
>            Priority: Minor
>         Attachments: bench-diff.txt, bench-diff.txt,, Filter-20060628.patch,
HitCollector-20060628.patch, IndexSearcher-20060628.patch,,,
Matcher20070226.patch, Scorer-20060628.patch, Searchable-20060628.patch, Searcher-20060628.patch,
> {code}
> package;
> public abstract class Filter implements 
> {
>   public abstract AbstractBitSet bits(IndexReader reader) throws IOException;
> }
> public interface AbstractBitSet 
> {
>   public boolean get(int index);
> }
> {code}
> It would be useful if the method =Filter.bits()= returned an abstract interface, instead
of =java.util.BitSet=.
> Use case: there is a very large index, and, depending on the user's privileges, only
a small portion of the index is actually visible.
> Sparsely populated =java.util.BitSet=s are not efficient and waste lots of memory. It
would be desirable to have an alternative BitSet implementation with smaller memory footprint.
> Though it _is_ possible to derive classes from =java.util.BitSet=, it was obviously not
designed for that purpose.
> That's why I propose to use an interface instead. The default implementation could still
delegate to =java.util.BitSet=.
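
For illustration, the proposal quoted above could be sketched roughly as follows. The
AbstractBitSet interface is the one from the issue description; the two implementations
(JavaUtilBitSetAdapter and SortedIntArrayBitSet) are hypothetical names invented here, and
the sorted-array representation is just one assumed way to get a smaller footprint for
sparse sets.

```java
import java.util.Arrays;
import java.util.BitSet;

/** The interface proposed in the issue description. */
interface AbstractBitSet {
    boolean get(int index);
}

/** Hypothetical default implementation: delegates to java.util.BitSet. */
final class JavaUtilBitSetAdapter implements AbstractBitSet {
    private final BitSet bits;

    JavaUtilBitSetAdapter(BitSet bits) {
        this.bits = bits;
    }

    @Override
    public boolean get(int index) {
        return bits.get(index);
    }
}

/**
 * Hypothetical sparse implementation: stores only the set doc numbers in a
 * sorted int array, so memory grows with the number of visible documents
 * rather than with the size of the index.
 */
final class SortedIntArrayBitSet implements AbstractBitSet {
    private final int[] sortedDocs;

    SortedIntArrayBitSet(int... docs) {
        int[] copy = docs.clone(); // defensive copy; caller's array is untouched
        Arrays.sort(copy);
        this.sortedDocs = copy;
    }

    @Override
    public boolean get(int index) {
        // binarySearch returns a non-negative position only when the key is present
        return Arrays.binarySearch(sortedDocs, index) >= 0;
    }
}

final class AbstractBitSetDemo {
    public static void main(String[] args) {
        BitSet dense = new BitSet();
        dense.set(42);
        AbstractBitSet viaDelegate = new JavaUtilBitSetAdapter(dense);
        AbstractBitSet viaSparse = new SortedIntArrayBitSet(9_000_000, 42);

        System.out.println(viaDelegate.get(42));
        System.out.println(viaSparse.get(42));
        System.out.println(viaSparse.get(43));
    }
}
```

The trade-off in this sketch is that the sparse variant's get() costs a binary search
instead of a constant-time bit test.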
