lucene-dev mailing list archives

From Ype Kingma <>
Subject Re: OpenBitSet
Date Fri, 12 May 2006 18:57:42 GMT
With a copy to java-dev, I suppose none of you mind...

On Friday 12 May 2006 19:40, Yonik Seeley wrote:
> On 5/12/06, Doug Cutting <> wrote:
> > Yonik Seeley wrote:
> > > So the first step is to decide if we should migrate to this, and if
> > > so, where it belongs.
> > > - lucene.util?  BitSet is hard-coded into Lucene in enough places that
> > > I don't know if it would be useful to people there or not.
> > > - solr.util?
> > >
> > > The next step would be to actually use it... replacing BitSet with
> > > OpenBitSet in BitDocSet (an alternative would be to create another
> > > DocSet type, but that gets more complicated).
> >
> > Shouldn't we really replace BitSet in Lucene with an interface that
> > OpenBitSet & others implement?
> It depends on what the goal is and what the interface would cover.
> It would be useful to have very restrictive small interfaces that do
> specific things, and implementations of these interfaces can wrap
> different underlying data structures.
> For example, there's DocNrSkipper for filtering a query:
> BitSetSortedIntList wraps a BitSet and implements DocNrSkipper.

Is there also a nextSetBit(bitNr) somewhere on ?
This method is essential for filtering a query search.
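
To make clear what I mean, here is a minimal sketch on top of
java.util.BitSet.nextSetBit(int). The interface name DocSkipper and its
method nextDocNr are made up for this mail and are not the actual
DocNrSkipper signature:

```java
import java.util.BitSet;

/** Hypothetical minimal interface, for illustration only. */
interface DocSkipper {
    /** Returns the first doc number >= docNr that passes, or -1 if there is none. */
    int nextDocNr(int docNr);
}

/** Wraps a java.util.BitSet and delegates to BitSet.nextSetBit(int). */
class BitSetDocSkipper implements DocSkipper {
    private final BitSet bits;

    BitSetDocSkipper(BitSet bits) {
        this.bits = bits;
    }

    public int nextDocNr(int docNr) {
        // nextSetBit returns -1 when no bit is set at or after docNr.
        return bits.nextSetBit(docNr);
    }
}
```

Any replacement for BitSet would need an equivalent of this one call to be
usable for skipping.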

> We could also have an interface for the creation side...
> SequentialIntListCreator where ids must be added in order,
> or RandomAccessIntListCreator where ids may be added in any order.
> But I don't see OpenBitSet implementing any of these interfaces
> directly, but instead being used as an underlying store for certain
> implementations.
> Did you have something else in mind?
> > This has been raised many times, that
> > Filters should return something that implements an interface, not a
> > BitSet.
> +1 on that...
> > Doing this back-compatibly will be a bit of a pain, but I think
> > the effort is warranted.

A simple way would be to deprecate the methods in IndexSearcher that
take a Filter, recommend to use FilteredQuery instead, and add
a constructor to FilteredQuery that takes a SkipFilter.
This posted full version of FilteredQuery has that:

> Disallowing the non-skipping BooleanScorer would allow use of SkipFilters.

I think the non-skipping BooleanScorer is only useful on the top level
query search for disjunctions, without any filtering.
Then non-skipping and filtering never occur together, and a SkipFilter
could always be used instead of a Filter.
Also an implementation of search(HitCollector) for BooleanScorer2
using the BooleanScorer non-skipping implementation would fit nicely,
but that is not straightforward at the moment.
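
As a rough illustration of why skipping and filtering go together, the
sketch below intersects a skipping query with a skipping filter. The
Skipper interface and the class name are hypothetical and only stand in
for a skipping scorer and a SkipFilter; this is not the code from the
posted FilteredQuery:

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class SkipFilterSketch {
    /** Hypothetical: first doc number >= docNr, or -1 when exhausted. */
    interface Skipper {
        int nextDocNr(int docNr);
    }

    /** Adapts a java.util.BitSet through BitSet.nextSetBit(int). */
    static Skipper of(BitSet bits) {
        return bits::nextSetBit;
    }

    /** Collects the docs matched by both sides, letting each side skip ahead. */
    static List<Integer> filter(Skipper query, Skipper filter) {
        List<Integer> hits = new ArrayList<>();
        int doc = query.nextDocNr(0);
        while (doc >= 0) {
            int allowed = filter.nextDocNr(doc);
            if (allowed < 0) {
                break; // the filter has no more docs
            }
            if (allowed == doc) {
                hits.add(doc);
                doc = query.nextDocNr(doc + 1);
            } else {
                // Skip the query forward to the next doc the filter allows.
                doc = query.nextDocNr(allowed);
            }
        }
        return hits;
    }
}
```

Neither side ever visits a doc that the other side has already ruled out,
which is what a non-skipping scorer cannot offer.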

Paul Elschot
