lucene-java-user mailing list archives

From eks dev
Subject Filter to support DocNrSkipper interface
Date Fri, 23 Dec 2005 11:01:52 GMT

Would it be OK to add a method to the Filter class that
returns the DocNrSkipper interface from Paul's "Compact
sparse Filter" in JIRA issue LUCENE-328?

This would be the first step towards:
- smooth integration of compact representations of the
underlying BitSet in Filter (VInt and sorted int[]).
They are often faster for and/or operations.
- a ChainedFilter enhancement (see the contrib from Hoss)
that operates on DocNrSkipper (see And(Or)DocNrSkipper
in Paul's work)
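
A rough sketch of what an AND over two skippers could look
like. The DocSkipper interface and its nextDocNr(int) method
are hypothetical stand-ins made up for this example; the
actual DocNrSkipper in LUCENE-328 may be shaped differently.

```java
import java.util.Arrays;

// Hypothetical stand-in for the DocNrSkipper idea; NOT the LUCENE-328 interface.
interface DocSkipper {
    /** Returns the smallest matching doc number >= target, or -1 if none is left. */
    int nextDocNr(int target);
}

/** Low-frequency case: matching documents kept in a sorted int[]. */
final class SortedIntSkipper implements DocSkipper {
    private final int[] docs; // ascending, no duplicates

    SortedIntSkipper(int... sortedDocs) {
        this.docs = sortedDocs.clone();
    }

    @Override
    public int nextDocNr(int target) {
        int i = Arrays.binarySearch(docs, target);
        if (i < 0) {
            i = -i - 1; // insertion point: index of first element > target
        }
        return i < docs.length ? docs[i] : -1;
    }
}

/** AND of two skippers: leapfrog until both agree on a doc number. */
final class AndSkipper implements DocSkipper {
    private final DocSkipper a;
    private final DocSkipper b;

    AndSkipper(DocSkipper a, DocSkipper b) {
        this.a = a;
        this.b = b;
    }

    @Override
    public int nextDocNr(int target) {
        int da = a.nextDocNr(target);
        while (da != -1) {
            int db = b.nextDocNr(da);
            if (db == da || db == -1) {
                return db; // either a common doc, or b is exhausted
            }
            da = a.nextDocNr(db); // b is ahead, let a catch up
        }
        return -1; // a is exhausted
    }
}
```

The point of the skip-style method is that neither side has
to visit every document: each skipper jumps straight to the
other's current candidate.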

There are no compatibility problems: the BitSet only has
to be constructed in the bits() method, the same as today.
The reasoning that justifies the effort in this direction
is that the distribution of tokens in a typical collection
is a perfect fit for these 3 representations of bit vectors
(very low frequency tokens in a sorted int[], very high
frequency tokens in VInt, and the rest in a BitSet).
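
To make the idea concrete, here is a minimal sketch of
picking a representation from the document frequency, plus
building a BitSet on demand from the sorted int[] form (as
a bits()-style method would have to). The class name, the
enum, and both thresholds are assumptions made up for this
example; no cut-off values are proposed in this message.

```java
import java.util.BitSet;

enum Representation { SORTED_INTS, BIT_SET, VINT }

final class RepresentationChooser {
    // Illustrative thresholds only (assumed, not measured).
    static final double LOW_FRACTION = 0.01;
    static final double HIGH_FRACTION = 0.5;

    /** Picks a representation from the fraction of documents that match. */
    static Representation choose(int docFreq, int maxDoc) {
        double fraction = (double) docFreq / maxDoc;
        if (fraction < LOW_FRACTION) {
            return Representation.SORTED_INTS; // very low frequency
        }
        if (fraction > HIGH_FRACTION) {
            return Representation.VINT;        // very high frequency
        }
        return Representation.BIT_SET;         // everything in between
    }

    /** Backward-compatible path: expand a sorted int[] into a BitSet. */
    static BitSet toBitSet(int[] sortedDocs, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int doc : sortedDocs) {
            bits.set(doc);
        }
        return bits;
    }
}
```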

To put it another way, Filter forces us to use BitSet,
which is a rather inefficient way to store a few
documents from a big collection.
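
A back-of-the-envelope comparison of the two storage costs,
as a sketch. The class and method names are invented for
this example, the index size and hit count in the usage note
are illustrative numbers, and object headers are ignored.

```java
final class FilterSizeEstimate {
    /** Payload of a java.util.BitSet sized for maxDoc bits (64-bit words). */
    static long bitSetBytes(long maxDoc) {
        return ((maxDoc + 63) / 64) * 8;
    }

    /** Payload of a sorted int[] holding one entry per matching document. */
    static long sortedIntBytes(long docFreq) {
        return 4 * docFreq;
    }
}
```

For an index of 10,000,000 documents and a filter matching
100 of them, bitSetBytes gives 1,250,000 bytes while
sortedIntBytes gives 400 bytes.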

Any feedback is appreciated; it could easily be that I
overlooked something essential.

Cheers, e.

