lucene-dev mailing list archives

From "paul.elschot (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-328) Some utilities for a compact sparse filter
Date Mon, 15 May 2006 19:30:06 GMT
    [ http://issues.apache.org/jira/browse/LUCENE-328?page=comments#action_12402392 ] 

paul.elschot commented on LUCENE-328:
-------------------------------------

Starting from SkipFilter1.patch as above, replacing Filter by SkipFilter in the various APIs
(Searcher, Searchable and implementors) is straightforward. The only further change needed is
a checked cast to Filter in IndexSearcher.search(weight, filter, hitcollector) for the case
when the DocNrSkipper is null. (When that cast to Filter fails, an IllegalArgumentException
can be thrown.)
After that, all tests pass again, with and without the test code in Filter to always use a
DocNrSkipper.

That means that it is easier than expected to replace Filter by SkipFilter altogether.
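
For illustration only, the checked cast described above might look roughly like the sketch
below. The type and method shapes here are hypothetical stand-ins, not the signatures in
SkipFilter1.patch: the IndexReader argument is left out to keep the sketch self-contained,
and FilterDispatch.bitsFallback is an invented helper name.

```java
import java.util.BitSet;

// Hypothetical stand-in for the DocNrSkipper in the patch; the real signature may differ.
interface DocNrSkipper {
    // Assumed contract: next matching doc number >= docNr, or -1 when there is none.
    int nextDocNr(int docNr);
}

// Hypothetical stand-in for SkipFilter; the real method may take an IndexReader.
abstract class SkipFilter {
    // May return null when the filter cannot provide a skipper.
    abstract DocNrSkipper getDocNrSkipper();
}

// Hypothetical stand-in for Filter, assumed here to be a SkipFilter that can also give bits.
abstract class Filter extends SkipFilter {
    abstract BitSet bits();

    DocNrSkipper getDocNrSkipper() {
        return null; // no skipper by default, so callers fall back to bits()
    }
}

final class FilterDispatch {
    // Returns null when a skipper is available (the caller should use the skipper),
    // otherwise the BitSet obtained through a checked cast to Filter.
    static BitSet bitsFallback(SkipFilter filter) {
        if (filter.getDocNrSkipper() != null) {
            return null;
        }
        if (!(filter instanceof Filter)) {
            throw new IllegalArgumentException(
                "SkipFilter has no DocNrSkipper and is not a Filter");
        }
        return ((Filter) filter).bits();
    }
}
```

The instanceof test makes the cast checked, so a SkipFilter that offers neither a skipper
nor bits fails with an IllegalArgumentException rather than a ClassCastException.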


> Some utilities for a compact sparse filter
> ------------------------------------------
>
>          Key: LUCENE-328
>          URL: http://issues.apache.org/jira/browse/LUCENE-328
>      Project: Lucene - Java
>         Type: Improvement

>   Components: Search
>     Versions: CVS Nightly - Specify date in submission
>  Environment: Operating System: other
> Platform: Other
>     Reporter: paul.elschot
>     Assignee: Lucene Developers
>     Priority: Minor
>  Attachments: AndDocNrSkipper.java, AndDocNrSkipper.java, BitSetSortedIntList.java,
>               DocNrSkipper.java, DocNrSkipper.java, IntArraySortedIntList.java,
>               IntArraySortedIntList.java, OrDocNrSkipper.java, OrDocNrSkipper.java,
>               SkipFilter1.patch, SortedVIntList.java, SortedVIntList.java,
>               SortedVIntList.java, TestDocNrSkippers.java, TestDocNrSkippers.java,
>               TestSortedVIntList.java, TestSortedVIntList.java, TestSortedVIntList.java
>
> Two files are attached that might form the basis for an alternative 
> filter implementation that is more memory efficient than one bit 
> per doc when less than about 1/8 of the docs pass through the filter. 
>  
> The document numbers are stored in RAM as VInt's from the Lucene index 
> format. These VInt's encode the difference between two successive 
> document numbers, much like a PositionDelta in the Positions: 
> http://jakarta.apache.org/lucene/docs/fileformats.html 
>  
> The getByteSize() method can be used to verify the compression 
> once a SortedVIntList is constructed. 
> The precise conditions under which this is more memory efficient than 
> one bit per document are not easy to specify in advance.
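
As a rough illustration of the encoding described in the quoted text, the sketch below
stores sorted document numbers as VInt-encoded differences and compares the resulting
byte count with one bit per document. It is not the attached SortedVIntList.java: the
class name VIntDeltaSketch and its methods are hypothetical, and only the VInt byte
layout (7 data bits per byte, low-order bits first, high bit set when more bytes follow)
is taken from the Lucene file format.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch; not the SortedVIntList attached to this issue.
final class VIntDeltaSketch {

    // Encodes non-decreasing, non-negative doc numbers as VInt deltas.
    static byte[] encode(int[] sortedDocNrs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int last = 0;
        for (int docNr : sortedDocNrs) {
            int delta = docNr - last;
            if (delta < 0) {
                throw new IllegalArgumentException("doc numbers must be sorted");
            }
            // Write 7 bits at a time, setting the high bit while more bytes follow.
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            last = docNr;
        }
        return out.toByteArray();
    }

    // Decodes `count` doc numbers from bytes produced by encode().
    static int[] decode(byte[] bytes, int count) {
        int[] result = new int[count];
        int pos = 0;
        int last = 0;
        for (int i = 0; i < count; i++) {
            int b = bytes[pos++];
            int delta = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
            }
            last += delta;
            result[i] = last;
        }
        return result;
    }

    // True when the encoded list takes fewer bits than one bit per document.
    static boolean smallerThanBitSet(byte[] encoded, int maxDoc) {
        return encoded.length * 8L < maxDoc;
    }
}
```

Each stored document costs at least one byte, which is where the "about 1/8 of the docs"
figure comes from. Larger gaps between document numbers need more bytes per delta, so the
break-even point depends on how the matching documents are distributed.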

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

