lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-328) Some utilities for a compact sparse filter
Date Sat, 15 Oct 2005 15:31:44 GMT
    [ http://issues.apache.org/jira/browse/LUCENE-328?page=comments#action_12332156 ] 

Yonik Seeley commented on LUCENE-328:
-------------------------------------

I've been working a little on a faster version of BitSet.  That's one place where a stateful
iterator implementing nextDocNr() can be faster than the stateless nextDocNr(docNr), so I would
like to see nextDocNr() added.
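
To illustrate the difference, here is a minimal sketch over a long[] bit array. It is not the
attached DocNrSkipper code: the class name, the "-1 means exhausted" convention, and the
"smallest set bit >= docNr" semantics of nextDocNr(docNr) are assumptions made for this example.
The stateful variant keeps the unvisited bits of the current word between calls, so it does not
have to recompute the word index and mask each time.

```java
public class StatefulBitIteratorSketch {
    private final long[] words;   // bit i of words[w] stands for doc number w * 64 + i
    private int wordIndex = 0;
    private long pending;         // bits of the current word that have not been returned yet

    StatefulBitIteratorSketch(long[] words) {
        this.words = words;
        this.pending = words.length > 0 ? words[0] : 0L;
    }

    /** Stateful variant: returns the next set bit, or -1 when exhausted (assumed convention). */
    int nextDocNr() {
        while (pending == 0L) {
            if (++wordIndex >= words.length) {
                return -1;
            }
            pending = words[wordIndex];
        }
        int bit = Long.numberOfTrailingZeros(pending);
        pending &= pending - 1;   // clear the lowest set bit
        return (wordIndex << 6) + bit;
    }

    /** Stateless variant: smallest set bit >= docNr, or -1 (assumed semantics). */
    static int nextDocNr(long[] words, int docNr) {
        int i = docNr >>> 6;
        if (i >= words.length) {
            return -1;
        }
        // Mask off the bits below docNr within its word on every call.
        long word = words[i] & (-1L << (docNr & 63));
        while (word == 0L) {
            if (++i >= words.length) {
                return -1;
            }
            word = words[i];
        }
        return (i << 6) + Long.numberOfTrailingZeros(word);
    }
}
```

Whether the stateful form is actually faster depends on the access pattern. It only helps for
plain sequential iteration, while skipping ahead still needs the nextDocNr(docNr) form.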



> Some utilities for a compact sparse filter
> ------------------------------------------
>
>          Key: LUCENE-328
>          URL: http://issues.apache.org/jira/browse/LUCENE-328
>      Project: Lucene - Java
>         Type: Improvement
>   Components: Search
>     Versions: CVS Nightly - Specify date in submission
>  Environment: Operating System: other
> Platform: Other
>     Reporter: paul.elschot
>     Assignee: Lucene Developers
>     Priority: Minor
>  Attachments: AndDocNrSkipper.java, AndDocNrSkipper.java, BitSetSortedIntList.java, DocNrSkipper.java,
> DocNrSkipper.java, IntArraySortedIntList.java, OrDocNrSkipper.java, OrDocNrSkipper.java, SortedVIntList.java,
> SortedVIntList.java, SortedVIntList.java, TestDocNrSkippers.java, TestDocNrSkippers.java,
> TestSortedVIntList.java, TestSortedVIntList.java, TestSortedVIntList.java, intIterator.java
>
> Two files are attached that might form the basis for an alternative 
> filter implementation that is more memory efficient than one bit 
> per doc when fewer than about 1/8 of the docs pass through the filter. 
>  
> The document numbers are stored in RAM as VInts from the Lucene index 
> format. These VInts encode the difference between two successive 
> document numbers, much like a PositionDelta in the Positions: 
> http://jakarta.apache.org/lucene/docs/fileformats.html 
>  
> The getByteSize() method can be used to verify the compression 
> once a SortedVIntList is constructed. 
> The precise conditions under which this is more memory efficient than 
> one bit per document are not easy to specify in advance.
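
As a rough illustration of the encoding described above, the sketch below stores sorted document
numbers as VInt deltas in a byte array. It is not the attached SortedVIntList; the class and
method names are hypothetical. It only assumes the VInt layout from the Lucene file format:
seven data bits per byte, low-order bits first, with the high bit set when more bytes follow.
Since every delta costs at least one byte, and a bit set costs one bit per document, the
break-even point is near one passing document in eight, provided the deltas stay below 128.

```java
import java.io.ByteArrayOutputStream;

public class DeltaVIntSketch {
    /** Encodes increasing, non-negative doc numbers as VInt-coded deltas. */
    static byte[] encode(int[] sortedDocNrs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int docNr : sortedDocNrs) {
            int delta = docNr - previous;
            // Emit seven bits at a time, low-order first; the high bit marks continuation.
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            previous = docNr;
        }
        return out.toByteArray();
    }

    /** Decodes count doc numbers from bytes produced by encode. */
    static int[] decode(byte[] bytes, int count) {
        int[] docNrs = new int[count];
        int pos = 0;
        int previous = 0;
        for (int i = 0; i < count; i++) {
            int b = bytes[pos++];
            int delta = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
            }
            previous += delta;   // undo the delta coding
            docNrs[i] = previous;
        }
        return docNrs;
    }
}
```

Comparing the length of the encoded array with maxDoc / 8 bytes gives the same kind of check
that getByteSize() is described as providing.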

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

