lucene-dev mailing list archives

From "paul.elschot (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-328) Some utilities for a compact sparse filter
Date Sat, 15 Oct 2005 15:22:45 GMT
    [ http://issues.apache.org/jira/browse/LUCENE-328?page=comments#action_12332155 ] 

paul.elschot commented on LUCENE-328:
-------------------------------------

About adding a nextDocNr() without a current doc argument to DocNrSkipper:
I considered that, but left it out initially to keep DocNrSkipper implementations simple.

The situation is much the same as with Scorer.next() and Scorer.skipTo(docNr), so adding
a nextDocNr() without an argument to DocNrSkipper would fit in well with the rest of Lucene.
If someone can show a real performance advantage from such an addition, it would
be worthwhile to have.
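
To make the comparison with Scorer.next() and Scorer.skipTo(docNr) concrete, here is a minimal sketch of a skipper that offers both forms. It is not the attached DocNrSkipper.java: the class names, the -1 end marker, and the "first doc number at or after the argument" semantics are all assumptions made for illustration.

```java
import java.util.Arrays;

/** Hypothetical sketch; the attached DocNrSkipper.java may differ. */
abstract class DocNrSkipperSketch {
    /** Assumed end marker, returned when no document numbers remain. */
    static final int NO_MORE_DOCS = -1;

    /** Assumed semantics: first remaining doc number >= docNr. */
    abstract int nextDocNr(int docNr);

    /** The proposed addition: the doc number following the last one returned. */
    abstract int nextDocNr();
}

/** Skipper over a sorted int array, for illustration only. */
class IntArraySkipperSketch extends DocNrSkipperSketch {
    private final int[] docs; // sorted ascending, no duplicates
    private int pos = 0;      // index of the next doc number to return

    IntArraySkipperSketch(int[] sortedDocs) {
        this.docs = sortedDocs.clone();
    }

    @Override
    int nextDocNr(int docNr) {
        // Search only the part not yet consumed, so the skipper never moves backwards.
        int i = Arrays.binarySearch(docs, pos, docs.length, docNr);
        pos = (i >= 0) ? i : -i - 1; // a negative result encodes the insertion point
        return nextDocNr();
    }

    @Override
    int nextDocNr() {
        // No search needed: this is where an argument-free call could save work.
        return pos < docs.length ? docs[pos++] : NO_MORE_DOCS;
    }
}
```

In this sketch the argument-free form only advances an index, while the form with an argument has to search. Whether that difference matters in practice would depend on the underlying data structure.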

Regards,
Paul Elschot




> Some utilities for a compact sparse filter
> ------------------------------------------
>
>          Key: LUCENE-328
>          URL: http://issues.apache.org/jira/browse/LUCENE-328
>      Project: Lucene - Java
>         Type: Improvement
>   Components: Search
>     Versions: CVS Nightly - Specify date in submission
>  Environment: Operating System: other
> Platform: Other
>     Reporter: paul.elschot
>     Assignee: Lucene Developers
>     Priority: Minor
>  Attachments: AndDocNrSkipper.java, AndDocNrSkipper.java, BitSetSortedIntList.java, DocNrSkipper.java,
>               DocNrSkipper.java, IntArraySortedIntList.java, OrDocNrSkipper.java, OrDocNrSkipper.java,
>               SortedVIntList.java, SortedVIntList.java, SortedVIntList.java, TestDocNrSkippers.java,
>               TestDocNrSkippers.java, TestSortedVIntList.java, TestSortedVIntList.java,
>               TestSortedVIntList.java, intIterator.java
>
> Two files are attached that might form the basis for an alternative 
> filter implementation that is more memory efficient than one bit 
> per doc when less than about 1/8 of the docs pass through the filter. 
>  
> The document numbers are stored in RAM as VInt's from the Lucene index 
> format. These VInt's encode the difference between two successive 
> document numbers, much like a PositionDelta in the Positions: 
> http://jakarta.apache.org/lucene/docs/fileformats.html 
>  
> The getByteSize() method can be used to verify the compression 
> once a SortedVIntList is constructed. 
> The precise conditions under which this is more memory efficient than 
> one bit per document are not easy to specify in advance.
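
The delta encoding described in the issue text can be sketched as follows. This is not the attached SortedVIntList.java: the class name and every method except the name getByteSize() are hypothetical. The sketch assumes the VInt layout from the Lucene file formats (seven data bits per byte, low-order bits first, high bit set when more bytes follow). Since every stored document number costs at least one byte, that is eight bits, the break-even point against one bit per document lies near 1/8 of the documents.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical sketch of a delta-encoded VInt list; not the attached class. */
class VIntDeltaListSketch {
    private final byte[] bytes; // VInt-encoded differences between successive doc numbers
    private final int size;     // number of doc numbers stored

    VIntDeltaListSketch(int[] sortedDocNrs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int doc : sortedDocNrs) {
            if (doc < prev) {
                throw new IllegalArgumentException("doc numbers must be sorted and non-negative");
            }
            int delta = doc - prev;
            // Emit 7 bits at a time; the high bit marks that another byte follows.
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            prev = doc;
        }
        bytes = out.toByteArray();
        size = sortedDocNrs.length;
    }

    /** Number of bytes used by the encoded doc numbers. */
    int getByteSize() {
        return bytes.length;
    }

    /** Decodes the list back into doc numbers (illustration only). */
    int[] toIntArray() {
        int[] result = new int[size];
        int pos = 0;
        int prev = 0;
        for (int i = 0; i < size; i++) {
            int b = bytes[pos++];
            int delta = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
            }
            prev += delta; // undo the delta encoding
            result[i] = prev;
        }
        return result;
    }

    /** Bytes needed for one bit per document, for comparison. */
    static int bitSetByteSize(int maxDoc) {
        return (maxDoc + 7) / 8;
    }
}
```

Comparing getByteSize() with bitSetByteSize(maxDoc) for a given filter shows which representation is smaller. Large gaps between document numbers need more than one byte each, which is one reason the exact break-even point is hard to state in advance.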

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org