lucene-dev mailing list archives

From "Eks Dev (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-584) Decouple Filter from BitSet
Date Thu, 14 Sep 2006 10:13:24 GMT
    [ http://issues.apache.org/jira/browse/LUCENE-584?page=comments#action_12434637 ] 
            
Eks Dev commented on LUCENE-584:
--------------------------------

Paul,
What is the next step now? We have run enough experiments in our app and are now sure that
this patch causes no incompatibilities.
We also tried replacing our filters with OpenBitSet and VInt matchers, and the results are
more than good: our app showed a crazy 30% speed-up! It is hard to identify exactly where it
comes from, but I suspect that, for BitVectors that are not too dense, the VInt matcher
significantly increased our Filter Cache utilization.
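
To illustrate why a VInt-based representation can help the cache when filters are sparse, here is a minimal sketch. It is not the SortedVIntList attached to this issue; the class name VIntDocIdList and its methods are hypothetical, and it only assumes the usual scheme of storing the gaps between sorted document numbers with 7 payload bits per byte.

```java
import java.io.ByteArrayOutputStream;
import java.util.BitSet;

/** Hypothetical sketch: sorted document numbers stored as VInt-encoded gaps. */
final class VIntDocIdList {
    private final byte[] bytes; // encoded gaps between consecutive doc numbers
    private final int size;     // number of doc numbers stored

    /** Encodes the set bits of a BitSet as variable-length deltas. */
    VIntDocIdList(BitSet bits) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        int count = 0;
        for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
            int delta = doc - prev;
            // 7 payload bits per byte; a set high bit means "more bytes follow".
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            prev = doc;
            count++;
        }
        this.bytes = out.toByteArray();
        this.size = count;
    }

    int size() { return size; }

    int byteSize() { return bytes.length; }

    /** Decodes all doc numbers back into an array, in increasing order. */
    int[] toArray() {
        int[] docs = new int[size];
        int pos = 0;
        int prev = 0;
        for (int i = 0; i < size; i++) {
            int delta = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            prev += delta;
            docs[i] = prev;
        }
        return docs;
    }
}
```

With three matching documents in an index of one million, this list needs a handful of bytes, while a java.util.BitSet with a high bit set needs roughly 125 KB, so far more cached filters fit in the same memory. Whether that explains the speed-up we measured is still only a guess.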

I would propose committing this patch before we go further with something that would actually
utilize Matcher, just to avoid creating a monster patch on top of a patch.

This is the groundwork, and from now on using Matcher will be pure poetry. I see a lot of
places that could benefit from Matchers: ConstantScoringQuery, PrefixFilter, ChainedFilter
(which becomes obsolete now)... actually, we could replace all uses of BitSet with OpenBitSet
(or, a bit smarter, with SortedIntList, VInt...).
Then the question is: do we create a dependency on Solr from Lucene, do we "migrate" OpenBitSet
to Lucene (as this dependency already exists), or do we copy-paste and have two OpenBitSets,
Yonik? As far as I am concerned, it makes no real difference.

Do you, or does anyone else, see anything that needs to be done before committing this?



> Decouple Filter from BitSet
> ---------------------------
>
>                 Key: LUCENE-584
>                 URL: http://issues.apache.org/jira/browse/LUCENE-584
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.0.1
>            Reporter: Peter Schäfer
>            Priority: Minor
>         Attachments: BitsMatcher.java, Filter-20060628.patch, HitCollector-20060628.patch,
>                      IndexSearcher-20060628.patch, MatchCollector.java, Matcher.java,
>                      Matcher20060830b.patch, Scorer-20060628.patch, Searchable-20060628.patch,
>                      Searcher-20060628.patch, Some Matchers.zip, SortedVIntList.java,
>                      TestSortedVIntList.java
>
>
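> {code}
> package org.apache.lucene.search;
>
> import java.io.IOException;
> import org.apache.lucene.index.IndexReader;
>
> public abstract class Filter implements java.io.Serializable
> {
>   // Returns the documents of the given reader that pass this filter.
>   public abstract AbstractBitSet bits(IndexReader reader) throws IOException;
> }
>
> // In its own source file, AbstractBitSet.java:
> public interface AbstractBitSet
> {
>   // True if the bit for document number index is set.
>   public boolean get(int index);
> }
> {code}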
> It would be useful if the method =Filter.bits()= returned an abstract interface, instead of =java.util.BitSet=.
> Use case: there is a very large index, and, depending on the user's privileges, only a small portion of the index is actually visible.
> Sparsely populated =java.util.BitSet=s are not efficient and waste lots of memory. It would be desirable to have an alternative BitSet implementation with smaller memory footprint.
> Though it _is_ possible to derive classes from =java.util.BitSet=, it was obviously not designed for that purpose.
> That's why I propose to use an interface instead. The default implementation could still delegate to =java.util.BitSet=.
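
As a rough illustration of the proposal quoted above, the sketch below shows the interface with a default implementation that delegates to java.util.BitSet, next to one possible sparse alternative. It is not taken from any of the attached patches: the class names JavaUtilBitSetAdapter and SortedIntArrayBitSet are hypothetical, and the sorted-array approach is just one assumption about what a smaller-footprint implementation could look like.

```java
import java.util.Arrays;
import java.util.BitSet;

/** The interface from the issue description: read-only access by document number. */
interface AbstractBitSet {
    boolean get(int index);
}

/** Default implementation that simply delegates to java.util.BitSet. */
final class JavaUtilBitSetAdapter implements AbstractBitSet {
    private final BitSet delegate;

    JavaUtilBitSetAdapter(BitSet delegate) {
        this.delegate = delegate;
    }

    public boolean get(int index) {
        return delegate.get(index);
    }
}

/** Hypothetical sparse alternative: a sorted array of the visible doc numbers. */
final class SortedIntArrayBitSet implements AbstractBitSet {
    private final int[] sortedDocs;

    SortedIntArrayBitSet(int[] docs) {
        // Copy and sort so that lookups can use binary search.
        this.sortedDocs = docs.clone();
        Arrays.sort(this.sortedDocs);
    }

    public boolean get(int index) {
        return Arrays.binarySearch(sortedDocs, index) >= 0;
    }
}
```

A caller that only uses get(int) cannot tell the two apart. The sparse version costs 4 bytes per visible document instead of one bit per document in the index, at the price of a binary search on each lookup.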

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

