lucene-dev mailing list archives

From eks dev <eks...@yahoo.co.uk>
Subject Re: [jira] Commented: (LUCENE-584) Decouple Filter from BitSet
Date Tue, 10 Apr 2007 15:41:29 GMT

If I remember correctly, the last time we profiled search with "high density" OR queries, scoring
was taking up to 30% of the time. This was a collection of 8 million short documents fitting
comfortably in RAM. So I am sure disabling scoring in some cases could bring us something.

I am not familiar enough with the inner workings of scoring to stand 100% behind this statement,
so please take it with a grain of salt.

But anyhow, with Matcher in place, we at least have a chance to prove it brings something
for this scenario. For the filtering case it definitely brings a lot.

On another note:
Paul, would it be possible/easy to have something like the following? It looks easy to add,
but I may be missing something:
BooleanQuery.add(Matcher mtr,
    BooleanClause.Occur occur)
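
To make the idea concrete, here is a minimal sketch of OR-ing document streams without any scoring. It is not the Matcher API from the patches attached to LUCENE-584; `DocMatcher`, `ArrayMatcher`, `OrMatcher` and `MatcherDemo` are hypothetical names used only for illustration, and doc ids are assumed to arrive in increasing order.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the Matcher idea: iterates doc ids and has no score(). */
interface DocMatcher {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Advances to the next matching doc id, or returns NO_MORE_DOCS when exhausted. */
    int next();
}

/** Matcher over an array of doc ids that is assumed to be sorted ascending. */
final class ArrayMatcher implements DocMatcher {
    private final int[] docs;
    private int pos = -1;

    ArrayMatcher(int[] sortedDocs) { this.docs = sortedDocs; }

    @Override public int next() {
        pos++;
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }
}

/** OR of two matchers: merges the two doc id streams, with no floating-point work. */
final class OrMatcher implements DocMatcher {
    private final DocMatcher a, b;
    private int docA, docB;

    OrMatcher(DocMatcher a, DocMatcher b) {
        this.a = a;
        this.b = b;
        docA = a.next();
        docB = b.next();
    }

    @Override public int next() {
        int doc = Math.min(docA, docB);
        if (doc == NO_MORE_DOCS) return NO_MORE_DOCS;
        // Advance every side that is positioned on the doc being returned,
        // so a doc present in both streams is reported once.
        if (docA == doc) docA = a.next();
        if (docB == doc) docB = b.next();
        return doc;
    }
}

final class MatcherDemo {
    /** Drains a matcher into a list of doc ids. */
    static List<Integer> collect(DocMatcher m) {
        List<Integer> out = new ArrayList<>();
        for (int d = m.next(); d != DocMatcher.NO_MORE_DOCS; d = m.next()) out.add(d);
        return out;
    }
}
```

A clause added this way would only decide whether a document matches; whether and how it could be mixed with scored clauses inside BooleanQuery is exactly the open question above.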



----- Original Message ----
From: Otis Gospodnetic (JIRA) <jira@apache.org>
To: java-dev@lucene.apache.org
Sent: Tuesday, 10 April, 2007 5:11:32 PM
Subject: [jira] Commented: (LUCENE-584) Decouple Filter from BitSet


    [ https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487789 ]

Otis Gospodnetic commented on LUCENE-584:
-----------------------------------------

Ah, too bad. :(
Last time I benchmarked Lucene searching on Sun's Niagara vs. non-massive Intel boxes, Intel
boxes with Linux on them actually won, and my impression was that this was due to Niagara's
weak FPU (a known weakness in Niagara, I believe).  Thus, I thought, if we could just skip
scoring and various floating point calculations, we'd see better performance, esp. on Niagara
boxes.

Paul, when you say "fastest cache", what exactly are you referring to?  The Niagara I tested
things on had 32GB of RAM, and I gave the JVM 20+GB, so at least the JVM had plenty of RAM
to work with.


> Decouple Filter from BitSet
> ---------------------------
>
>                 Key: LUCENE-584
>                 URL: https://issues.apache.org/jira/browse/LUCENE-584
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.0.1
>            Reporter: Peter Schäfer
>            Priority: Minor
>         Attachments: bench-diff.txt, bench-diff.txt, BitsMatcher.java, Filter-20060628.patch,
>                      HitCollector-20060628.patch, IndexSearcher-20060628.patch, MatchCollector.java,
>                      Matcher.java, Matcher20070226.patch, Scorer-20060628.patch,
>                      Searchable-20060628.patch, Searcher-20060628.patch, Some Matchers.zip,
>                      SortedVIntList.java, TestSortedVIntList.java
>
>
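> {code}
> package org.apache.lucene.search;
>
> import java.io.IOException;
> import org.apache.lucene.index.IndexReader;
>
> public abstract class Filter implements java.io.Serializable 
> {
>   // Returns the set of documents that pass the filter for the given reader.
>   public abstract AbstractBitSet bits(IndexReader reader) throws IOException;
> }
> public interface AbstractBitSet 
> {
>   // True if the document with this index is in the set.
>   public boolean get(int index);
> }
> {code}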
> It would be useful if the method =Filter.bits()= returned an abstract interface, instead
> of =java.util.BitSet=.
> Use case: there is a very large index, and, depending on the user's privileges, only
> a small portion of the index is actually visible.
> Sparsely populated =java.util.BitSet=s are not efficient and waste lots of memory. It
> would be desirable to have an alternative BitSet implementation with a smaller memory footprint.
> Though it _is_ possible to derive classes from =java.util.BitSet=, it was obviously not
> designed for that purpose.
> That's why I propose to use an interface instead. The default implementation could still
> delegate to =java.util.BitSet=.
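
The proposal above can be sketched roughly as follows. This is only an illustration under assumptions: `AbstractBitSet` is taken from the description, while `DenseBits` and `SortedIntBits` are hypothetical names, and the sparse variant is a plain sorted `int[]` with binary search, not the SortedVIntList attached to the issue.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Read-only view of a document set, as proposed in the issue description. */
interface AbstractBitSet {
    boolean get(int index);
}

/** Default implementation (hypothetical name): delegates to java.util.BitSet. */
final class DenseBits implements AbstractBitSet {
    private final BitSet bits;

    DenseBits(BitSet bits) { this.bits = bits; }

    @Override public boolean get(int index) { return bits.get(index); }
}

/** Sparse alternative (hypothetical name): sorted doc ids looked up by binary search. */
final class SortedIntBits implements AbstractBitSet {
    private final int[] sortedDocs;

    SortedIntBits(int[] docs) {
        // Copy and sort so callers may pass doc ids in any order.
        int[] copy = docs.clone();
        Arrays.sort(copy);
        this.sortedDocs = copy;
    }

    @Override public boolean get(int index) {
        // binarySearch returns a non-negative position only when the key is present.
        return Arrays.binarySearch(sortedDocs, index) >= 0;
    }
}
```

As a rough guide, a BitSet covering N documents needs about N/8 bytes however few bits are set, while the sorted array needs 4 bytes per visible document, so the array is smaller when fewer than about 1 in 32 documents are visible. The price is an O(log n) lookup instead of O(1).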

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org









