lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Uwe Schindler (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-1536) if a filter can support random access API, we should use it
Date Mon, 10 Oct 2011 14:45:29 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13124138#comment-13124138
] 

Uwe Schindler commented on LUCENE-1536:
---------------------------------------

bq. one bug is that FilteredQuery in the patch runs some heuristics per segment which determine
how the booleans get set that drive the BS1 versus B2 decision.

How can BS1 and BS2 return different scores, this would be a bug? Theoretically it should
be possible to have one segment with BS1 the other one with BS2.

By the way: That was not different without FilteredQuery in the older patches.

Of course the selection of the right scorer based on out of order should be done based on
scoresDocOutOfOrder returned by the weight. This is a bug in FilteredQuery#Weight. But easy
to fix.

By the way: This was also not different without FilteredQuery in the older patches.
                
> if a filter can support random access API, we should use it
> -----------------------------------------------------------
>
>                 Key: LUCENE-1536
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1536
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/search
>    Affects Versions: 2.4
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>             Fix For: 4.0
>
>         Attachments: CachedFilterIndexReader.java, LUCENE-1536-rewrite.patch, LUCENE-1536-rewrite.patch,
LUCENE-1536-rewrite.patch, LUCENE-1536-rewrite.patch, LUCENE-1536-rewrite.patch, LUCENE-1536-rewrite.patch,
LUCENE-1536-rewrite.patch, LUCENE-1536-rewrite.patch, LUCENE-1536.patch, LUCENE-1536.patch,
LUCENE-1536.patch, LUCENE-1536.patch, LUCENE-1536.patch, LUCENE-1536.patch, LUCENE-1536.patch,
LUCENE-1536.patch, LUCENE-1536.patch, LUCENE-1536.patch, LUCENE-1536.patch, LUCENE-1536.patch,
LUCENE-1536.patch, LUCENE-1536.patch, LUCENE-1536.patch, LUCENE-1536.patch, LUCENE-1536.patch,
LUCENE-1536.patch, LUCENE-1536.patch, LUCENE-1536.patch, LUCENE-1536.patch, LUCENE-1536.patch,
LUCENE-1536.patch, LUCENE-1536.patch, LUCENE-1536.patch, LUCENE-1536.patch, changes-yonik-uwe.patch,
luceneutil.patch
>
>
> I ran some performance tests, comparing applying a filter via
> random-access API instead of current trunk's iterator API.
> This was inspired by LUCENE-1476, where we realized deletions should
> really be implemented just like a filter, but then in testing found
> that switching deletions to iterator was a very sizable performance
> hit.
> Some notes on the test:
>   * Index is first 2M docs of Wikipedia.  Test machine is Mac OS X
>     10.5.6, quad core Intel CPU, 6 GB RAM, java 1.6.0_07-b06-153.
>   * I test across multiple queries.  1-X means an OR query, eg 1-4
>     means 1 OR 2 OR 3 OR 4, whereas +1-4 is an AND query, ie 1 AND 2
>     AND 3 AND 4.  "u s" means "united states" (phrase search).
>   * I test with multiple filter densities (0, 1, 2, 5, 10, 25, 75, 90,
>     95, 98, 99, 99.99999 (filter is non-null but all bits are set),
>     100 (filter=null, control)).
>   * Method high means I use random-access filter API in
>     IndexSearcher's main loop.  Method low means I use random-access
>     filter API down in SegmentTermDocs (just like deleted docs
>     today).
>   * Baseline (QPS) is current trunk, where filter is applied as iterator up
>     "high" (ie in IndexSearcher's search loop).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message