lucene-dev mailing list archives

From "Robert Muir (Issue Comment Edited) (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (LUCENE-3533) Nuke SpanFilters and CachingSpanFilter (maybe move to sandbox)
Date Sat, 12 Nov 2011 00:22:51 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148888#comment-13148888 ]

Robert Muir edited comment on LUCENE-3533 at 11/12/11 12:22 AM:
----------------------------------------------------------------

It would be good to get a review on the patch: I think it's OK in general.

It removes a lot of stupidity from the spans, except for one case:

The SpanMultiTermQueryWrapper is still not single-pass (it simply throws all TermContexts
away).

I thought about how to solve that one too, and I'm convinced it's unfixable
because SpanQueries aren't really query trees: it's just one query that
calls extractTerms on everything underneath it.

For this reason, even if I made this MTQ single-pass by allowing TermContexts
to be passed to e.g. SpanOrQuery, that would work on its own, but if you had that query inside
another SpanQuery then it would still do the extra seek like it does now.
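
To make the extra seek concrete, here is a toy sketch in plain Java. It does not use Lucene and does not reproduce any Lucene API: ToyTermDict, ToySpanQuery, ToySpanTerm, ToySpanOr, rewritePrefix and lookupsForTopLevel are all hypothetical names made up for illustration. It only models the shape of the problem described above: the multi-term expansion visits each matching term once, drops whatever it learned, and the outermost query then flattens everything via extractTerms and looks each term up again.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SpanSeekSketch {

    /** Hypothetical stand-in for a terms dictionary; it only counts lookups. */
    static final class ToyTermDict {
        private final TreeSet<String> terms = new TreeSet<>();
        int lookups = 0;

        ToyTermDict(String... ts) {
            for (String t : ts) terms.add(t);
        }

        /** Look up one exact term; this is the work we would like to do only once. */
        boolean seekExact(String term) {
            lookups++;
            return terms.contains(term);
        }

        /** Enumerate all terms with the given prefix; each visited term counts as one lookup. */
        List<String> enumeratePrefix(String prefix) {
            List<String> out = new ArrayList<>();
            for (String t : terms.tailSet(prefix)) {
                if (!t.startsWith(prefix)) break;
                lookups++;
                out.add(t);
            }
            return out;
        }
    }

    /** Toy span query: the only operation modelled is collecting its terms. */
    interface ToySpanQuery {
        void extractTerms(Set<String> into);
    }

    static final class ToySpanTerm implements ToySpanQuery {
        final String term;
        ToySpanTerm(String term) { this.term = term; }
        public void extractTerms(Set<String> into) { into.add(term); }
    }

    static final class ToySpanOr implements ToySpanQuery {
        final List<ToySpanQuery> clauses;
        ToySpanOr(List<ToySpanQuery> clauses) { this.clauses = clauses; }
        public void extractTerms(Set<String> into) {
            for (ToySpanQuery c : clauses) c.extractTerms(into);
        }
    }

    /** Expand a prefix into an OR of term queries; any per-term state seen here is thrown away. */
    static ToySpanQuery rewritePrefix(ToyTermDict dict, String prefix) {
        List<ToySpanQuery> clauses = new ArrayList<>();
        for (String t : dict.enumeratePrefix(prefix)) clauses.add(new ToySpanTerm(t));
        return new ToySpanOr(clauses);
    }

    /** The outermost query flattens the whole tree and looks every term up again. */
    static int lookupsForTopLevel(ToyTermDict dict, ToySpanQuery top) {
        Set<String> all = new LinkedHashSet<>();
        top.extractTerms(all);
        for (String t : all) dict.seekExact(t);
        return dict.lookups;
    }

    public static void main(String[] args) {
        ToyTermDict dict = new ToyTermDict("span", "spans", "spam", "query");
        ToySpanQuery expanded = rewritePrefix(dict, "spa");  // first pass: 3 lookups
        List<ToySpanQuery> outerClauses = new ArrayList<>();
        outerClauses.add(expanded);
        outerClauses.add(new ToySpanTerm("query"));
        ToySpanQuery outer = new ToySpanOr(outerClauses);     // expansion nested in another span query
        System.out.println(lookupsForTopLevel(dict, outer));  // second pass adds 4 more lookups
    }
}
```

In this toy model, handing the first-pass state to the inner ToySpanOr would not help by itself, because the lookups in the second pass are issued by whichever query sits at the top, not by the inner one.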

But still, with the patch spans are a little better.
                
      was (Author: rcmuir):
    It would be good to get a review on the patch: I think it's OK in general.

It removes a lot of stupidity from the spans, except for one case:

The SpanMultiTermQueryWrapper is still not single-pass (it simply throws all TermContexts
away).

I thought about how to solve that one too, and I'm convinced it's unfixable
because SpanQueries aren't really query trees: it's just one query that
calls rewriteTerms on everything underneath it.

For this reason, even if I made this MTQ single-pass by allowing TermContexts
to be passed to e.g. SpanOrQuery, that would work on its own, but if you had that query inside
another SpanQuery then it would still do the extra seek like it does now.

But still, with the patch spans are a little better.
                  
> Nuke SpanFilters and CachingSpanFilter (maybe move to sandbox)
> --------------------------------------------------------------
>
>                 Key: LUCENE-3533
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3533
>             Project: Lucene - Java
>          Issue Type: Task
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>         Attachments: LUCENE-3533.patch
>
>
> SpanFilters are inefficient and OOM easily (they don't scale at all: they create large Lists
> of Objects for every match, and filtering deleted docs is also a pain). After some talks with Grant
> at Eurocon, and given that caching of them is still broken in 3.x (but fixed on trunk),
> I assume nobody uses them, so let's nuke them. They are also in the wrong package, so the standard
> statement applies: "Die, SpanFilters, die!"


