lucene-dev mailing list archives

From "Anshum Gupta (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-5986) Don't allow runaway queries from harming Solr cluster health or search performance
Date Thu, 14 Aug 2014 23:49:19 GMT

     [ https://issues.apache.org/jira/browse/SOLR-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anshum Gupta updated SOLR-5986:
-------------------------------

    Attachment: SOLR-5986.patch

Working on adding more tests and deciding where the ExitableReader actually belongs.
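
For readers without the patch at hand, the general idea of a reader that can bail out mid-enumeration can be illustrated roughly as follows. This is only a minimal sketch in plain Java and is not the code from the attached patch: DeadlineIterator, TimeExceededException and countTerms are hypothetical names made up for this illustration, and a plain Iterator stands in for term enumeration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ExitableIterationSketch {

    /** Hypothetical exception, thrown when the time budget is exhausted. */
    static class TimeExceededException extends RuntimeException {
        TimeExceededException(String msg) { super(msg); }
    }

    /** Wraps any iterator and checks a deadline before every step. */
    static class DeadlineIterator<T> implements Iterator<T> {
        private final Iterator<T> delegate;
        private final long deadlineNanos;

        DeadlineIterator(Iterator<T> delegate, long timeoutNanos) {
            this.delegate = delegate;
            this.deadlineNanos = System.nanoTime() + timeoutNanos;
        }

        private void checkDeadline() {
            // Compare by subtraction so that nanoTime wrap-around is handled.
            if (System.nanoTime() - deadlineNanos >= 0) {
                throw new TimeExceededException("time budget exhausted during enumeration");
            }
        }

        @Override public boolean hasNext() { checkDeadline(); return delegate.hasNext(); }
        @Override public T next() { checkDeadline(); return delegate.next(); }
    }

    /** Counts the terms it manages to visit; returns -1 if the budget ran out. */
    static int countTerms(Iterator<String> terms, long timeoutNanos) {
        Iterator<String> guarded = new DeadlineIterator<>(terms, timeoutNanos);
        int n = 0;
        try {
            while (guarded.hasNext()) {
                guarded.next();
                n++;
            }
        } catch (TimeExceededException e) {
            return -1; // the caller decides how to report a partial result
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("solr", "lucene", "blur");
        System.out.println(countTerms(terms.iterator(), 10_000_000_000L)); // 10 s budget: prints 3
        System.out.println(countTerms(terms.iterator(), 0L));              // zero budget: prints -1
    }
}
```

Checking the clock inside the enumeration itself means the work stops at the next step after the deadline, no matter which caller started it.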

> Don't allow runaway queries from harming Solr cluster health or search performance
> ----------------------------------------------------------------------------------
>
>                 Key: SOLR-5986
>                 URL: https://issues.apache.org/jira/browse/SOLR-5986
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Steve Davids
>            Assignee: Anshum Gupta
>            Priority: Critical
>             Fix For: 4.10
>
>         Attachments: SOLR-5986.patch
>
>
> The intent of this ticket is to have all distributed search requests stop wasting CPU
cycles on requests that have already timed out or are so complicated that they won't be able
to finish executing. We have come across a case where a nasty wildcard query within a proximity
clause caused the cluster to enumerate terms for hours even though the query timeout was set
to minutes. This caused a noticeable slowdown within the system, which made us restart the
replicas that happened to service that one request. In the worst case, users with
a relatively low zk timeout value will have nodes start dropping from the cluster due to long
GC pauses.
> [~amccurry] built a mechanism into Apache Blur to help with the issue in BLUR-142 (see the
commit comment for code, though look at the latest code on the trunk for newer bug fixes).
> Solr should be able to either prevent these problematic queries from running by some
heuristic (possibly estimated size of heap usage) or be able to execute a thread interrupt
on all query threads once the time threshold is met. This issue mirrors what others have discussed
on the mailing list: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3C856ac15f0903272054q2dbdbd19kea3c5ba9e105b9d8@mail.gmail.com%3E
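
The second option in the description, interrupting query threads once the time threshold is met, could look roughly like the sketch below. It uses only java.util.concurrent and is not Solr code: runWithTimeout and runawayQuery are hypothetical names for this illustration. It also assumes the query code checks its interrupt flag, because in Java an interrupt is only a request and a loop that never checks it will keep running.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class InterruptOnTimeoutSketch {

    /** Stand-in for a runaway query: spins until its thread is interrupted. */
    static String runawayQuery() throws InterruptedException {
        long steps = 0;
        while (!Thread.currentThread().isInterrupted()) {
            steps++; // pretend to enumerate one more term
        }
        throw new InterruptedException("query cancelled after " + steps + " steps");
    }

    /** Runs the query on a worker thread and interrupts it once the time threshold is met. */
    static <T> T runWithTimeout(Callable<T> query, long timeoutMillis, T fallback) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<T> future = pool.submit(query);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // true = interrupt the worker thread
            return fallback;
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the query itself failed
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWithTimeout(() -> "fast", 1000, "timed out"));                            // prints fast
        System.out.println(runWithTimeout(InterruptOnTimeoutSketch::runawayQuery, 100, "timed out")); // prints timed out
    }
}
```

The caller gets its answer (or the fallback) at the threshold either way; whether the CPU is actually freed depends on the worker noticing the interrupt, which is why the check has to live inside the expensive loop.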



--
This message was sent by Atlassian JIRA
(v6.2#6252)


