lucene-solr-dev mailing list archives

From "Martijn van Groningen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-1143) Return partial results when a connection to a shard is refused
Date Wed, 26 Aug 2009 20:33:59 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748128#action_12748128 ]

Martijn van Groningen commented on SOLR-1143:
---------------------------------------------

Sorry for my confusing comment. I meant to say that takeOrError() returns immediately when
an exception occurs. To avoid more confusion, I will sketch a situation, based on my current
understanding of the code, that shows why takeOrError() should not be used when returning
partial results.

In each stage, a number of requests may be sent to the shards, and a number of responses may
be returned from the shards for further processing.
Let's say we have three shards and, in a certain stage, we send a shard request to all three.
If the first response contains an error, the current behaviour is to return that response
immediately, without adding the two other responses (which did return without an error). Because
of this, the so-called partial result might contain less data than it could, or even nothing.
Therefore I think take() should be used there. I think takeOrError() is only suitable when
partial results are not in use.

{code:java}
ShardResponse takeCompletedOrError() {
    while (pending.size() > 0) {
      try {
        Future<ShardResponse> future = completionService.take();
        pending.remove(future);
        ShardResponse rsp = future.get();
        // now we return, and if there are more pending results, we lose them
        if (rsp.getException() != null) return rsp;
        ...............
        rsp.getShardRequest().responses.add(rsp);
        if (rsp.getShardRequest().responses.size() == rsp.getShardRequest().actualShards.length) {
          return rsp;
        }
      } catch (InterruptedException e) {
      ......
      }
    }
    return null;
}
{code}

Again, this is what I understand from the code. What do you think about this?
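
To make the difference concrete, here is a minimal, self-contained sketch of the "drain everything" behaviour I have in mind. It is not Solr code: FakeShardResponse, PartialResultsSketch, collectSuccessful() and runStage() are hypothetical names made up for this illustration, and the three "shards" are simulated with plain tasks. The only real API used is java.util.concurrent's ExecutorCompletionService.

```java
import java.net.ConnectException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a shard response; not Solr's ShardResponse.
final class FakeShardResponse {
    final String shard;
    final Exception error; // null when the shard answered normally

    FakeShardResponse(String shard, Exception error) {
        this.shard = shard;
        this.error = error;
    }
}

final class PartialResultsSketch {

    // Waits for every pending response and keeps the successful ones,
    // instead of returning as soon as one response carries an exception.
    static List<FakeShardResponse> collectSuccessful(
            CompletionService<FakeShardResponse> service, int pending)
            throws InterruptedException, ExecutionException {
        List<FakeShardResponse> ok = new ArrayList<>();
        for (int i = 0; i < pending; i++) {
            FakeShardResponse rsp = service.take().get();
            if (rsp.error == null) {
                ok.add(rsp);
            }
        }
        return ok;
    }

    // Simulates one stage sent to three shards, where shard1 refuses the connection.
    static List<FakeShardResponse> runStage() {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            CompletionService<FakeShardResponse> service =
                    new ExecutorCompletionService<>(pool);
            service.submit(() -> new FakeShardResponse("shard1", new ConnectException("refused")));
            service.submit(() -> new FakeShardResponse("shard2", null));
            service.submit(() -> new FakeShardResponse("shard3", null));
            return collectSuccessful(service, 3);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Arrival order depends on scheduling, so the print order may vary.
        for (FakeShardResponse rsp : runStage()) {
            System.out.println("got response from " + rsp.shard);
        }
    }
}
```

With a fail-fast loop, the error from shard1 could be returned before the responses from shard2 and shard3 were ever collected. Draining all pending futures keeps both successful responses in the partial result.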

I also did some more thinking about how to improve the handling of shard failures. Currently,
if a shard fails in an early stage of the distributed search, we keep sending requests to that
shard in later stages, although we already noticed that it was not responding. Do you think it
would be a good idea to mark such a shard as failed, so that it is no longer used for the
currently running search?
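
As a rough sketch of that idea, assuming one tracker object is created per distributed search and is only touched by the thread that coordinates the stages (it is not thread-safe): FailedShardTracker and its methods are hypothetical names for this illustration and do not exist in Solr.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical per-search tracker; Solr has no class with this name.
final class FailedShardTracker {
    private final Set<String> failed = new HashSet<>();

    // Called when a shard did not respond in some stage.
    void markFailed(String shard) {
        failed.add(shard);
    }

    boolean isFailed(String shard) {
        return failed.contains(shard);
    }

    // Returns the shards a later stage should still contact, in their original order.
    List<String> activeShards(List<String> allShards) {
        List<String> active = new ArrayList<>();
        for (String shard : allShards) {
            if (!failed.contains(shard)) {
                active.add(shard);
            }
        }
        return active;
    }
}

final class FailedShardTrackerDemo {
    public static void main(String[] args) {
        List<String> shards = Arrays.asList("shard1", "shard2", "shard3");
        FailedShardTracker tracker = new FailedShardTracker();

        // Stage 1: shard2 refused the connection.
        tracker.markFailed("shard2");

        // Stage 2: only the shards that have not failed are contacted.
        System.out.println(tracker.activeShards(shards));
    }
}
```

Since the tracker lives only for the duration of one search, a shard that comes back up would be tried again by the next search.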

> Return partial results when a connection to a shard is refused
> --------------------------------------------------------------
>
>                 Key: SOLR-1143
>                 URL: https://issues.apache.org/jira/browse/SOLR-1143
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Nicolas Dessaigne
>            Assignee: Grant Ingersoll
>             Fix For: 1.4
>
>         Attachments: SOLR-1143-2.patch, SOLR-1143-3.patch, SOLR-1143.patch
>
>
> If any shard is down in a distributed search, a ConnectException is thrown.
> Here's a little patch that changes this behaviour: if we can't connect to a shard (ConnectException),
> we get partial results from the active shards. As with the TimeOut parameter (https://issues.apache.org/jira/browse/SOLR-502),
> we set the parameter "partialResults" to true.
> This patch also addresses a problem expressed in the mailing list about a year ago (http://www.nabble.com/partialResults,-distributed-search---SOLR-502-td19002610.html)
> We have a use case that needs this behaviour and we would like to know your thoughts
> about such a behaviour. Should it be the default behaviour for distributed search?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

