lucene-solr-dev mailing list archives

From "Mark Miller (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (SOLR-1277) Implement a Solr specific naming service (using Zookeeper)
Date Wed, 16 Dec 2009 16:39:18 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791442#action_12791442 ]

Mark Miller edited comment on SOLR-1277 at 12/16/09 4:38 PM:
-------------------------------------------------------------

Yeah, I'm not trying to tackle node selection yet - just client timeouts. But if a client
is going to be periodically updating a node to state that it's still in good shape, it seems
like it might as well include its current load in that update. Not that it couldn't easily be
added later - I mostly threw that in because it was part of the previous recommendation on
how to handle client timeouts.
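
Roughly what I have in mind for that periodic update - just a sketch, the znode path and the
load figure are made up, not from the patch:

{code:java}
import java.util.Timer;
import java.util.TimerTask;
import org.apache.zookeeper.ZooKeeper;

public class NodeHeartbeat {
  private final ZooKeeper zk;
  private final String nodePath; // e.g. a per-node znode like /solr_domain/nodes/host_port (hypothetical layout)

  public NodeHeartbeat(ZooKeeper zk, String nodePath) {
    this.zk = zk;
    this.nodePath = nodePath;
  }

  // Rewrite the znode data periodically so other nodes can see we're still alive,
  // and piggyback a simple load figure on the same write.
  public void start(long periodMs) {
    new Timer(true).schedule(new TimerTask() {
      public void run() {
        try {
          double load = java.lang.management.ManagementFactory
              .getOperatingSystemMXBean().getSystemLoadAverage();
          zk.setData(nodePath, Double.toString(load).getBytes("UTF-8"), -1); // -1 = any version
        } catch (Exception e) {
          // connection loss, session expiry, etc. - the next tick just tries again
        }
      }
    }, 0, periodMs);
  }
}
{code}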

I don't necessarily like the idea of all of the nodes updating all the time just to note their
existence, but from what I can gather it seems like our best option right now. Otherwise, nodes
will be timing out all the time - and handling the reconnection seems like a pain - if Solr
needs something from ZooKeeper after a GC pause ends, it's going to have to block and wait for
the reconnect. Or, I guess, build a timed retry into every ZooKeeper request?
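
Something like this is what I mean by a timed retry - purely illustrative, the retry count and
sleep are arbitrary:

{code:java}
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class RetryingZk {
  private final ZooKeeper zk;

  public RetryingZk(ZooKeeper zk) {
    this.zk = zk;
  }

  // Example for getData: retry around ConnectionLossException so a request made
  // right after a long GC pause waits out the reconnect instead of failing outright.
  public byte[] getData(String path, int retries, long sleepMs)
      throws KeeperException, InterruptedException {
    for (int attempt = 0; ; attempt++) {
      try {
        return zk.getData(path, false, new Stat());
      } catch (KeeperException.ConnectionLossException e) {
        if (attempt >= retries) throw e; // give up after the last retry
        Thread.sleep(sleepMs);           // give the client time to reconnect
      }
    }
  }
}
{code}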

My main concern at the moment is coming up with a plan for these timeouts, though. If we raise
the timeout limits, we need another method for determining that a node is down.

I suppose another option might be that it's up to a node that can't reach another node to tag
it as unresponsive?
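
If we went that route, the tagging could be as simple as something like this - the
/solr_domain/unreachable path is hypothetical, just one way it could look:

{code:java}
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.ZooDefs.Ids;

public class UnreachableTagger {
  private final ZooKeeper zk;

  public UnreachableTagger(ZooKeeper zk) {
    this.zk = zk;
  }

  // Called by a node that failed to reach 'nodeName'. The tag is ephemeral, so it
  // goes away on its own if the reporting node itself drops off. Assumes the
  // parent path /solr_domain/unreachable already exists.
  public void tagUnresponsive(String nodeName) throws KeeperException, InterruptedException {
    String path = "/solr_domain/unreachable/" + nodeName;
    try {
      zk.create(path, new byte[0], Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
    } catch (KeeperException.NodeExistsException e) {
      // someone else already reported it - that's fine
    }
  }
}
{code}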

> Implement a Solr specific naming service (using Zookeeper)
> ----------------------------------------------------------
>
>                 Key: SOLR-1277
>                 URL: https://issues.apache.org/jira/browse/SOLR-1277
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 1.5
>
>         Attachments: log4j-1.2.15.jar, SOLR-1277.patch, SOLR-1277.patch, SOLR-1277.patch,
SOLR-1277.patch, zookeeper-3.2.1.jar
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> The goal is to give Solr server clusters self-healing attributes
> where if a server fails, indexing and searching don't stop and
> all of the partitions remain searchable. For configuration, the
> ability to centrally deploy a new configuration without servers
> going offline.
> We can start with basic failover and go from there?
> Features:
> * Automatic failover (i.e. when a server fails, clients stop
> trying to index to or search it)
> * Centralized configuration management (i.e. new solrconfig.xml
> or schema.xml propagates to a live Solr cluster)
> * Optionally allow shards of a partition to be moved to another
> server (i.e. if a server gets hot, move the hot segments out to
> cooler servers). Ideally we'd have a way to detect hot segments
> and move them seamlessly. With NRT this becomes somewhat more
> difficult but not impossible?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

