lucene-dev mailing list archives

From "Jamie Johnson (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2765) Shard/Node states
Date Sun, 09 Oct 2011 19:45:29 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123761#comment-13123761 ]

Jamie Johnson commented on SOLR-2765:
-------------------------------------

Phew, don't pay attention for a few hours and there's a lot to read...

So let me summarize what needs to be done at this point.

We have:
/live_nodes - which does not change
/collections - which does not change (although I'm not completely sure what we'll use it for now)
/cloudstate - which contains the current cluster state

/cloudstate is updated optimistically by the individual nodes: each write passes the znode version it read, and if that version check fails we re-read and retry until the write succeeds against the current version.
/live_nodes is the same as it is now.
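To make the optimistic part concrete, here's a rough Python sketch of the read-modify-write retry loop I have in mind. This is not the actual Solr/ZooKeeper code; `VersionedStore` is a stand-in for a znode (ZooKeeper's conditional setData throws BadVersion the same way), and all names here are made up for illustration:

```python
class BadVersionError(Exception):
    """Raised when the expected version no longer matches (like ZooKeeper's BadVersion)."""


class VersionedStore:
    """Stand-in for a znode: data plus a version that bumps on every write."""

    def __init__(self, data):
        self.data = data
        self.version = 0

    def read(self):
        return self.data, self.version

    def compare_and_set(self, new_data, expected_version):
        if expected_version != self.version:
            raise BadVersionError()
        self.data = new_data
        self.version += 1


def update_cloudstate(store, apply_change, max_retries=10):
    """Optimistic update: read the state and its version, apply the change,
    and write back conditioned on the version we read; on conflict, retry."""
    for _ in range(max_retries):
        state, version = store.read()
        try:
            store.compare_and_set(apply_change(state), version)
            return True
        except BadVersionError:
            continue  # another node updated /cloudstate first; re-read and retry
    return False
```

So a node that wants to record its replica status just keeps looping until its write lands on the version it last saw.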

So in my mind I'm not sure why we need the /collections instance anymore.  If we maintain
the state of the cluster in /cloudstate and, as Mark mentioned, don't remove the nodes that
have gone down but just update their status, it seems like we should be able to do what we
want.  Admittedly I have not gone through Ted's comment in detail, so perhaps there is
a nugget in there that I am missing.

I have the basics of this running locally, except for the status of each replica in the cloudstate.
I'm assuming I can leverage the leader logic that currently exists and add another watcher
on /live_nodes to update the state when an instance goes down; otherwise the replica itself
is responsible for updating its own state (I believe that's what was said).
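The leader-side piece, as I picture it (again just a sketch, not existing code): when the /live_nodes children change, diff the previous and current node lists and flip to DEAD any replica hosted on a node that vanished. The nested-dict shape below mirrors the cloudstate XML further down:

```python
def mark_dead_replicas(cloudstate, previous_live, current_live):
    """On a /live_nodes change, mark replicas on vanished nodes as DEAD.

    cloudstate shape (illustrative, mirroring the XML below):
    {collection: {shard: {replica_name: {"node_name": ..., "status": ...}}}}
    """
    gone = set(previous_live) - set(current_live)
    for collection in cloudstate.values():
        for shard in collection.values():
            for replica in shard.values():
                if replica["node_name"] in gone and replica["status"] == "ALIVE":
                    replica["status"] = "DEAD"
    return cloudstate
```

The leader would run this inside its optimistic update of /cloudstate, so a lost watch event or a write conflict just means the next retry recomputes the diff.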

{code:xml}
<?xml version="1.0" encoding="UTF-8" ?>
<cloudstate>
	<collectionstate name="testcollection">
		<shard name="shard0">
			<replica name="solrtest.local:7574_solr_">
				<str name="roles">searcher</str>
				<str name="node_name">solrtest.local:7574_solr</str>
				<str name="url">http://solrtest.local:7574/solr/</str>
				<status>DISABLED</status>
			</replica>
			<replica name="solrtest.local:7575_solr_">
				<str name="roles">searcher</str>
				<str name="node_name">solrtest.local:7575_solr</str>
				<str name="url">http://solrtest.local:7575/solr/</str>
				<status>DEAD</status>
			</replica>
			<replica name="solrtest.local:8983_solr_">
				<str name="roles">indexer,searcher</str>
				<str name="node_name">solrtest.local:8983_solr</str>
				<str name="url">http://solrtest.local:8983/solr/</str>
				<status>ALIVE</status>
			</replica>
		</shard>
	</collectionstate>
</cloudstate>
{code}

The status is obviously notional since I'm not really populating it yet, but I'm thinking
ALIVE, DEAD, DISABLED. Any others?

I'll try to put together a patch tonight which captures this (I'll see if I can add the logic
for the leader as well to track replicas that die).
                
> Shard/Node states
> -----------------
>
>                 Key: SOLR-2765
>                 URL: https://issues.apache.org/jira/browse/SOLR-2765
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud, update
>            Reporter: Yonik Seeley
>             Fix For: 4.0
>
>         Attachments: combined.patch, incremental_update.patch, scheduled_executors.patch,
shard-roles.patch
>
>
> Need state for shards that indicate they are recovering, active/enabled, or disabled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

