lucene-dev mailing list archives

From "Ted Dunning (Commented) (JIRA)" <>
Subject [jira] [Commented] (SOLR-2765) Shard/Node states
Date Sun, 09 Oct 2011 07:44:29 GMT


Ted Dunning commented on SOLR-2765:

> So if I am understanding this, we still have the live_nodes, but we'd remove the information
> under collections in favor of a single node which stores the cloud state in some XML format.

I don't quite think so.  My concerns with this are three-fold.  

A) It is very useful to have each ZK structure contain information that flows generally in
one direction so that the updates to the information are relatively easy to understand.  

B) It is important to have at least a directory containing ephemeral files to describe which
nodes are currently live.  

C) It is nice to separate information bound for different destinations so that the number
of notifications is minimized.

With these considerations in mind, I think it is better to have three basic structures:

1) A directory of per node assignments to be written by the overseer but watched and read
by each node separately.  By (A), this should only be written by the overseer and only read
by the nodes.  By (C) the assignment for each node should be in a separate file in ZK.

2) A directory of per node status files to be created and updated by the nodes, but watched
and read by the overseer.  By (A), the nodes write and the overseer reads.  By (B) and (C),
there is an ephemeral file per node.  The files should be named with a key that can be used
to find the corresponding status clause in the cluster status file.  The contents of each file
should probably be the status clause itself if we expect the overseer to update the cluster
status file, or can be left empty if the nodes update it themselves.

3) A composite cluster status file created and maintained by the overseer, but possibly also
updated by nodes.  The overseer will have to update this when nodes disappear, but having the
nodes write their status directly to the cluster status file avoids data duplication, at the
cost of a slight violation of (A).  We can pretend that only the nodes update the file and
that the clients read and watch it, with the overseer stepping in only to perform the updates
that crashing nodes cannot do for themselves.  Neither (B) nor (C) applies here, since this
information is bound for all clients.
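Concretely, the three structures could lay out in ZooKeeper roughly as follows (the path and
file names here are only illustrative, not a proposal):

```
/node_assignments/<node_name>   (1) persistent; written only by the overseer,
                                    watched and read by that node alone
/node_states/<node_name>        (2) ephemeral; created and updated by the node,
                                    watched and read by the overseer
/cluster_state                  (3) persistent composite file; updated by the
                                    nodes (or the overseer), read and watched
                                    by all clients
```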

Note that no locks are necessary.  Updates to (1) are only by the overseer.  Updates to (2)
are only by the nodes and no two nodes will touch the same files.  Updates to (3) are either
by the nodes directly with a versioned update or by the overseer when it sees changes to (2).
The former strategy is probably better.
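The versioned-update strategy for (3) is just an optimistic read-modify-write retry loop.  A
minimal Java sketch, using an in-memory stand-in for a znode (the real ZooKeeper API would be
getData()/setData(path, data, version), where a stale version fails with BadVersionException);
the class and method names here are hypothetical:

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// In-memory stand-in for a versioned znode such as /cluster_state.
class VersionedNode {
    // A (version, data) pair, swapped atomically like a conditional setData.
    record Stamped(int version, String data) {}

    private final AtomicReference<Stamped> state =
            new AtomicReference<>(new Stamped(0, ""));

    // Read the current data together with its version.
    Stamped read() { return state.get(); }

    // Conditional write: succeeds only if expectedVersion is still current.
    boolean write(int expectedVersion, String newData) {
        Stamped cur = state.get();
        if (cur.version() != expectedVersion) return false;  // version bump: stale read
        return state.compareAndSet(cur, new Stamped(cur.version() + 1, newData));
    }

    // The lock-free update loop: read, modify, attempt a conditional write,
    // and start over from a fresh read if another writer got there first.
    void update(UnaryOperator<String> modify) {
        while (true) {
            Stamped cur = read();
            if (write(cur.version(), modify.apply(cur.data()))) return;
        }
    }
}
```

No writer ever blocks another; a losing writer simply re-reads and re-applies its change,
which is why no locks are needed anywhere in the scheme.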

> The process would then be that, in ZkController.register, the new Solr instance gets the
> current state, attempts to add its information to the file, and does an update against that
> particular version; if the update fails (presumably because another Solr instance has
> modified the node, resulting in a version bump) we'd simply repeat this.
>
> But the new nodes would also need to create their corresponding ephemeral file.
> Would something like this work for the XML format? We could then iterate through this and
> update the CloudState locally whenever a change to this has happened.

>   <collectionstate name="collection1">
>     <slice name="slice1">
>       <shard name="shard1">
>         <props>
>           <str name="url">http://.…</str>
>           <str name="node_name">node</str>
>           <str name="roles">indexer,searcher</str>
>         </props>
>       </shard>
>     </slice>
>   </collectionstate>

I think that this is good in principle, but there are a few nits that I would raise.

First, common usage outside Solr equates slice and shard.  As such, to avoid confusion by
outsiders, I would recommend that shard here be replaced by the term "replica" since that
seems to be understood inside the Solr community to have the same meaning as outside.  

Secondly, is the "props" element necessary?  It seems excessive to me.

Thirdly, shouldn't there be a list of IP addresses that might be used to reach the node? 
I think that it is good practice to have multiple addresses for the common case where a single
address might not work.  This happens, for instance, with EC2 where each node has internal
and external addresses that have different costs and accessibility.  In other cases, there
might be multiple paths to a node and it might be nice to use all of them for fault tolerance.
It is not acceptable to list the node more than once with different IP addresses, because
that makes it look like more than one node, which can cause problems with load balancing.

Fourthly, I think it should be clarified that the node_name should be used as the key in the
live_nodes directory for the ephemeral file corresponding to the node.
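To make the nits concrete, a revised sketch of the format might look like the following (the
replica element name, the addresses list, and all sample values are suggestions only, not a
final proposal):

```xml
<collectionstate name="collection1">
  <slice name="slice1">
    <!-- "replica" rather than "shard", to match usage outside Solr -->
    <replica name="replica1">
      <str name="url">http://...</str>
      <!-- node_name doubles as the key of the ephemeral file under live_nodes -->
      <str name="node_name">node1</str>
      <!-- several routes to the same node, e.g. EC2 internal and external -->
      <str name="addresses">10.0.0.5,203.0.113.7</str>
      <str name="roles">indexer,searcher</str>
    </replica>
  </slice>
</collectionstate>
```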

> If this seems reasonable I will attempt to make this change.

Pretty close to reasonable in my book.
> Shard/Node states
> -----------------
>                 Key: SOLR-2765
>                 URL:
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud, update
>            Reporter: Yonik Seeley
>             Fix For: 4.0
>         Attachments: combined.patch, incremental_update.patch, scheduled_executors.patch,
> Need state for shards that indicate they are recovering, active/enabled, or disabled.

This message is automatically generated by JIRA.