lucene-dev mailing list archives

From "Ted Dunning (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2358) Distributing Indexing
Date Fri, 07 Oct 2011 21:36:31 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123217#comment-13123217 ]

Ted Dunning commented on SOLR-2358:
-----------------------------------

I think locks should be completely out of bounds, if only because they are hell to deal with
in the presence of failures.  This is a major reason why ZK is not a lock manager but instead
supports atomic updates at a fundamental level.

State of a node doesn't need a lock.  The node should just update its own state, and that
state should be ephemeral so that if the node disappears, the state reflects that.  Anybody
who cares in a real-time way about the state of that node should put a watch on that node's
state.
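
To make that concrete, here is a minimal sketch against the plain ZooKeeper Java client,
using the /node-states layout suggested below; the NodeState class and method names are my
own illustration, not anything that exists in Solr:
{code}
import org.apache.zookeeper.*;

public class NodeState {
    private final ZooKeeper zk;
    private final String path;

    public NodeState(ZooKeeper zk, String nodeId) {
        this.zk = zk;
        this.path = "/node-states/" + nodeId;   // layout suggested below
    }

    // Publish this node's state as an ephemeral znode.  If the node's
    // session dies, ZK deletes the znode, so the state reflects the
    // disappearance automatically; no lock, no cleanup step.
    public void publish(byte[] state) throws KeeperException, InterruptedException {
        try {
            zk.create(path, state, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
        } catch (KeeperException.NodeExistsException e) {
            // we are the only writer of our own state, so an unconditional
            // overwrite (version -1) is safe
            zk.setData(path, state, -1);
        }
    }

    // Anybody who cares in a real-time way registers a watch; the watch
    // fires once on change or deletion and must then be re-registered.
    public byte[] readAndWatch(Watcher watcher) throws KeeperException, InterruptedException {
        return zk.getData(path, watcher, null);
    }
}
{code}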

Creating a new collection is also relatively simple without a lock.  One of the simplest
ways is to put a specification of the new collection into a collections directory in ZK.
The cluster overseer sees the addition and parcels out shard assignments to nodes.  The
nodes see their assignments change and take action to conform to the specification,
advertising their progress in their state files.  All that is needed here is atomic update,
which ZK does just fine.
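
Here is a sketch of that flow under the same assumptions (stock ZooKeeper Java client, the
directory layout suggested below; the class and method names are mine, purely illustrative):
{code}
import java.util.List;
import org.apache.zookeeper.*;

public class CollectionFlow {
    // Any client advertises a new collection by atomically creating a
    // specification znode.  A concurrent duplicate simply fails with
    // NodeExistsException, so no lock is needed to serialize creators.
    public static void advertise(ZooKeeper zk, String name, byte[] spec)
            throws KeeperException, InterruptedException {
        zk.create("/collections/" + name, spec,
                  ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    }

    // Overseer side: watch the children of /collections; when the list
    // changes, re-read the specs and write out shard assignments.  The
    // overseer is the sole writer of /node-assignments.
    public static List<String> watchCollections(ZooKeeper zk, Watcher overseer)
            throws KeeperException, InterruptedException {
        return zk.getChildren("/collections", overseer);
    }

    // Node side: watch this node's own assignment entry; on change,
    // conform to it and advertise progress in /node-states.
    public static byte[] watchAssignment(ZooKeeper zk, String nodeId, Watcher node)
            throws KeeperException, InterruptedException {
        return zk.getData("/node-assignments/" + nodeId, node, null);
    }
}
{code}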

If it helps, there is a simplified form of this in Chapter 16 of Mahout in Action.  The source
code for that example is available at https://github.com/tdunning/Chapter-16.  The example
only has nodes, but the basic idea of parcelling out assignments is the same.

A summary of what I would suggest is this:

- three directories:
{code}
    /collections
    /node-assignments
    /node-states
{code}
The /collections directory is updated by anybody wishing to advertise or delete a collection.
 The /node-assignments directory is updated only by the overseer.  The /node-states directory
is updated by each node, each one writing only its own entry.

- one leader election file
{code}
    /cluster-leader
{code}
All of the potential overseers try to create this file (ephemerally) and insert their IP and
port.  The one that succeeds is the overseer; the others watch for the file to disappear.
 On disconnect from ZK, the overseer stops acting as overseer but does not tear down local
state.  On reconnect within the same session, it resumes acting as overseer.  On session
expiration, the overseer tears down local state and attempts to regain the leadership position.
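
A sketch of that election, again with the stock ZooKeeper Java client (the class name and
the host:port payload are illustrative):
{code}
import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.*;

public class OverseerElection implements Watcher {
    private final ZooKeeper zk;
    private final byte[] hostPort;            // e.g. "10.0.0.5:8983"
    private volatile boolean acting = false;  // currently the overseer?

    public OverseerElection(ZooKeeper zk, String hostPort) {
        this.zk = zk;
        this.hostPort = hostPort.getBytes(StandardCharsets.UTF_8);
    }

    // Every candidate races to create the ephemeral file.  The winner is
    // the overseer; the losers set a watch and wait for it to vanish.
    public void tryLead() throws KeeperException, InterruptedException {
        try {
            zk.create("/cluster-leader", hostPort,
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
            acting = true;
        } catch (KeeperException.NodeExistsException e) {
            zk.exists("/cluster-leader", this);   // watch for deletion
        }
    }

    @Override
    public void process(WatchedEvent event) {
        try {
            if (event.getType() == Event.EventType.NodeDeleted) {
                tryLead();              // leader gone: race again
            } else if (event.getState() == Event.KeeperState.Disconnected) {
                acting = false;         // pause but keep local state; a
                                        // reconnect in the same session
                                        // lets the old overseer resume
            } else if (event.getState() == Event.KeeperState.Expired) {
                acting = false;         // tear down local state, then
                                        // rejoin with a fresh session
            }
        } catch (KeeperException | InterruptedException e) {
            // a real implementation would log and retry
        }
    }
}
{code}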

The cluster overseer never needs to grab locks, since an atomic read-modify-write on node
state is all that is required.
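
That read-modify-write is just ZK's versioned setData in a retry loop.  A sketch, where the
helper class is my own illustration rather than a Solr or ZK API:
{code}
import java.util.function.UnaryOperator;
import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;

public class AtomicUpdate {
    // Read the znode and its version, write back conditioned on that
    // version, and retry if another writer slipped in between.  This is
    // optimistic concurrency: no lock is ever held.
    public static void update(ZooKeeper zk, String path, UnaryOperator<byte[]> change)
            throws KeeperException, InterruptedException {
        while (true) {
            Stat stat = new Stat();
            byte[] current = zk.getData(path, false, stat);
            try {
                zk.setData(path, change.apply(current), stat.getVersion());
                return;
            } catch (KeeperException.BadVersionException lostRace) {
                // someone else updated first; re-read and try again
            }
        }
    }
}
{code}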

Again for emphasis,

1) cluster-wide locks are a bug in a scalable clustered system.  Leader election is an allowable
special case.

2) locks are not required for clustered SOLR.

3) a lock-free design is incredibly simple to implement.


                
> Distributing Indexing
> ---------------------
>
>                 Key: SOLR-2358
>                 URL: https://issues.apache.org/jira/browse/SOLR-2358
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud, update
>            Reporter: William Mayor
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2358.patch
>
>
> The first steps towards creating distributed indexing functionality in Solr
