From: "Mahadev konar (JIRA)"
To: solr-dev@lucene.apache.org
Date: Wed, 16 Dec 2009 22:08:18 +0000 (UTC)
Subject: [jira] Commented: (SOLR-1277) Implement a Solr specific naming service (using Zookeeper)

    [ https://issues.apache.org/jira/browse/SOLR-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791633#action_12791633 ]

Mahadev konar commented on SOLR-1277:
-------------------------------------

Hi all, this is Mahadev from the ZooKeeper team. One of our users does something similar to what you have been discussing in the comments above. I am not sure how close it is to your scenario, but I'll give it a shot; feel free to ignore my comments if they don't apply.

Say you have a machine A that runs a process P and is part of your cluster. The way they track the status of this machine is by keeping two znodes (ZNODE1, ZNODE2) in ZooKeeper. ZNODE1 is an ephemeral node created by P; ZNODE2 is a normal (persistent) node that holds P-specific data which P updates from time to time (e.g. time of last update, status of P: good/bad/ok). When an application or user wants to use P on machine A, it checks the ephemeral node for liveness and reads the data in ZNODE2 to see whether P has any problems that are not visible to ZooKeeper itself, and then decides whether P actually needs to be marked dead. For example, if the ephemeral node ZNODE1 is alive but ZNODE2 shows that P is in a really bad state, the application will go ahead and mark P as dead.

Hope this information is of some help!
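For concreteness, here is a minimal sketch of that two-znode pattern using the standard ZooKeeper Java client (the API shipped in the attached zookeeper-3.2.1.jar). The znode paths, the "status=ok" payload, and the health rule are illustrative assumptions of mine, not anything ZooKeeper or this patch prescribes:

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class TwoZnodeHealthCheck {

    // Hypothetical paths for process P on machine A.
    static final String LIVE_NODE   = "/cluster/live/machineA";   // ZNODE1, ephemeral
    static final String STATUS_NODE = "/cluster/status/machineA"; // ZNODE2, persistent

    // Called by process P at startup: register liveness and initial status.
    static void register(ZooKeeper zk) throws KeeperException, InterruptedException {
        zk.create(LIVE_NODE, new byte[0], Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
        zk.create(STATUS_NODE, "status=ok".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    }

    // Called by process P periodically: refresh its self-reported status in ZNODE2.
    static void heartbeat(ZooKeeper zk, String status) throws KeeperException, InterruptedException {
        byte[] data = ("status=" + status + ";ts=" + System.currentTimeMillis()).getBytes();
        zk.setData(STATUS_NODE, data, -1); // -1 = ignore version check
    }

    // Called by a client: consult both znodes before deciding P is dead.
    static boolean isHealthy(ZooKeeper zk) throws KeeperException, InterruptedException {
        Stat alive = zk.exists(LIVE_NODE, false);
        if (alive == null) {
            return false; // ephemeral node gone: P's session expired, P is dead
        }
        // Even with ZNODE1 alive, application-level status can mark P as bad.
        byte[] data = zk.getData(STATUS_NODE, false, null);
        return new String(data).contains("status=ok");
    }
}

The key point is the division of labor: the ephemeral ZNODE1 answers "is P's session alive?" automatically (it vanishes when P's session expires), while the persistent ZNODE2 carries application-level health that ZooKeeper cannot infer on its own.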
> Implement a Solr specific naming service (using Zookeeper)
> ----------------------------------------------------------
>
>                 Key: SOLR-1277
>                 URL: https://issues.apache.org/jira/browse/SOLR-1277
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 1.5
>
>         Attachments: log4j-1.2.15.jar, SOLR-1277.patch, SOLR-1277.patch, SOLR-1277.patch, SOLR-1277.patch, zookeeper-3.2.1.jar
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> The goal is to give Solr server clusters self-healing attributes: if a server fails, indexing and searching don't stop, and all of the partitions remain searchable. For configuration, the ability to centrally deploy a new configuration without servers going offline.
> We can start with basic failover and build up from there?
> Features:
> * Automatic failover (i.e. when a server fails, clients stop trying to index to or search it)
> * Centralized configuration management (i.e. a new solrconfig.xml or schema.xml propagates to a live Solr cluster; see the sketch after this issue block)
> * Optionally allow shards of a partition to be moved to another server (i.e. if a server gets hot, move the hot segments out to cooler servers). Ideally we'd have a way to detect hot segments and move them seamlessly. With NRT this becomes somewhat more difficult, but not impossible?
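As a companion to the centralized-configuration feature listed above, here is a minimal sketch of how a live Solr node could watch a config znode and react to pushes, again using the standard ZooKeeper Java client. The /solr/conf/schema.xml path and the reloadCore hook are hypothetical placeholders; SOLR-1277's actual propagation design is still being worked out in this thread:

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.EventType;
import org.apache.zookeeper.ZooKeeper;

public class ConfigWatcher implements Watcher {

    // Hypothetical znode holding the cluster-wide schema.xml bytes.
    static final String CONF_NODE = "/solr/conf/schema.xml";

    private final ZooKeeper zk;

    ConfigWatcher(ZooKeeper zk) { this.zk = zk; }

    // Fetch the current config and re-arm the watch in one call.
    byte[] fetch() throws KeeperException, InterruptedException {
        return zk.getData(CONF_NODE, this, null);
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getType() == EventType.NodeDataChanged) {
            try {
                byte[] newConf = fetch(); // re-read and re-register the watch
                reloadCore(newConf);      // hypothetical hook into Solr's core reload
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    private void reloadCore(byte[] schemaXml) {
        // In a real node this would rewrite schema.xml on disk and trigger
        // a core reload; elided here.
    }
}

An administrator would then push a new config with a single zk.setData(CONF_NODE, newBytes, -1), and every watching node would pick it up without going offline, which is exactly the "propagates to a live Solr cluster" behavior the issue asks for.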