lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "DeploymentofSolrCoreswithZookeeper" by JasonRutherglen
Date Tue, 16 Feb 2010 23:02:54 GMT

The "DeploymentofSolrCoreswithZookeeper" page has been changed by JasonRutherglen.
http://wiki.apache.org/solr/DeploymentofSolrCoreswithZookeeper

--------------------------------------------------

New page:
= Deployment of Solr Cores with Zookeeper =

https://issues.apache.org/jira/browse/SOLR-1724
== Architecture ==

Zookeeper is used as a distributed filesystem that records which Solr servers should be running
which cores.  GSON is the JSON library used to serialize and deserialize objects to and from
the JSON format.  Ephemeral nodes are intentionally not used: Zookeeper serves here as a transactional,
redundant filesystem, not as a system for maintaining connections to various servers.  Connection
monitoring is best left to dedicated monitoring services.

== Supported File Types ==

Zipped cores are the standard because they are easier to manage, download, and transfer across
the network.

 * Zipped core accessible via HDFS
 * Zipped core accessible via HTTP

== Zookeeper Filesystem ==

=== Cores ===

Each "cores" file is written to Zookeeper and is of the form cores_N.  This is purposefully
similar to the segment infos files (segments_N) written by Lucene.  The cores file is stored in
JSON format.

Contents of the cores file:

||Name||Type||
||name||string||
||version||long||
||array||coresinfo||

Each cores info contains:
||Name||Type||
||host||string||
||name||string||
||instanceDir||string||
||configFile||string||
||schemaFile||string||
||dataDir||string||
||url||string||
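
As an illustration of the two tables above, the following is a minimal sketch of what a cores file might look like once serialized to JSON.  It uses Python's built-in json module for brevity rather than GSON.  The field names come from the tables; the key "coresinfo" for the array and every value shown (host name, core name, paths, URL) are assumptions made up for this example.

```python
import json

# Hypothetical cores file content. Field names follow the tables above;
# all values are invented for illustration only.
cores_file = {
    "name": "cores_2",
    "version": 2,
    # Assumed key for the array of cores info entries.
    "coresinfo": [
        {
            "host": "servera",
            "name": "core0",
            "instanceDir": "core0",
            "configFile": "solrconfig.xml",
            "schemaFile": "schema.xml",
            "dataDir": "data",
            "url": "http://example.com/cores/core0.zip",
        }
    ],
}

# Serialize to the JSON text that would be stored in Zookeeper,
# then parse it back to confirm the round trip is lossless.
serialized = json.dumps(cores_file, indent=2)
restored = json.loads(serialized)

if __name__ == "__main__":
    print(serialized)
```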

=== Host ===

Each Solr server (aka host or CoreContainer) must report to Zookeeper which cores it has installed.
Each host file is of the form host_version (for example, servera_1).  It is the responsibility
of each Solr host/server to match the state of the cores_N file.

Contents of a host file:

||Name||Type||
||name||string||
||version||long||
||array||hostinfo||

Each host info contains:
||Name||Type||
||name||string||
||instanceDir||string||
||configFile||string||
||schemaFile||string||
||dataDir||string||
||size||long||
||lastModified||long||
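
One way a host could check whether it matches the state of a cores file is to compare the cores assigned to it against the cores it has reported.  The sketch below is illustrative only: the function name missing_cores, the array keys "coresinfo" and "hostinfo", and the sample data are all assumptions, not part of the actual implementation.

```python
def missing_cores(cores_file, host_file, hostname):
    """Return the sorted names of cores assigned to `hostname` in the
    cores file that do not yet appear in the host file."""
    wanted = {c["name"] for c in cores_file["coresinfo"] if c["host"] == hostname}
    installed = {h["name"] for h in host_file["hostinfo"]}
    return sorted(wanted - installed)


# Hypothetical data: servera should run core0 and core1 but has only
# reported core0 so far.
sample_cores_file = {
    "name": "cores_2",
    "version": 2,
    "coresinfo": [
        {"host": "servera", "name": "core0"},
        {"host": "servera", "name": "core1"},
        {"host": "serverb", "name": "core2"},
    ],
}
sample_host_file = {
    "name": "servera",
    "version": 2,
    "hostinfo": [{"name": "core0", "size": 1024, "lastModified": 0}],
}

if __name__ == "__main__":
    print(missing_cores(sample_cores_file, sample_host_file, "servera"))
```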

=== Sample Directory Layout ===

There are two cores files in this sample directory layout.  Under /production/hosts, every
host has written a host file for each version, indicating that the operational definitions in
both cores_1 and cores_2 have been completed.

/production/cores_1<<BR>>
/production/cores_2<<BR>>
/production/hosts/servera_1<<BR>>
/production/hosts/serverb_1<<BR>>
/production/hosts/serverc_1<<BR>>
/production/hosts/serverd_1<<BR>>
/production/hosts/servera_2<<BR>>
/production/hosts/serverb_2<<BR>>
/production/hosts/serverc_2<<BR>>
/production/hosts/serverd_2<<BR>>
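
The naming scheme in this layout can be used to find the newest cores file and to check whether every host has reported for a given version.  The following is a rough sketch operating on plain lists of node names; the function names and the idea of passing in an expected host list are assumptions for illustration, and no Zookeeper client calls are shown.

```python
import re


def latest_cores_version(children):
    """Return the highest N among names of the form cores_N, or None."""
    versions = []
    for name in children:
        match = re.fullmatch(r"cores_(\d+)", name)
        if match:
            versions.append(int(match.group(1)))
    return max(versions, default=None)


def hosts_reported(host_children, version):
    """Return the set of host names that have a host_version file
    for the given version."""
    reported = set()
    for name in host_children:
        host, sep, ver = name.rpartition("_")
        if sep and ver == str(version):
            reported.add(host)
    return reported


def version_complete(host_children, expected_hosts, version):
    """True when every expected host has written its file for `version`."""
    return set(expected_hosts) <= hosts_reported(host_children, version)


# Node names taken from the sample layout above.
production_children = ["cores_1", "cores_2", "hosts"]
host_children = [
    "servera_1", "serverb_1", "serverc_1", "serverd_1",
    "servera_2", "serverb_2", "serverc_2", "serverd_2",
]
expected = ["servera", "serverb", "serverc", "serverd"]

if __name__ == "__main__":
    newest = latest_cores_version(production_children)
    print(newest, version_complete(host_children, expected, newest))
```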

== CoreController ==

The CoreController is the core deploy client that lives inside a CoreContainer.  It listens for
events on a given path and finds its hostname in the latest cores file by version.  Each cores
file is like Lucene's segment infos file, which describes the set of segments that make up the
current index: the cores file defines the set of cores that should be installed on a given Solr host.
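
The lookup step described here (take the latest cores file by version, then find the entries for this host) might be sketched as follows.  This is not the CoreController code itself: the function name cores_for_host, the "coresinfo" key, and the sample data are hypothetical, and event listening is left out entirely.

```python
def cores_for_host(cores_files, hostname):
    """Pick the cores file with the highest version and return the
    cores info entries whose host matches `hostname`."""
    if not cores_files:
        return []
    latest = max(cores_files, key=lambda f: f["version"])
    return [c for c in latest["coresinfo"] if c["host"] == hostname]


# Hypothetical parsed cores files, as they might look after JSON decoding.
parsed_cores_files = [
    {"name": "cores_1", "version": 1,
     "coresinfo": [{"host": "servera", "name": "core0"}]},
    {"name": "cores_2", "version": 2,
     "coresinfo": [{"host": "servera", "name": "core1"},
                   {"host": "serverb", "name": "core2"}]},
]

if __name__ == "__main__":
    for core in cores_for_host(parsed_cores_files, "servera"):
        print(core["name"])
```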

A default root path must be defined; the unit tests use /production.
