zookeeper-commits mailing list archives

From f..@apache.org
Subject svn commit: r1343981 - in /zookeeper/bookkeeper/trunk/doc: bookieRecovery.textile bookkeeperConfig.textile
Date Tue, 29 May 2012 21:00:27 GMT
Author: fpj
Date: Tue May 29 21:00:26 2012
New Revision: 1343981

URL: http://svn.apache.org/viewvc?rev=1343981&view=rev
Log:
BOOKKEEPER-270: Review documentation on bookie cookie (ivank via fpj)


Modified:
    zookeeper/bookkeeper/trunk/doc/bookieRecovery.textile
    zookeeper/bookkeeper/trunk/doc/bookkeeperConfig.textile

Modified: zookeeper/bookkeeper/trunk/doc/bookieRecovery.textile
URL: http://svn.apache.org/viewvc/zookeeper/bookkeeper/trunk/doc/bookieRecovery.textile?rev=1343981&r1=1343980&r2=1343981&view=diff
==============================================================================
--- zookeeper/bookkeeper/trunk/doc/bookieRecovery.textile (original)
+++ zookeeper/bookkeeper/trunk/doc/bookieRecovery.textile Tue May 29 21:00:26 2012
@@ -18,13 +18,13 @@ Notice:    Licensed to the Apache Softwa
 
 h1. Bookie Recovery
 
-p. When a bookie crashes, any ledgers with entries on the bookie potentially become underreplicated.
For this reason, we provide a recovery tool which will ensure that all ledgers which had entries
on the bookie are fully replicated. At the moment, this is not an automatic process. The administrator
must run this tool manually when he sees that the bookie has died. 
+p. When a bookie crashes, any ledgers with entries on the bookie potentially become under-replicated.
For this reason, we provide a recovery tool which will ensure that all ledgers which had entries
on the bookie are fully replicated. At the moment, this is not an automatic process. The administrator
must run this tool manually when he sees that the bookie has died. 
 
-To run recovery, with zk1.example.com as the zookeeper ensemble, and bk3.example.com as the
failed bookie, do the following:
+To run recovery, with zk1.example.com as the zookeeper ensemble, and 192.168.1.10 as the
failed bookie, do the following:
 
-@bookkeeper-server/bin/bookkeeper org.apache.bookkeeper.tools.BookKeeperTools zk1.example.com:2181
bk3.example.com:3181@
+@bookkeeper-server/bin/bookkeeper org.apache.bookkeeper.tools.BookKeeperTools zk1.example.com:2181
192.168.1.10:3181@
 
-It is necessary to specify the host and port portion of failed bookie, as this is how it
identifies itself to zookeeper. It is possible to specify a third argument, which is the bookie
to replicate to. If this is omitted, as in our example, a random bookie is chosen for each
ledger fragment. A ledger fragment is a continous sequence of entries in a bookie, which share
the same ensemble. 
+It is necessary to specify the host and port portion of failed bookie, as this is how it
identifies itself to zookeeper. It is possible to specify a third argument, which is the bookie
to replicate to. If this is omitted, as in our example, a random bookie is chosen for each
ledger fragment. A ledger fragment is a continuous sequence of entries in a bookie, which
share the same ensemble. 
 
 The recovery process is as follows.
 
@@ -37,5 +37,5 @@ The recovery process is as follows.
 ### the client reads entries that belong to the ledger fragment from other bookies in the
ensemble and writes them to the selected bookie;
 ### Once all entries have been replicated, the zookeeper metadata for the fragment is updated
to reflect the new ensemble;
 ### The fragment is marked as fully replicated in the recovery tool;
-## Once all ledger fragements are marked as fully replicated, the ledger is marked as fully
replicated;
+## Once all ledger fragments are marked as fully replicated, the ledger is marked as fully
replicated;
 # Once all ledgers are marked as fully replicated, bookie recovery is finished.
\ No newline at end of file
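
The recovery steps listed in bookieRecovery.textile can be illustrated with a minimal in-memory sketch. This is not the BookKeeperTools implementation: the Fragment class, the recover_bookie function and the dictionary-based model of ledgers are hypothetical names introduced here, and real recovery reads entries from the surviving bookies and updates ledger metadata in ZooKeeper.

```python
import random
from dataclasses import dataclass


@dataclass
class Fragment:
    """A continuous run of entries sharing one ensemble (hypothetical model)."""
    ensemble: list           # bookie addresses such as "192.168.1.10:3181"
    entries: dict            # entry id -> payload
    replicated: bool = False


def recover_bookie(ledgers, failed, available, target=None, rng=random):
    """Re-replicate every fragment whose ensemble contains `failed`.

    ledgers   -- dict of ledger id -> list of Fragment
    available -- live bookie addresses that may receive copies
    target    -- optional fixed destination; if None, one is picked per fragment
    Returns a dict of bookie address -> {(ledger id, entry id): payload}.
    """
    written = {}
    for ledger_id, fragments in ledgers.items():
        for frag in fragments:
            if failed not in frag.ensemble:
                # Nothing was stored on the failed bookie for this fragment.
                frag.replicated = True
                continue
            if target is not None:
                dest = target
            else:
                # Pick a random live bookie that is not already in the ensemble.
                candidates = [b for b in available if b not in frag.ensemble]
                if not candidates:
                    raise RuntimeError("no bookie available to replicate to")
                dest = rng.choice(candidates)
            # Copy the fragment's entries to the selected bookie.
            store = written.setdefault(dest, {})
            for entry_id, payload in frag.entries.items():
                store[(ledger_id, entry_id)] = payload
            # Update the fragment metadata to reflect the new ensemble.
            frag.ensemble = [dest if b == failed else b for b in frag.ensemble]
            frag.replicated = True
    return written
```

A ledger counts as fully replicated once every one of its fragments has `replicated` set, which mirrors the last two steps of the list.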

Modified: zookeeper/bookkeeper/trunk/doc/bookkeeperConfig.textile
URL: http://svn.apache.org/viewvc/zookeeper/bookkeeper/trunk/doc/bookkeeperConfig.textile?rev=1343981&r1=1343980&r2=1343981&view=diff
==============================================================================
--- zookeeper/bookkeeper/trunk/doc/bookkeeperConfig.textile (original)
+++ zookeeper/bookkeeper/trunk/doc/bookkeeperConfig.textile Tue May 29 21:00:26 2012
@@ -20,19 +20,19 @@ h1. Running a BookKeeper instance
 
 h2. System requirements
 
-p. A typical BookKeeper installation comprises a set of bookies and a set of ZooKeeper replicas.
The exact number of bookies depends on the quorum mode, desired throughput, and number of
clients using this installation simultaneously. The minimum number of bookies is three for
self-verifying (stores a message authentication code along with each entry) and four for generic
(does not store a message authentication code with each entry), and there is no upper limit
on the number of bookies. Increasing the number of bookies will, in fact, enable higher throughput.

+A typical BookKeeper installation comprises a set of bookies and a set of ZooKeeper replicas.
The exact number of bookies depends on the quorum mode, desired throughput, and number of
clients using this installation simultaneously. The minimum number of bookies is three for
self-verifying (stores a message authentication code along with each entry) and four for generic
(does not store a message authentication code with each entry), and there is no upper limit
on the number of bookies. Increasing the number of bookies will, in fact, enable higher throughput.
 
-p. For performance, we require each server to have at least two disks. It is possible to
run a bookie with a single disk, but performance will be significantly lower in this case.
+For performance, we require each server to have at least two disks. It is possible to run
a bookie with a single disk, but performance will be significantly lower in this case.
 
-p. For ZooKeeper, there is no constraint with respect to the number of replicas. Having a
single machine running ZooKeeper in standalone mode is sufficient for BookKeeper. For resilience
purposes, it might be a good idea to run ZooKeeper in quorum mode with multiple servers. Please
refer to the ZooKeeper documentation for detail on how to configure ZooKeeper with multiple
replicas. 
+For ZooKeeper, there is no constraint with respect to the number of replicas. Having a single
machine running ZooKeeper in standalone mode is sufficient for BookKeeper. For resilience
purposes, it might be a good idea to run ZooKeeper in quorum mode with multiple servers. Please
refer to the ZooKeeper documentation for detail on how to configure ZooKeeper with multiple
replicas.
 
 h2. Running bookies
 
-p. To run a bookie, we execute the following command: 
+To run a bookie, we execute the following command:
 
 @bookkeeper-server/bin/bookkeeper bookie@
 
-p. The configuration parameters can be set in bookkeeper-server/conf/bk_server.conf. 
+The configuration parameters can be set in bookkeeper-server/conf/bk_server.conf.
 
 The important parameters are:
 
@@ -41,11 +41,11 @@ The important parameters are:
 * @journalDir@, Path for Log Device (stores bookie write-ahead log); 
 * @ledgerDir@, Path for Ledger Device (stores ledger entries); 
 
-p. Ideally, @journalDir@ and @ledgerDir@ are each in a different device. See "Bookie Configuration
Parameters":./bookieConfigParams.html for a full list of configuration parameters.
+Ideally, @journalDir@ and @ledgerDir@ are each in a different device. See "Bookie Configuration
Parameters":./bookieConfigParams.html for a full list of configuration parameters.
 
 h3. Upgrading
 
-From time to time, we may make changes to the filesystem layout of the bookie, which are
incompatible with previous versions of bookkeeper and require that directories used with previous
versions are upgraded. If you upgrade your bookkeeper software, and an upgrade is required,
then the bookie will fail to start and print an error such as;
+From time to time, we may make changes to the filesystem layout of the bookie, which are
incompatible with previous versions of bookkeeper and require that directories used with previous
versions are upgraded. If you upgrade your bookkeeper software, and an upgrade is required,
then the bookie will fail to start and print an error such as:
 
 @2012-05-25 10:41:50,494 - ERROR - [main:Bookie@246] - Directory layout version is less than
3, upgrade needed@
 
@@ -68,8 +68,31 @@ BookKeeper uses "slf4j":http://www.slf4j
 @export BOOKIE_LOG_CONF=/tmp/log4j.properties@
 @bookkeeper-server/bin/bookkeeper bookie@
 
+h3. Missing disks or directories
+
+Replacing disks or removing directories accidentally can cause a bookie to fail while trying
to read a ledger fragment which the ledger metadata has claimed exists on the bookie. For
this reason, when a bookie is started for the first time, its disk configuration is fixed
for the lifetime of that bookie. Any change to the disk configuration of the bookie, such
as a crashed disk or an accidental configuration change, will result in the bookie being unable
to start with the following error:
+
+@2012-05-29 18:19:13,790 - ERROR - [main:BookieServer@314] - Exception running bookie server
: @
+@org.apache.bookkeeper.bookie.BookieException$InvalidCookieException@
+@.......at org.apache.bookkeeper.bookie.Cookie.verify(Cookie.java:82)@
+@.......at org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:275)@
+@.......at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:351)@
+
+If the change was the result of an accidental configuration change, the change can be reverted
and the bookie can be restarted. However, if the change cannot be reverted, such as is the
case when you want to add a new disk or replace a disk, the bookie must be wiped and then
all its data re-replicated onto it. To do this, do the following:
+
+# Increment the _bookiePort_ in _bk_server.conf_.
+# Ensure that all directories specified by _journalDirectory_ and _ledgerDirectories_ are
empty.
+# Start the bookie.
+# Run @bin/bookkeeper org.apache.bookkeeper.tools.BookKeeperTools <zkserver> <oldbookie>
<newbookie>@ to re-replicate data. <oldbookie> and <newbookie> are identified
by their external IP and bookiePort. For example if this process is being run on a bookie
with an external IP of 192.168.1.10, with an old _bookiePort_ of 3181 and a new _bookiePort_
of 3182, and with zookeeper running on _zk1.example.com_, the command to run would be <br/>@bin/bookkeeper
org.apache.bookkeeper.tools.BookKeeperTools zk1.example.com 192.168.1.10:3181 192.168.1.10:3182@.
See "Bookie Recovery":./bookieRecovery.html for more details on the re-replication process.
+
+The mechanism to prevent the bookie from starting up in the case of configuration changes
exists to prevent the following silent failures:
+
+# A strict subset of the ledger devices (among multiple ledger devices) has been replaced,
consequently making the content of the replaced devices unavailable;
+# A strict subset of the ledger directories has been accidentally deleted.
+
 h2. Setting up a test ensemble
 
-Sometimes it is useful to run a ensemble of bookies on your local machine for testing. We
provide a utility for doing this. It will set up N bookies, and a zookeeper instance locally.
The data on these bookies and of the zookeeper instance are not persisted over restarts, so
obviously this should never be used in a production environment. To run a test ensemble of
10 bookies, do the following.
+Sometimes it is useful to run an ensemble of bookies on your local machine for testing. We
provide a utility for doing this. It will set up N bookies, and a zookeeper instance locally.
The data on these bookies and of the zookeeper instance are not persisted over restarts, so
obviously this should never be used in a production environment. To run a test ensemble of
10 bookies, do the following:
 
 @bookkeeper-server/bin/bookkeeper localbookie 10@
+
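
The startup check described under "Missing disks or directories" can be sketched as follows. This is only an illustration of the idea, not the bookie's actual Cookie code: the marker file name, the check_environment function and the InvalidLayoutError exception are hypothetical, and the sketch keeps its record only on local disk.

```python
import json
from pathlib import Path

MARKER = "layout-marker.json"  # hypothetical file name, not used by BookKeeper


class InvalidLayoutError(Exception):
    """Raised when the directories no longer match the recorded layout."""


def check_environment(journal_dir, ledger_dirs):
    """Record the disk layout on first start and verify it on later starts."""
    dirs = [Path(journal_dir)] + [Path(d) for d in ledger_dirs]
    expected = {
        "journalDir": str(journal_dir),
        "ledgerDirs": sorted(str(d) for d in ledger_dirs),
    }
    markers = [d / MARKER for d in dirs]
    present = [m for m in markers if m.exists()]

    if not present:
        # First start: fix the layout by writing a marker into every directory.
        for d in dirs:
            d.mkdir(parents=True, exist_ok=True)
        for m in markers:
            m.write_text(json.dumps(expected))
        return

    if len(present) != len(markers):
        # A strict subset of the directories was replaced, deleted or added.
        raise InvalidLayoutError("marker missing from some directories")

    for m in present:
        if json.loads(m.read_text()) != expected:
            raise InvalidLayoutError(f"layout recorded in {m} does not match")
```

Refusing to start in the mismatch case turns what would otherwise be a silent loss of ledger fragments into an explicit error that the administrator has to resolve.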


