Mailing-List: contact commits-help@bookkeeper.apache.org; run by ezmlm
Reply-To: bookkeeper-dev@bookkeeper.apache.org
Date: Sat, 16 Sep 2017 03:55:08 +0000
To: "commits@bookkeeper.apache.org"
Subject: [bookkeeper] branch asf-site updated: Issue 505: Remove old generated content from http://bookkeeper.apache.org/docs/ (#512)
Message-ID: <150553410895.28033.640325074413352291@gitbox.apache.org>
From: sijie@apache.org
Reply-To: "commits@bookkeeper.apache.org"
X-Git-Host: gitbox.apache.org
X-Git-Repo: bookkeeper
X-Git-Refname: refs/heads/asf-site
X-Git-Reftype: branch
X-Git-Oldrev: 31c373941cd77a37dd93f1fdb032521e81ba4459
X-Git-Newrev: 81fe18212654a5cfccbe8c3d2cc4796e47b89d04
X-Git-Rev: 81fe18212654a5cfccbe8c3d2cc4796e47b89d04
X-Git-NotificationType: ref_changed_plus_diff
X-Git-Multimail-Version: 1.5.dev
Auto-Submitted: auto-generated
archived-at: Sat, 16 Sep 2017 03:55:19 -0000

This is an automated email from the ASF dual-hosted git repository.

sijie pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/bookkeeper.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 81fe182  Issue 505: Remove old generated content from http://bookkeeper.apache.org/docs/ (#512)
81fe182 is described below

commit 81fe18212654a5cfccbe8c3d2cc4796e47b89d04
Author: Sijie Guo
AuthorDate: Fri Sep 15 20:55:07 2017 -0700

    Issue 505: Remove old generated content from http://bookkeeper.apache.org/docs/ (#512)

    This closes #505
---
 content/docs/admin/autorecovery/index.html         |  663 ---------
 content/docs/admin/bookies/index.html              |  751 ----------
 content/docs/admin/geo-replication/index.html      |  533 -------
 content/docs/admin/metrics/index.html              |  594 --------
 content/docs/admin/perf/index.html                 |  511 -------
 content/docs/admin/placement/index.html            |  511 -------
 content/docs/api/distributedlog-api/index.html     |  921 ------------
 content/docs/api/ledger-api/index.html             | 1023 -------------
 content/docs/api/overview/index.html               |  523 -------
 content/docs/deployment/dcos/index.html            |  718 ---------
 content/docs/deployment/kubernetes/index.html      |  517 -------
 content/docs/deployment/manual/index.html          |  595 --------
 content/docs/development/codebase/index.html       |  511 -------
 content/docs/development/protocol/index.html       |  751 ----------
 content/docs/example/index.html                    |  511 -------
 content/docs/getting-started/concepts/index.html   |  810 -----------
 .../docs/getting-started/installation/index.html   |  650 ---------
 .../docs/getting-started/run-locally/index.html    |  522 -------
 content/docs/reference/cli/index.html              | 1516 --------------------
 content/docs/reference/config/index.html           | 1482 -------------------
 content/docs/reference/metrics/index.html          |  511 -------
 content/docs/security/index.html                   |  588 --------
 content/docs/security/sasl/index.html              |  799 -----------
 content/docs/security/tls/index.html               |  790 ----------
 content/docs/security/zookeeper/index.html         |  614 --------
 25 files changed, 17915 deletions(-)

diff --git a/content/docs/admin/autorecovery/index.html b/content/docs/admin/autorecovery/index.html
deleted file mode 100644
index ef2a735..0000000
--- a/content/docs/admin/autorecovery/index.html
+++ /dev/null
@@ -1,663 +0,0 @@
- Apache BookKeeper - Using AutoRecovery

When a bookie crashes, all ledgers on that bookie become under-replicated. In order to bring all ledgers in your BookKeeper cluster back to full replication, you’ll need to recover the data from any offline bookies. There are two ways to recover bookies’ data:

  1. Using manual recovery
  2. Automatically, using AutoRecovery

Manual recovery

- -

You can manually recover failed bookies using the bookkeeper command-line tool. You need to specify:

  • that the org.apache.bookkeeper.tools.BookKeeperTools class needs to be run
  • an IP and port for your BookKeeper cluster’s ZooKeeper ensemble
  • the IP and port for the failed bookie

Here’s an example:

# Arguments, in order:
#   zk1.example.com:2181   host and port for ZooKeeper
#   192.168.1.10:3181      IP and port for the failed bookie
$ bookkeeper-server/bin/bookkeeper org.apache.bookkeeper.tools.BookKeeperTools \
  zk1.example.com:2181 \
  192.168.1.10:3181

If you wish, you can also specify which bookie you’d like to rereplicate to. Here’s an example:

# Arguments, in order:
#   zk1.example.com:2181   host and port for ZooKeeper
#   192.168.1.10:3181      IP and port for the failed bookie
#   192.168.1.11:3181      IP and port for the bookie to rereplicate to
$ bookkeeper-server/bin/bookkeeper org.apache.bookkeeper.tools.BookKeeperTools \
  zk1.example.com:2181 \
  192.168.1.10:3181 \
  192.168.1.11:3181

The manual recovery process

- -

When you initiate a manual recovery process, the following happens:

  1. The client (the process running ) reads the metadata of active ledgers from ZooKeeper.
  2. The ledgers that contain segments from the failed bookie in their ensemble are selected.
  3. A recovery process is initiated for each ledger in this list and the rereplication process is run for each ledger.
  4. Once all the ledgers are marked as fully replicated, bookie recovery is finished.
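
The selection in step 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary below is a hypothetical in-memory stand-in for the ledger metadata that the real tool reads from ZooKeeper, and the function name is invented for this example.

```python
from typing import Dict, List

# Hypothetical stand-in for ledger metadata read from ZooKeeper:
# ledger id -> list of segments, each segment given as its ensemble of bookies.
LedgerMetadata = Dict[int, List[List[str]]]


def ledgers_needing_recovery(metadata: LedgerMetadata, failed_bookie: str) -> List[int]:
    """Return ids of ledgers with at least one segment whose ensemble lists the failed bookie."""
    return sorted(
        ledger_id
        for ledger_id, segments in metadata.items()
        if any(failed_bookie in ensemble for ensemble in segments)
    )


metadata = {
    1: [["192.168.1.10:3181", "192.168.1.11:3181"]],
    2: [["192.168.1.11:3181", "192.168.1.12:3181"]],
    3: [
        ["192.168.1.11:3181", "192.168.1.12:3181"],
        ["192.168.1.10:3181", "192.168.1.12:3181"],
    ],
}
selected = ledgers_needing_recovery(metadata, "192.168.1.10:3181")
```

Ledger 2 is skipped because none of its segments ever included the failed bookie, so it has nothing to rereplicate.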

AutoRecovery

- -

AutoRecovery is a process that:

  • automatically detects when a bookie in your BookKeeper cluster has become unavailable and then
  • rereplicates all the ledgers that were stored on that bookie.

AutoRecovery can be run in two ways:

  1. On dedicated nodes in your BookKeeper cluster
  2. On the same machines on which your bookies are running

Running AutoRecovery

- -

You can start up AutoRecovery using the autorecovery command of the bookkeeper CLI tool.

- -
$ bookkeeper-server/bin/bookkeeper autorecovery
-
-
- -
-

The most important thing to ensure when starting up AutoRecovery is that the ZooKeeper connection string specified by the zkServers parameter points to the right ZooKeeper cluster.

-
- -

If you start up AutoRecovery on a machine that is already running a bookie, then the AutoRecovery process will run alongside the bookie on a separate thread.

- -

You can also start up AutoRecovery on a fresh machine if you’d like to create a dedicated cluster of AutoRecovery nodes.

- -

Configuration

- -

There are a handful of AutoRecovery-related configs in the bk_server.conf configuration file. For a listing of those configs, see AutoRecovery settings.

- -

Disable AutoRecovery

- -

You can disable AutoRecovery at any time, for example during maintenance. Disabling AutoRecovery ensures that bookies’ data isn’t unnecessarily rereplicated when a bookie is only taken down for a short period of time, such as when the bookie is being updated or its configuration is being changed.

- -

You can disable AutoRecovery using the bookkeeper CLI tool:

- -
$ bookkeeper-server/bin/bookkeeper shell autorecovery -disable
-
-
- -

Once disabled, you can reenable AutoRecovery using the enable shell command:

- -
$ bookkeeper-server/bin/bookkeeper shell autorecovery -enable
-
-
- -

AutoRecovery architecture

- -

AutoRecovery has two components:

  1. The auditor (see the Auditor class) is a singleton node that watches bookies to see if they fail and creates rereplication tasks for the ledgers on failed bookies.
  2. The replication worker (see the ReplicationWorker class) runs on each bookie and executes rereplication tasks provided by the auditor.

Both of these components run as threads in the AutoRecoveryMain process, which runs on each bookie in the cluster. All recovery nodes participate in leader election—using ZooKeeper—to decide which node becomes the auditor. Nodes that fail to become the auditor watch the elected auditor and run an election process again if they see that the auditor node has failed.

- -

Auditor

- -

The auditor watches all bookies in the cluster that are registered with ZooKeeper. Bookies register with ZooKeeper at startup. If the bookie crashes or is killed, the bookie’s registration in ZooKeeper disappears and the auditor is notified of the change in the list of registered bookies.

- -

When the auditor sees that a bookie has disappeared, it immediately scans the complete ledger list to find ledgers that have data stored on the failed bookie. Once it has a list of ledgers for that bookie, the auditor will publish a rereplication task for each ledger under the /underreplicated/ znode in ZooKeeper.

- -

Replication Worker

- -

Each replication worker watches for tasks being published by the auditor on the /underreplicated/ znode in ZooKeeper. When a new task appears, the replication worker will try to get a lock on it. If it cannot acquire the lock, it will try the next entry. The locks are implemented using ZooKeeper ephemeral znodes.
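
The "try the lock, otherwise move on to the next entry" behaviour can be sketched as follows. This is a simplified in-memory model, not BookKeeper code: a plain dictionary stands in for the ZooKeeper ephemeral lock znodes, and `claim_next_task` and the worker names are hypothetical.

```python
from typing import Dict, List, Optional


def claim_next_task(tasks: List[int], locks: Dict[int, str], worker: str) -> Optional[int]:
    """Walk the pending ledger ids in order and lock the first one that is still free."""
    for ledger_id in tasks:
        if ledger_id not in locks:
            # Stand-in for creating an ephemeral lock znode for this ledger.
            locks[ledger_id] = worker
            return ledger_id
        # Lock already held by another worker: try the next entry.
    return None


pending = [7, 9]  # ledger ids with published rereplication tasks
locks: Dict[int, str] = {}
first = claim_next_task(pending, locks, "worker-a")
second = claim_next_task(pending, locks, "worker-b")
third = claim_next_task(pending, locks, "worker-c")
```

In the real system the lock is an ephemeral znode, so it goes away when the worker's ZooKeeper session ends; the dictionary here has no such behaviour.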

- -

The replication worker will scan through the rereplication task’s ledger for segments of which its local bookie is not a member. When it finds segments matching this criterion, it will replicate the entries of that segment to the local bookie. If, after this process, the ledger is fully replicated, the ledger’s entry under /underreplicated/ is deleted, and the lock is released. If there is a problem replicating, or there are still segments in the ledger which are still underreplicated [...]

If the replication worker finds a segment which needs rereplication, but does not have a defined endpoint (i.e. the final segment of a ledger currently being written to), it will wait for a grace period before attempting rereplication. If the segment needing rereplication still does not have a defined endpoint, the ledger is fenced and rereplication then takes place.

- -

This avoids the situation in which a client is writing to a ledger and one of the bookies goes down, but the client has not written an entry to that bookie before rereplication takes place. The client could continue writing to the old segment, even though the ensemble for the segment had changed. This could lead to data loss. Fencing prevents this scenario from happening. In the normal case, the client will try to write to the failed bookie within the grace period, and will have start [...]

You can configure this grace period using the openLedgerRereplicationGracePeriod parameter.
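
The decision described above (rereplicate immediately, wait, or fence first) can be written as a small function. This is a sketch only: the `Segment` type and the action names are invented for illustration, and both time arguments are assumed to be in the same unit as the openLedgerRereplicationGracePeriod setting.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # False for the final segment of a ledger that is still being written to.
    has_defined_endpoint: bool


def next_action(segment: Segment, waited: float, grace_period: float) -> str:
    """Decide what a replication worker does with a segment that needs rereplication."""
    if segment.has_defined_endpoint:
        return "rereplicate"
    if waited < grace_period:
        return "wait"
    # Still open after the grace period: fence the ledger, then rereplicate.
    return "fence_then_rereplicate"


closed = next_action(Segment(has_defined_endpoint=True), waited=0, grace_period=30)
still_waiting = next_action(Segment(has_defined_endpoint=False), waited=10, grace_period=30)
expired = next_action(Segment(has_defined_endpoint=False), waited=30, grace_period=30)
```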

- -

The rereplication process

- -

The ledger rereplication process happens in these steps:

  1. The client goes through all ledger segments in the ledger, selecting those that contain the failed bookie.
  2. A recovery process is initiated for each ledger segment in this list.
     1. The client selects a bookie to which all entries in the ledger segment will be replicated; in the case of AutoRecovery, this will always be the local bookie.
     2. The client reads entries that belong to the ledger segment from other bookies in the ensemble and writes them to the selected bookie.
     3. Once all entries have been replicated, the ZooKeeper metadata for the segment is updated to reflect the new ensemble.
     4. The segment is marked as fully replicated in the recovery tool.
  3. Once all ledger segments are marked as fully replicated, the ledger is marked as fully replicated.
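
The steps above can be modelled with plain Python data structures. This is a toy simulation under loud assumptions: bookies are dictionaries of entries, each segment is a dictionary with hypothetical `ensemble` and `entries` keys, and every surviving ensemble member is assumed to hold the entries being copied. It does not use any BookKeeper API.

```python
from typing import Dict, List

Store = Dict[str, Dict[int, bytes]]  # bookie address -> {entry id: payload}


def rereplicate(segments: List[dict], stores: Store, failed: str, target: str) -> int:
    """Copy the entries of every segment that lists `failed` onto `target`.

    Returns the number of segments that were repaired.
    """
    repaired = 0
    for segment in segments:
        ensemble = segment["ensemble"]
        if failed not in ensemble:
            continue  # step 1: only segments containing the failed bookie are selected
        survivors = [bookie for bookie in ensemble if bookie != failed]
        for entry_id in segment["entries"]:
            # Step 2.2: read the entry from a surviving bookie, write it to the target.
            source = next(b for b in survivors if entry_id in stores.get(b, {}))
            stores.setdefault(target, {})[entry_id] = stores[source][entry_id]
        # Step 2.3: only after all entries are copied is the ensemble metadata updated.
        segment["ensemble"] = [target if b == failed else b for b in ensemble]
        repaired += 1
    return repaired


stores: Store = {"b2": {0: b"a", 1: b"b", 2: b"c"}, "b3": {2: b"c"}}
segments = [
    {"ensemble": ["b1", "b2"], "entries": [0, 1]},
    {"ensemble": ["b2", "b3"], "entries": [2]},
]
repaired = rereplicate(segments, stores, failed="b1", target="b4")
```

Updating the ensemble last mirrors the ordering in the list: metadata changes only once the target bookie actually holds the data.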
-

An entry is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Entries are also known as records.

- -
- -
-

A ledger is a sequence of entries written to BookKeeper. Entries are written sequentially to ledgers and at most once, giving ledgers append-only semantics.

- -
- -
-

A bookie is an individual BookKeeper storage server.

- -

Bookies store the content of ledgers and act as a distributed ensemble.

- -
- -
-

A subsystem that runs in the background on bookies to ensure that ledgers are fully replicated even if one bookie from the ensemble is down.

- -
- -
-

Striping is the process of distributing BookKeeper ledgers to sub-groups of bookies rather than to all bookies in a BookKeeper ensemble.

- -

Striping is essential to ensuring fast performance.

- -
- -
-

A journal file stores BookKeeper transaction logs.

- -
- -
-

When a reader forces a ledger to close, preventing any further entries from being written to the ledger.

- -
- -
-

A record is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Records are also known as entries.

- -
diff --git a/content/docs/admin/bookies/index.html b/content/docs/admin/bookies/index.html
deleted file mode 100644
index 8908ed1..0000000
--- a/content/docs/admin/bookies/index.html
+++ /dev/null
@@ -1,751 +0,0 @@
- Apache BookKeeper - BookKeeper administration

A guide to deploying and administering BookKeeper


This document is a guide to deploying, administering, and maintaining BookKeeper. It also discusses best practices and common problems.

- -

Requirements

- -

A typical BookKeeper installation consists of an ensemble of bookies and a ZooKeeper quorum. The exact number of bookies depends on the quorum mode that you choose, desired throughput, and the number of clients using the installation simultaneously.

- -

The minimum number of bookies depends on the type of installation:

  • For self-verifying entries you should run at least three bookies. In this mode, clients store a message authentication code along with each entry.
  • For generic entries you should run at least four bookies.

There is no upper limit on the number of bookies that you can run in a single ensemble.

- -

Performance

- -

To achieve optimal performance, BookKeeper requires each server to have at least two disks. It’s possible to run a bookie with a single disk but performance will be significantly degraded.

- -

ZooKeeper

- -

There is no constraint on the number of ZooKeeper nodes you can run with BookKeeper. A single machine running ZooKeeper in standalone mode is sufficient for BookKeeper, although for the sake of higher resilience we recommend running ZooKeeper in quorum mode with multiple servers.

- -

Starting and stopping bookies

- -

You can run bookies either in the foreground or in the background, using nohup. You can also run local bookies for development purposes.

- -

To start a bookie in the foreground, use the bookie command of the bookkeeper CLI tool:

- -
$ bookkeeper-server/bin/bookkeeper bookie
-
-
- -

To start a bookie in the background, use the bookkeeper-daemon.sh script and run start bookie:

- -
$ bookkeeper-server/bin/bookkeeper-daemon.sh start bookie
-
-
- -

Local bookies

- -

The instructions above showed you how to run bookies intended for production use. If you’d like to experiment with ensembles of bookies locally, you can use the localbookie command of the bookkeeper CLI tool and specify the number of bookies you’d like to run.

- -

This would spin up a local ensemble of 6 bookies:

- -
$ bookkeeper-server/bin/bookkeeper localbookie 6
-
-
- -
-

When you run a local bookie ensemble, all bookies run in a single JVM process.

-
- -

Configuring bookies

- -

There’s a wide variety of parameters that you can set in the bookie configuration file in bookkeeper-server/conf/bk_server.conf of your BookKeeper installation. A full listing can be found in Bookie configuration.

- -

Some of the more important parameters to be aware of:

Parameter         | Description                                                                                             | Default
bookiePort        | The TCP port that the bookie listens on                                                                 | 3181
zkServers         | A comma-separated list of ZooKeeper servers in hostname:port format                                     | localhost:2181
journalDirectory  | The directory where the log device stores the bookie’s write-ahead log (WAL)                            | /tmp/bk-txn
ledgerDirectories | The directories where the ledger device stores the bookie’s ledger entries (as a comma-separated list)  | /tmp/bk-data

Ideally, the directories specified by journalDirectory and ledgerDirectories should be on different devices.

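
One rough way to check this recommendation before starting a bookie is to compare the device ids that the operating system reports for the configured directories. The helper below is a hypothetical sketch, not part of BookKeeper. It relies on `os.stat(...).st_dev`, which identifies the filesystem rather than the physical disk, so two partitions on one disk would still look like different devices.

```python
import os
import tempfile
from typing import Iterable


def share_a_device(journal_dir: str, ledger_dirs: Iterable[str]) -> bool:
    """Return True if the journal directory is on the same device as any ledger directory."""
    journal_dev = os.stat(journal_dir).st_dev
    return any(os.stat(path).st_dev == journal_dev for path in ledger_dirs)


# Demonstration with a throwaway directory: a directory trivially shares a device with itself.
with tempfile.TemporaryDirectory() as directory:
    same = share_a_device(directory, [directory])
```

In practice you would pass the values of journalDirectory and ledgerDirectories from bk_server.conf and treat a `True` result as a warning.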
- -

Logging

- -

BookKeeper uses slf4j for logging, with log4j bindings enabled by default.

- -

To enable logging for a bookie, create a log4j.properties file and point the BOOKIE_LOG_CONF environment variable to the configuration file. Here’s an example:

$ export BOOKIE_LOG_CONF=/some/path/log4j.properties
$ bookkeeper-server/bin/bookkeeper bookie

Upgrading

- -

From time to time you may need to make changes to the filesystem layout of bookies—changes that are incompatible with previous versions of BookKeeper and require that directories used with previous versions are upgraded. If a filesystem upgrade is required when updating BookKeeper, the bookie will fail to start and return an error like this:

- -
2017-05-25 10:41:50,494 - ERROR - [main:Bookie@246] - Directory layout version is less than 3, upgrade needed
-
-
- -

BookKeeper provides a utility for upgrading the filesystem. You can perform an upgrade using the upgrade command of the bookkeeper CLI tool. When running bookkeeper upgrade you need to specify one of three flags:

Flag       | Action
--upgrade  | Performs an upgrade
--rollback | Performs a rollback to the initial filesystem version
--finalize | Marks the upgrade as complete

Upgrade pattern

- -

A standard upgrade pattern is to run an upgrade…

- -
$ bookkeeper-server/bin/bookkeeper upgrade --upgrade
-
-
- -

…then check that everything is working normally, then kill the bookie. If everything is okay, finalize the upgrade…

- -
$ bookkeeper-server/bin/bookkeeper upgrade --finalize
-
-
- -

…and then restart the server:

- -
$ bookkeeper-server/bin/bookkeeper bookie
-
-
- -

If something has gone wrong, you can always perform a rollback:

- -
$ bookkeeper-server/bin/bookkeeper upgrade --rollback
-
-
- -

Formatting

- -

You can format bookie metadata in ZooKeeper using the metaformat command of the BookKeeper shell.

- -

By default, formatting is done in interactive mode, which prompts you to confirm the format operation if old data exists. You can disable confirmation using the -nonInteractive flag. If old data does exist, the format operation will abort unless you set the -force flag. Here’s an example:

- -
$ bookkeeper-server/bin/bookkeeper shell metaformat
-
-
- -

You can format the local filesystem data on a bookie using the bookieformat command on each bookie. Here’s an example:

- -
$ bookkeeper-server/bin/bookkeeper shell bookieformat
-
-
- -
-

The -force and -nonInteractive flags are also available for the bookieformat command.

-
- -

AutoRecovery

- -

For a guide to AutoRecovery in BookKeeper, see this doc.

- -

Missing disks or directories

- -

Accidentally replacing disks or removing directories can cause a bookie to fail while trying to read a ledger fragment that, according to the ledger metadata, exists on the bookie. For this reason, when a bookie is started for the first time, its disk configuration is fixed for the lifetime of that bookie. Any change to its disk configuration, such as a crashed disk or an accidental configuration change, will result in the bookie being unable to start. That will throw an error like this:

2017-05-29 18:19:13,790 - ERROR - [main:BookieServer@314] - Exception running bookie server :
org.apache.bookkeeper.bookie.BookieException$InvalidCookieException
.......at org.apache.bookkeeper.bookie.Cookie.verify(Cookie.java:82)
.......at org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:275)
.......at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:351)

If the change was the result of an accidental configuration change, the change can be reverted and the bookie can be restarted. However, if the change cannot be reverted, such as is the case when you want to add a new disk or replace a disk, the bookie must be wiped and then all its data re-replicated onto it.

  1. Increment the bookiePort parameter in the bk_server.conf file.
  2. Ensure that all directories specified by journalDirectory and ledgerDirectories are empty.
  3. Start the bookie.
  4. Run the following command to re-replicate the data:

     $ bin/bookkeeper org.apache.bookkeeper.tools.BookKeeperTools \
       <zkserver> \
       <oldbookie> \
       <newbookie>

     The ZooKeeper server, old bookie, and new bookie are all identified by their external IP and port (for bookies, the bookiePort, which is 3181 by default). Here’s an example in which the new bookie uses the incremented port from step 1:

     $ bin/bookkeeper org.apache.bookkeeper.tools.BookKeeperTools \
       zk1.example.com:2181 \
       192.168.1.10:3181 \
       192.168.1.10:3182

     See the AutoRecovery documentation for more info on the re-replication process.
diff --git a/content/docs/admin/geo-replication/index.html b/content/docs/admin/geo-replication/index.html
deleted file mode 100644
index 190db57..0000000
--- a/content/docs/admin/geo-replication/index.html
+++ /dev/null
@@ -1,533 +0,0 @@
- Apache BookKeeper - Geo-replication

Replicate data across BookKeeper clusters


Geo-replication is the replication of data across BookKeeper clusters. In order to enable geo-replication for a group of BookKeeper clusters,

- -

Global ZooKeeper

- -

Setting up a global ZooKeeper quorum is a lot like setting up a cluster-specific quorum. The crucial difference is that

- -

Geo-replication across three clusters

- -

Let’s say that you want to set up geo-replication across clusters in regions A, B, and C. First, the BookKeeper clusters in each region must have their own local (cluster-specific) ZooKeeper quorum.

- -
-

BookKeeper clusters use global ZooKeeper only for metadata storage. Traffic from bookies to ZooKeeper should thus be fairly light in general.

-
- -

The crucial difference between using cluster-specific ZooKeeper and global ZooKeeper is that you need to point all bookies to the global ZooKeeper setup.

- -

Region-aware placement policy

- -

Autorecovery

- -
diff --git a/content/docs/admin/metrics/index.html b/content/docs/admin/metrics/index.html
deleted file mode 100644
index 6e19492..0000000
--- a/content/docs/admin/metrics/index.html
+++ /dev/null
@@ -1,594 +0,0 @@
- Apache BookKeeper - Metric collection

BookKeeper enables metrics collection through a variety of stats providers.

- -
-

For a full listing of available metrics, see the Metrics reference doc.

-
- -

Stats providers

- -

BookKeeper has stats provider implementations for five sinks:

Provider                 | Provider class name
Codahale Metrics         | org.apache.bookkeeper.stats.CodahaleMetricsProvider
Prometheus               | org.apache.bookkeeper.stats.PrometheusMetricsProvider
Finagle                  | org.apache.bookkeeper.stats.FinagleStatsProvider
Ostrich                  | org.apache.bookkeeper.stats.OstrichProvider
Twitter Science Provider | org.apache.bookkeeper.stats.TwitterStatsProvider
-

The Codahale Metrics stats provider is the default provider.

-
- -

Enabling stats providers in bookies

- -

There are two stats-related configuration parameters available for bookies:

Parameter          | Description                                  | Default
enableStatistics   | Whether statistics are enabled for the bookie | false
statsProviderClass | The stats provider class used by the bookie  | org.apache.bookkeeper.stats.CodahaleMetricsProvider

To enable stats:

  • set the enableStatistics parameter to true
  • set statsProviderClass to the desired provider (see the table above for a listing of classes)
diff --git a/content/docs/admin/perf/index.html b/content/docs/admin/perf/index.html
deleted file mode 100644
index 210804a..0000000
--- a/content/docs/admin/perf/index.html
+++ /dev/null
@@ -1,511 +0,0 @@
- Apache BookKeeper - Performance tuning

Performance tuning

diff --git a/content/docs/admin/placement/index.html b/content/docs/admin/placement/index.html
deleted file mode 100644
index b578f64..0000000
--- a/content/docs/admin/placement/index.html
+++ /dev/null
@@ -1,511 +0,0 @@
- Apache BookKeeper - Customized placement policies

Customized placement policies

diff --git a/content/docs/api/distributedlog-api/index.html b/content/docs/api/distributedlog-api/index.html
deleted file mode 100644
index fbf6308..0000000
--- a/content/docs/api/distributedlog-api/index.html
+++ /dev/null
@@ -1,921 +0,0 @@
- Apache BookKeeper - DistributedLog

A higher-level API for managing BookKeeper entries

-
- -
- -
-
-
-

DistributedLog began its life as a separate project under the Apache Software Foundation. It was merged into BookKeeper in 2017.

-
- -

The DistributedLog API is an easy-to-use interface for managing BookKeeper entries that enables you to use BookKeeper without needing to interact with ledgers directly.

- -

DistributedLog (DL) maintains sequences of records in categories called logs (aka log streams). Writers append records to DL logs, while readers fetch and process those records.

- -

Architecture

- -

The diagram below illustrates how the DistributedLog API works with BookKeeper:

- -

DistributedLog API

- -

Logs

- -

A log in DistributedLog is an ordered, immutable sequence of log records.

- -

The diagram below illustrates the anatomy of a log stream:

- -

DistributedLog log

- -

Log records

- -

Each log record is a sequence of bytes. Applications are responsible for serializing and deserializing byte sequences stored in log records.

- -

Log records are written sequentially into a log stream and are each assigned a unique sequence number called a DLSN (DistributedLog Sequence Number).

- -

In addition to a DLSN, applications can assign their own sequence number when constructing log records. Application-defined sequence numbers are known as TransactionIDs (or txid). Either a DLSN or a TransactionID can be used for positioning readers to start reading from a specific log record.
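To make the two positioning options concrete, here is a minimal in-memory sketch. Every name in it (SketchDlsn, SketchRecord, SketchLog) is hypothetical and is not part of the DistributedLog API. The three-field layout of the position type and the assumption that TransactionIDs never decrease within a log are illustrative choices, not statements about the real implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical position type: ordered by segment, then entry, then slot. */
final class SketchDlsn implements Comparable<SketchDlsn> {
    final long segmentSeqNo;
    final long entryId;
    final long slotId;

    SketchDlsn(long segmentSeqNo, long entryId, long slotId) {
        this.segmentSeqNo = segmentSeqNo;
        this.entryId = entryId;
        this.slotId = slotId;
    }

    @Override
    public int compareTo(SketchDlsn other) {
        int c = Long.compare(segmentSeqNo, other.segmentSeqNo);
        if (c == 0) c = Long.compare(entryId, other.entryId);
        if (c == 0) c = Long.compare(slotId, other.slotId);
        return c;
    }
}

/** A record carries the system-assigned position and the application's txid. */
final class SketchRecord {
    final SketchDlsn dlsn;
    final long txid;
    final byte[] payload;

    SketchRecord(SketchDlsn dlsn, long txid, byte[] payload) {
        this.dlsn = dlsn;
        this.txid = txid;
        this.payload = payload;
    }
}

/** In-memory stand-in for a log stream; records are appended in DLSN order. */
final class SketchLog {
    private final List<SketchRecord> records = new ArrayList<>();

    void append(SketchRecord record) {
        records.add(record);
    }

    /** Index of the first record at or after the given DLSN, or size() if none. */
    int positionByDlsn(SketchDlsn from) {
        for (int i = 0; i < records.size(); i++) {
            if (records.get(i).dlsn.compareTo(from) >= 0) return i;
        }
        return records.size();
    }

    /** Index of the first record whose txid is at or after fromTxid, or size() if none. */
    int positionByTxid(long fromTxid) {
        for (int i = 0; i < records.size(); i++) {
            if (records.get(i).txid >= fromTxid) return i;
        }
        return records.size();
    }
}
```

A reader in this model would call positionByDlsn or positionByTxid once and then iterate forward from the returned index.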

- -

Log segments

- -

Each log is broken down into log segments that contain subsets of records. Log segments are distributed and stored in BookKeeper. DistributedLog rolls the log segments based on the configured rolling policy, which can be either

- -
    -
  • a configurable period of time (such as every 2 hours), or
  • -
  • a configurable maximum size (such as every 128 MB).
  • -
- -
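The two rolling policies listed above can be sketched as a single check. SketchRollingPolicy and its factory methods are hypothetical names chosen for illustration; the real rolling policy is driven by DistributedLog configuration, which this sketch does not attempt to reproduce.

```java
/** Hypothetical rolling policy: roll a segment by age or by size. */
final class SketchRollingPolicy {
    private final long maxAgeMillis; // Long.MAX_VALUE disables the time check
    private final long maxBytes;     // Long.MAX_VALUE disables the size check

    private SketchRollingPolicy(long maxAgeMillis, long maxBytes) {
        this.maxAgeMillis = maxAgeMillis;
        this.maxBytes = maxBytes;
    }

    /** Roll after a configurable period of time, e.g. every 2 hours. */
    static SketchRollingPolicy byTime(long maxAgeMillis) {
        return new SketchRollingPolicy(maxAgeMillis, Long.MAX_VALUE);
    }

    /** Roll after a configurable maximum size, e.g. every 128 MB. */
    static SketchRollingPolicy bySize(long maxBytes) {
        return new SketchRollingPolicy(Long.MAX_VALUE, maxBytes);
    }

    /** True when the current segment should be closed and a new one started. */
    boolean shouldRoll(long segmentAgeMillis, long segmentBytes) {
        return segmentAgeMillis >= maxAgeMillis || segmentBytes >= maxBytes;
    }
}
```

A writer in this model would call shouldRoll before each append and start a new segment when it returns true.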

The data in logs is divided up into equally sized log segments and distributed evenly across bookies. This allows logs to scale beyond a size that would fit on a single server and spreads read traffic across the cluster.

- -

Namespaces

- -

Log streams that belong to the same organization are typically categorized and managed under a namespace. DistributedLog namespaces essentially enable applications to locate log streams. Applications can perform the following actions under a namespace:

- -
    -
  • create streams
  • -
  • delete streams
  • -
  • truncate streams to a given sequence number (either a DLSN or a TransactionID)
  • -
- -

Writers

- -

Through the DistributedLog API, writers write data into logs of their choice. All records are appended into logs in order. The sequencing is performed by the writer, which means that there is only one active writer for a log at any given time.

- -

If a network partition causes two writers to attempt to write to the same log, DistributedLog guarantees correctness by using a fencing mechanism in the log segment store.
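The idea behind fencing can be illustrated with a toy epoch check. This is a deliberately simplified stand-in, not how BookKeeper or DistributedLog implement fencing: SketchSegmentStore and its methods are hypothetical, and the sketch assumes that a new writer always fences the store before it writes.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy log segment store that rejects writes from fenced (stale) writers. */
final class SketchSegmentStore {
    private long highestEpoch = 0;
    private final List<String> records = new ArrayList<>();

    /** A new writer calls this first; every writer with a lower epoch is now fenced. */
    synchronized long fenceAndAcquire() {
        return ++highestEpoch;
    }

    /** Appends the record only if the caller still holds the latest epoch. */
    synchronized boolean append(long writerEpoch, String record) {
        if (writerEpoch < highestEpoch) {
            return false; // stale writer: its write must not reach the log
        }
        records.add(record);
        return true;
    }

    /** Copy of the records accepted so far. */
    synchronized List<String> snapshot() {
        return new ArrayList<>(records);
    }
}
```

In this model a partitioned writer that comes back after a new writer has taken over sees its appends rejected, so the log contains only records from whichever writer was active at the time.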

- -

Write Proxy

- -

Log writers are served and managed in a service tier called the Write Proxy (see the diagram above). The Write Proxy is used for accepting writes from a large number of clients.

- -

Readers

- -

DistributedLog readers read records from logs of their choice, starting with a provided position. The provided position can be either a DLSN or a TransactionID.

- -

Readers read records from logs in strict order. Different readers can read records from different positions in the same log.

- -

Unlike other pub-sub systems, DistributedLog doesn’t record or manage readers’ positions. This means that tracking is the responsibility of applications, as different applications may have different requirements for tracking and coordinating positions. This is hard to get right with a single approach. Distributed databases, for example, might store reader positions along with SSTables, so they would resume applying transactions from the positions stored in SSTables. Tracking reader pos [...] - -

Read Proxy

- -

Log records can be cached in a service tier called the Read Proxy to serve a large number of readers. See the diagram above. The Read Proxy is the analogue of the Write Proxy.

- -

Guarantees

- -

The DistributedLog API for BookKeeper provides a number of guarantees for applications:

- -
    -
  • Records written by a writer to a log are appended in the order in which they are written. If a record R1 is written before a record R2 by the same writer, R1 will have a smaller sequence number than R2.
  • -
  • Readers see records in the same order in which they are written to the log.
  • -
  • All records are persisted on disk by BookKeeper before acknowledgements, which guarantees durability.
  • -
  • For a log with a replication factor of N, DistributedLog tolerates up to N-1 server failures without losing any records.
  • -
- -

API

- -

Documentation for the DistributedLog API can be found here.

- -
-

At a later date, the DistributedLog API docs will be added here.

-
diff --git a/content/docs/api/ledger-api/index.html b/content/docs/api/ledger-api/index.html
deleted file mode 100644
index 8780d26..0000000
--- a/content/docs/api/ledger-api/index.html
+++ /dev/null
@@ -1,1023 +0,0 @@
- Apache BookKeeper - The Ledger API
- - - - -
-
-
-
- - - -
-
- -
-
- - - -
- -
- -
-
-

The ledger API is a lower-level API for BookKeeper that enables you to interact with ledgers directly.

- -

The Java ledger API client

- -

To get started with the Java client for BookKeeper, install the bookkeeper-server library as a dependency in your Java application.

- -
-

For a more in-depth tutorial that involves a real use case for BookKeeper, see the Example application guide.

-
- -

Installation

- -

The BookKeeper Java client library is available via Maven Central and can be installed using Maven, Gradle, and other build tools.

- -

Maven

- -

If you’re using Maven, add this to your pom.xml build configuration file:

- -
<!-- in your <properties> block -->
-<bookkeeper.version>4.5.0</bookkeeper.version>
-
-<!-- in your <dependencies> block -->
-<dependency>
-  <groupId>org.apache.bookkeeper</groupId>
-  <artifactId>bookkeeper-server</artifactId>
-  <version>${bookkeeper.version}</version>
-</dependency>
-
-
- -

Gradle

- -

If you’re using Gradle, add this to your build.gradle build configuration file:

- -
dependencies {
-    compile group: 'org.apache.bookkeeper', name: 'bookkeeper-server', version: '4.5.0'
-}
-
-// Alternatively:
-dependencies {
-    compile 'org.apache.bookkeeper:bookkeeper-server:4.5.0'
-}
-
-
- -

Connection string

- -

When interacting with BookKeeper using the Java client, you need to provide your client with a connection string, for which you have three options:

- -
    -
  • Provide your entire ZooKeeper connection string, for example zk1:2181,zk2:2181,zk3:2181.
  • -
  • Provide a host and port for one node in your ZooKeeper cluster, for example zk1:2181. In general, it’s better to provide a full connection string (in case the ZooKeeper node you attempt to connect to is down).
  • -
  • If your ZooKeeper cluster can be discovered via DNS, you can provide the DNS name, for example my-zookeeper-cluster.com.
  • -
- -
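The three connection string forms above differ only in how many hosts they name and whether a port is given. As a rough illustration, the sketch below normalizes a connection string into host:port pairs. SketchZkConnectString is a hypothetical helper, not part of the BookKeeper or ZooKeeper client. It assumes port 2181 when no port is given and does not handle IPv6 literals.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits a ZooKeeper connection string into host:port pairs. */
final class SketchZkConnectString {
    static final int DEFAULT_PORT = 2181; // assumed default when no port is given

    static List<String> hostPorts(String connectString) {
        List<String> result = new ArrayList<>();
        for (String part : connectString.split(",")) {
            String hostPort = part.trim();
            if (hostPort.isEmpty()) {
                continue; // tolerate stray commas
            }
            // Append the default port when the entry is a bare host or DNS name
            result.add(hostPort.contains(":") ? hostPort : hostPort + ":" + DEFAULT_PORT);
        }
        return result;
    }
}
```

For example, hostPorts("zk1:2181,zk2:2181,zk3:2181") yields three entries, while a bare DNS name yields a single entry with the default port appended.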

Creating a new client

- -

In order to create a new BookKeeper client object, you need to pass in a connection string. Here is an example client object using a ZooKeeper connection string:

- -
try {
-    String connectionString = "127.0.0.1:2181"; // For a single-node, local ZooKeeper cluster
-    BookKeeper bkClient = new BookKeeper(connectionString);
-} catch (InterruptedException | IOException | KeeperException e) {
-    e.printStackTrace();
-}
-
-
- -
-

If you’re running BookKeeper locally, using the localbookie command, use "127.0.0.1:2181" for your connection string, as in the example above.

-
- -

There are, however, other ways that you can create a client object:

- -
    -
  • -

    By passing in a ClientConfiguration object. Here’s an example:

    - -
    ClientConfiguration config = new ClientConfiguration();
    -config.setZkServers(zkConnectionString);
    -config.setAddEntryTimeout(2000);
    -BookKeeper bkClient = new BookKeeper(config);
    -
    -
    -
  • -
  • -

    By specifying a ClientConfiguration and a ZooKeeper client object:

    - -
    ClientConfiguration config = new ClientConfiguration();
    -config.setAddEntryTimeout(5000);
    -ZooKeeper zkClient = new ZooKeeper(/* client args */);
    -BookKeeper bkClient = new BookKeeper(config, zkClient);
    -
    -
    -
  • -
  • -

    Using the forConfig method:

    - -
    BookKeeper bkClient = BookKeeper.forConfig(conf).build();
    -
    -
    -
  • -
- -

Creating ledgers

- -

The easiest way to create a ledger using the Java client is via the createLedger method, which creates a new ledger synchronously and returns a LedgerHandle. You must specify at least a DigestType and a password. Here’s an example:

- -
byte[] password = "some-password".getBytes();
-LedgerHandle handle = bkClient.createLedger(BookKeeper.DigestType.MAC, password);
-
-
- -

You can also create ledgers asynchronously.

- -

Create ledgers asynchronously

- -
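class LedgerCreationCallback implements AsyncCallback.CreateCallback {
-    public void createComplete(int returnCode, LedgerHandle handle, Object ctx) {
-        // Report success only when the operation actually succeeded
-        if (returnCode == BKException.Code.OK) {
-            System.out.println("Ledger successfully created");
-        } else {
-            System.err.println("Ledger creation failed: " + BKException.getMessage(returnCode));
-        }
-    }
-}
-
-bkClient.asyncCreateLedger(
-        3, // ensemble size
-        2, // write quorum size
-        BookKeeper.DigestType.MAC,
-        password,
-        new LedgerCreationCallback(),
-        "some context"
-);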
-
-
- -

Adding entries to ledgers

- -
long entryId = ledger.addEntry("Some entry data".getBytes());
-
-
- -

Add entries asynchronously

- -

Reading entries from ledgers

- -
Enumeration<LedgerEntry> entries = handle.readEntries(1, 99);
-
-
- -

To read all possible entries from the ledger:

- -
Enumeration<LedgerEntry> entries =
-  handle.readEntries(0, handle.getLastAddConfirmed());
-
-while (entries.hasMoreElements()) {
-    LedgerEntry entry = entries.nextElement();
-    System.out.println("Successfully read entry " + entry.getEntryId());
-}
-
-
- -

Reading entries after the LastAddConfirmed range

- -

The readUnconfirmedEntries method lets you read entries beyond the LastAddConfirmed range. The client reads without checking its local value of LastAddConfirmed, so it can read entries for which the writer has not yet received an acknowledgement. For entries within the range 0..LastAddConfirmed, BookKeeper guarantees that the writer has successfully received the acknowledgement. For entries outside that range, the writer may never have received the acknowledgement, so a reader risks seeing entries before the writer does, which can cause consistency issues in some cases. A single call can read entries both before and after LastAddConfirmed; the consistency guarantees for each entry are as described above.

- -
Enumeration<LedgerEntry> entries =
-  handle.readUnconfirmedEntries(0, lastEntryIdExpectedToRead);
-
-while (entries.hasMoreElements()) {
-    LedgerEntry entry = entries.nextElement();
-    System.out.println("Successfully read entry " + entry.getEntryId());
-}
-
-
- -
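The difference between the two read paths can be modelled with a small in-memory ledger. This is only a toy: SketchLedger and its methods are hypothetical. It uses an IllegalArgumentException where the real client would report an error through its own exception types, and it assumes acknowledgements arrive in entry order.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy ledger that separates confirmed reads from unconfirmed reads. */
final class SketchLedger {
    private final List<String> entries = new ArrayList<>();
    private long lastAddConfirmed = -1; // nothing acknowledged yet

    /** Stores an entry and returns its id; the entry is not yet acknowledged. */
    long add(String data) {
        entries.add(data);
        return entries.size() - 1;
    }

    /** Marks all entries up to entryId as acknowledged to the writer. */
    void acknowledgeUpTo(long entryId) {
        long highestStored = entries.size() - 1;
        lastAddConfirmed = Math.max(lastAddConfirmed, Math.min(entryId, highestStored));
    }

    long getLastAddConfirmed() {
        return lastAddConfirmed;
    }

    /** Reads only within 0..LastAddConfirmed. */
    List<String> readEntries(long first, long last) {
        if (last > lastAddConfirmed) {
            throw new IllegalArgumentException("entry " + last + " is beyond LastAddConfirmed");
        }
        return slice(first, last);
    }

    /** Reads any stored entry, including ones the writer has not seen acknowledged. */
    List<String> readUnconfirmedEntries(long first, long last) {
        return slice(first, last);
    }

    private List<String> slice(long first, long last) {
        if (first < 0 || first > last || last >= entries.size()) {
            throw new IllegalArgumentException("invalid range " + first + ".." + last);
        }
        return new ArrayList<>(entries.subList((int) first, (int) last + 1));
    }
}
```

In this model, an entry that has been stored but not acknowledged is visible only through readUnconfirmedEntries, which is the source of the consistency risk described above.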

Deleting ledgers

- -

Ledgers can also be deleted synchronously or asynchronously.

- -
long ledgerId = 1234;
-
-try {
-    bkClient.deleteLedger(ledgerId);
-} catch (Exception e) {
-  e.printStackTrace();
-}
-
-
- -

Delete ledgers asynchronously

- -

Exceptions thrown:

- -

*

- -
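class DeleteEntryCallback implements AsyncCallback.DeleteCallback {
-    // The callback receives the operation's return code and the caller's context object
-    public void deleteComplete(int returnCode, Object ctx) {
-        System.out.println("Delete completed");
-    }
-}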
-
-
- -

Simple example

- -
-

For a more involved BookKeeper client example, see the example application below.

-
- -

In the code sample below, a BookKeeper client:

- -
    -
  • creates a ledger
  • -
  • writes entries to the ledger
  • -
  • closes the ledger (meaning no further writes are possible)
  • -
  • re-opens the ledger for reading
  • -
  • reads all available entries
  • -
- -
// Create a client object for the local ensemble. This
-// operation throws multiple exceptions, so make sure to
-// use a try/catch block when instantiating client objects.
-BookKeeper bkc = new BookKeeper("localhost:2181");
-
-// A password for the new ledger
-byte[] ledgerPassword = /* some sequence of bytes, perhaps random */;
-
-// Create a new ledger and fetch its identifier
-LedgerHandle lh = bkc.createLedger(BookKeeper.DigestType.MAC, ledgerPassword);
-long ledgerId = lh.getId();
-
-// Create a buffer for four-byte entries
-ByteBuffer entry = ByteBuffer.allocate(4);
-
-int numberOfEntries = 100;
-
-// Add entries to the ledger, then close it
-for (int i = 0; i < numberOfEntries; i++){
-	entry.putInt(i);
-	entry.position(0);
-	lh.addEntry(entry.array());
-}
-lh.close();
-
-// Open the ledger for reading
-lh = bkc.openLedger(ledgerId, BookKeeper.DigestType.MAC, ledgerPassword);
-
-// Read all available entries
-Enumeration<LedgerEntry> entries = lh.readEntries(0, numberOfEntries - 1);
-
-while(entries.hasMoreElements()) {
-    ByteBuffer result = ByteBuffer.wrap(entries.nextElement().getEntry());
-    Integer retrEntry = result.getInt();
-
-    // Print the integer stored in each entry
-    System.out.println(String.format("Result: %s", retrEntry));
-}
-
-// Close the ledger and the client
-lh.close();
-bkc.close();
-
-
- -

Running this should return this output:

- -
Result: 0
-Result: 1
-Result: 2
-# etc
-
-
- -

Example application

- -

This tutorial walks you through building an example application that uses BookKeeper as the replicated log. The application uses the BookKeeper Java client to interact with BookKeeper.

- -
-

The code for this tutorial can be found in this GitHub repo. The final code for the Dice class can be found here.

-
- -

Setup

- -

Before you start, you will need to have a BookKeeper cluster running locally on your machine. For installation instructions, see Installation.

- -

To start up a cluster consisting of six bookies locally:

- -
$ bookkeeper-server/bin/bookkeeper localbookie 6
-
-
- -

You can specify a different number of bookies if you’d like.

- -

Goal

- -

The goal of the dice application is to have

- -
    -
  • multiple instances of this application,
  • -
  • possibly running on different machines,
  • -
  • all of which display the exact same sequence of numbers.
  • -
- -

In other words, the log needs to be both durable and consistent, regardless of how many bookies are participating in the BookKeeper ensemble. If one of the bookies crashes or becomes unable to communicate with the other bookies in any way, every application instance should still display the same sequence of numbers as the others. This tutorial will show you how to achieve this.

- -

To begin, download the base application, compile and run it.

- -
$ git clone https://github.com/ivankelly/bookkeeper-tutorial.git
-$ cd bookkeeper-tutorial
-$ mvn package
-$ mvn exec:java -Dexec.mainClass=org.apache.bookkeeper.Dice
-
-
- -

That should yield output that looks something like this:

- -
[INFO] Scanning for projects...
-[INFO]                                                                         
-[INFO] ------------------------------------------------------------------------
-[INFO] Building tutorial 1.0-SNAPSHOT
-[INFO] ------------------------------------------------------------------------
-[INFO]
-[INFO] --- exec-maven-plugin:1.3.2:java (default-cli) @ tutorial ---
-[WARNING] Warning: killAfter is now deprecated. Do you need it ? Please comment on MEXEC-6.
-Value = 4
-Value = 5
-Value = 3
-
-
- -

The base application

- -

The application in this tutorial is a dice application. The Dice class below has a playDice function that generates a random number between 1 and 6 every second, prints the value of the dice roll, and runs indefinitely.

- -
public class Dice {
-    Random r = new Random();
-
-    void playDice() throws InterruptedException {
-        while (true) {
-            Thread.sleep(1000);
-            System.out.println("Value = " + (r.nextInt(6) + 1));
-        }
-    }
-}
-
-
- -

When you run the main function of this class, a new Dice object will be instantiated and then run indefinitely:

- -
public class Dice {
-    // other methods
-
-    public static void main(String[] args) throws InterruptedException {
-        Dice d = new Dice();
-        d.playDice();
-    }
-}
-
-
- -

Leaders and followers (and a bit of background)

- -

To achieve this common view in multiple instances of the program, we need each instance to agree on what the next number in the sequence will be. For example, the instances must agree that 4 is the first number and 2 is the second number and 5 is the third number and so on. This is a difficult problem, especially in the case that any instance may go away at any time, and messages between the instances can be lost or reordered.

- -

Luckily, there are already algorithms to solve this. Paxos is an abstract algorithm to implement this kind of agreement, while Zab and Raft are more practical protocols. This video gives a good overview about how these algorithms usually look. They all have a similar core.

- -

It would be possible to run Paxos to agree on each number in the sequence. However, running Paxos each time can be expensive. Instead, Zab and Raft use a Paxos-like algorithm to elect a leader. The leader then decides what the sequence of events should be, putting them in a log, which the other instances can then follow to maintain the same state as the leader.

- -

BookKeeper provides the functionality for the second part of the protocol, allowing a leader to write events to a log and have multiple followers tailing the log. However, BookKeeper does not do leader election. You will need a ZooKeeper or Raft instance for that purpose.
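The leader-writes, followers-tail pattern described above can be sketched without any BookKeeper dependency. The classes below are hypothetical in-memory stand-ins: SketchReplicatedLog plays the role of the ledger, and SketchFollower is an application instance that replays the log from wherever it last stopped.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory stand-in for the shared log that the leader appends to. */
final class SketchReplicatedLog {
    private final List<Integer> entries = new ArrayList<>();

    /** Called by the leader only; returns the offset of the new entry. */
    synchronized long append(int value) {
        entries.add(value);
        return entries.size() - 1;
    }

    /** Returns a copy of every entry at or after the given offset. */
    synchronized List<Integer> readFrom(long offset) {
        return new ArrayList<>(entries.subList((int) offset, entries.size()));
    }
}

/** A follower tails the log and applies entries in order. */
final class SketchFollower {
    private final List<Integer> seen = new ArrayList<>();

    /** Reads everything appended since the last call. */
    void catchUp(SketchReplicatedLog log) {
        seen.addAll(log.readFrom(seen.size()));
    }

    List<Integer> seen() {
        return new ArrayList<>(seen);
    }
}
```

Because every follower applies the same entries in the same order, two followers that have both caught up hold identical sequences, regardless of when each one started tailing.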

- -

Why not just use ZooKeeper?

- -

There are a number of reasons:

- -
    -
  1. ZooKeeper’s log is only exposed through a tree-like interface. It can be hard to shoehorn your application into this.
  2. A ZooKeeper ensemble of multiple machines is limited to one log. You may want one log per resource, which will become expensive very quickly.
  3. Adding extra machines to a ZooKeeper ensemble increases neither capacity nor throughput.
- -

BookKeeper can be seen as a means of exposing ZooKeeper’s replicated log to applications in a scalable fashion. ZooKeeper is still used by BookKeeper, however, to maintain consistency guarantees, though clients don’t need to interact with ZooKeeper directly.

- -

Electing a leader

- -

We’ll use ZooKeeper to elect a leader. A ZooKeeper instance was started locally when you started the localbookie application above. To verify that it’s running, run the following command:

- -
$ echo stat | nc localhost 2181
-Zookeeper version: 3.4.6-1569965, built on 02/20/2014 09:09 GMT
-Clients:
- /127.0.0.1:59343[1](queued=0,recved=40,sent=41)
- /127.0.0.1:49354[1](queued=0,recved=11,sent=11)
- /127.0.0.1:49361[0](queued=0,recved=1,sent=0)
- /127.0.0.1:59344[1](queued=0,recved=38,sent=39)
- /127.0.0.1:59345[1](queued=0,recved=38,sent=39)
- /127.0.0.1:59346[1](queued=0,recved=38,sent=39)
-
-Latency min/avg/max: 0/0/23
-Received: 167
-Sent: 170
-Connections: 6
-Outstanding: 0
-Zxid: 0x11
-Mode: standalone
-Node count: 16
-
-
- -

To interact with ZooKeeper, we’ll use the Curator client rather than the stock ZooKeeper client. Getting things right with the ZooKeeper client can be tricky, and Curator removes a lot of the pointy corners for you. In fact, Curator even provides a leader election recipe, so we need to do very little work to get leader election in our application.

- -
public class Dice extends LeaderSelectorListenerAdapter implements Closeable {
-
-    final static String ZOOKEEPER_SERVER = "127.0.0.1:2181";
-    final static String ELECTION_PATH = "/dice-elect";
-
-    ...
-
-    Dice() throws InterruptedException {
-        curator = CuratorFrameworkFactory.newClient(ZOOKEEPER_SERVER,
-                2000, 10000, new ExponentialBackoffRetry(1000, 3));
-        curator.start();
-        curator.blockUntilConnected();
-
-        leaderSelector = new LeaderSelector(curator, ELECTION_PATH, this);
-        leaderSelector.autoRequeue();
-        leaderSelector.start();
-    }
-
-
- -

In the constructor for Dice, we need to create the Curator client. We specify four things when creating the client: the location of the ZooKeeper service, the session timeout, the connect timeout, and the retry policy.

- -

The session timeout is a ZooKeeper concept. If the ZooKeeper server doesn’t hear anything from the client for this amount of time, any leases which the client holds will be timed out. This is important in leader election. For leader election, the Curator client will take a lease on ELECTION_PATH. The first instance to take the lease will become leader and the rest will become followers. However, their claim on the lease will remain in the queue. If the first instance then goes away, due [...] - -

Finally, you’ll have noticed that Dice now extends LeaderSelectorListenerAdapter and implements Closeable. Closeable is there to close the resources we initialized in the constructor: the Curator client and the leaderSelector. LeaderSelectorListenerAdapter is a callback that the leaderSelector uses to notify the instance that it is now the leader. It is passed as the third argument to the LeaderSelector constructor.

- -
    @Override
-    public void takeLeadership(CuratorFramework client)
-            throws Exception {
-        synchronized (this) {
-            leader = true;
-            try {
-                while (true) {
-                    this.wait();
-                }
-            } catch (InterruptedException ie) {
-                Thread.currentThread().interrupt();
-                leader = false;
-            }
-        }
-    }
-
-
- -

takeLeadership() is the callback called by LeaderSelector when the instance becomes leader. It should only return when the instance wants to give up leadership. In our case, we never do, so we wait on the current object until we’re interrupted. To signal to the rest of the program that we are leader, we set a volatile boolean called leader to true. It is set back to false after we are interrupted.

- -
    void playDice() throws InterruptedException {
-        while (true) {
-            while (leader) {
-                Thread.sleep(1000);
-                System.out.println("Value = " + (r.nextInt(6) + 1)
-                                   + ", isLeader = " + leader);
-            }
-        }
-    }
-
-
- -

Finally, we modify the playDice function to only generate random numbers when it is the leader.

- -

Run two instances of the program in two different terminals. You’ll see that one becomes leader and prints numbers and the other just sits there.

- -

Now stop the leader using Control-Z. This will pause the process, but it won’t kill it. You will be dropped back to the shell in that terminal. After a couple of seconds (the session timeout), you will see that the other instance has become the leader. ZooKeeper will guarantee that only one instance is selected as leader at any time.

- -

Now go back to the shell that the original leader was on and wake up the process using fg. You’ll see something like the following:

- -
...
-...
-Value = 4, isLeader = true
-Value = 4, isLeader = true
-^Z
-[1]+  Stopped                 mvn exec:java -Dexec.mainClass=org.apache.bookkeeper.Dice
-$ fg
-mvn exec:java -Dexec.mainClass=org.apache.bookkeeper.Dice
-Value = 3, isLeader = true
-Value = 1, isLeader = false
-
-
- -
diff --git a/content/docs/api/overview/index.html b/content/docs/api/overview/index.html
deleted file mode 100644
index 8497266..0000000
--- a/content/docs/api/overview/index.html
+++ /dev/null
@@ -1,523 +0,0 @@
- Apache BookKeeper - The ledger API vs. the DistributedLog API
- - - - -
-
-
-
- - - -
-
- -
-
- - - -
- -
- -
-
-

BookKeeper offers two APIs that applications can use to interact with it:

- -
    -
  • The ledger API is a lower-level API that enables you to interact with ledgers directly.
  • -
  • The DistributedLog API is a higher-level API that provides convenient abstractions.
  • -
- -

Trade-offs

- -

The advantage of the ledger API is that it provides direct access to ledgers and thus enables you to use BookKeeper however you’d like. The disadvantage is that it requires you to manage things like leader election on your own.

- -

The advantage of the DistributedLog API is that it’s easier to use, with semantics resembling a simple key/value store from the standpoint of applications. The disadvantage is that

- -
- - -
-
- -
- - -
-

The ledger API vs. the DistributedLog API

- -
diff --git a/content/docs/deployment/dcos/index.html b/content/docs/deployment/dcos/index.html
deleted file mode 100644
index 7c32c8c..0000000
--- a/content/docs/deployment/dcos/index.html
+++ /dev/null
@@ -1,718 +0,0 @@
- Apache BookKeeper - Deploying BookKeeper on DC/OS
- - - - -
-
-
-
- - - -
-
- -
-
- - -

Get up and running easily on an Apache Mesos cluster

-
- -
- -
-
-

DC/OS (the DataCenter Operating System) is a distributed operating system used for deploying and managing applications and systems on Apache Mesos. DC/OS is an open-source tool created and maintained by Mesosphere.

- -

BookKeeper is available as a DC/OS package from the Mesosphere DC/OS Universe.

- -

Prerequisites

- -

In order to run BookKeeper on DC/OS, you will need:

- -
    -
  • DC/OS version 1.8 or higher
  • -
  • A DC/OS cluster with at least three nodes
  • -
  • The DC/OS CLI tool installed
  • -
- -

Each node in your DC/OS-managed Mesos cluster must have at least:

- -
    -
  • 1 CPU
  • -
  • 1 GB of memory
  • -
  • 10 GB of total persistent disk storage
  • -
- -

Installing BookKeeper

- -
$ dcos package install bookkeeper --yes
-
-
- -

This command will:

- -
    -
  • Install the bookkeeper subcommand for the dcos CLI tool
  • -
  • Start a single bookie on the Mesos cluster with the default configuration
  • -
- -

The bookie that is automatically started up uses the host mode of the network and by default exports the service at agent_ip:3181.

- -
-

If you run dcos package install bookkeeper without setting the --yes flag, the install will run in interactive mode. For more information on the package install command, see the DC/OS docs.

-
- -

Services

- -

To watch BookKeeper start up, click on the Services tab in the DC/OS user interface and you should see the bookkeeper package listed:

- -

DC/OS services

- -

Tasks

- -

To see which tasks have started, click on the bookkeeper service and you’ll see an interface that looks like this:

- -

DC/OS tasks

- -

Scaling BookKeeper

- -

Once the first bookie has started up, you can click on the Scale tab to scale up your BookKeeper ensemble by adding more bookies (or scale down the ensemble by removing bookies).

- -

DC/OS scale

- -

ZooKeeper Exhibitor

- -

ZooKeeper contains the information for all bookies in the ensemble. When deployed on DC/OS, BookKeeper uses a ZooKeeper instance provided by DC/OS. You can access a visual UI for ZooKeeper using Exhibitor, which is available at http://master.dcos/exhibitor.

- -

ZooKeeper Exhibitor

- -

You should see a listing of IP/host information for all bookies under the messaging/bookkeeper/ledgers/available node.

- -

Client connections

- -

To connect to bookies running on DC/OS using clients running within your Mesos cluster, you need to specify the ZooKeeper connection string for DC/OS’s ZooKeeper cluster:

- -
master.mesos:2181
-
-
- -

This is the only ZooKeeper host/port you need to include in your connection string. Here’s an example using the Java client:

- -
BookKeeper bkClient = new BookKeeper("master.mesos:2181");
-
-
- -

If you’re connecting using a client running outside your Mesos cluster, you need to supply the public-facing connection string for your DC/OS ZooKeeper cluster.

- -

Configuring BookKeeper

- -

By default, the bookkeeper package will start up a BookKeeper ensemble consisting of one bookie with one CPU, 1 GB of memory, and a 70 MB persistent volume.

- -

You can supply a non-default configuration when installing the package using a JSON file. Here’s an example command:

- -
$ dcos package install bookkeeper \
-  --options=/path/to/config.json
-
-
- -

You can then fetch the current configuration for BookKeeper at any time using the package describe command:

- -
$ dcos package describe bookkeeper \
-  --config
-
-
- -

Available parameters

- -
-

Not all configurable parameters for BookKeeper are available for BookKeeper on DC/OS. Only the parameters shown in the table below are available.

-
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Param | Type | Description | Default
name | String | The name of the DC/OS service. | bookkeeper
cpus | Integer | The number of CPU shares to allocate to each bookie. The minimum is 1. | 1
instances | Integer | The number of bookies to run. The minimum is 1. | 1
mem | Number | The memory, in MB, to allocate to each BookKeeper task | 1024.0 (1 GB)
volume_size | Number | The persistent volume size, in MB | 70
zk_client | String | The connection string for the ZooKeeper client instance | master.mesos:2181
service_port | Integer | The BookKeeper export service port, using PORT0 in Marathon | 3181
- -

Example JSON configuration

- -

Here’s an example JSON configuration object for BookKeeper on DC/OS:

- -
{
-  "instances": 5,
-  "cpus": 3,
-  "mem": 2048.0,
-  "volume_size": 250
-}
-
-
- -

If that configuration were stored in a file called bk-config.json, you could apply that configuration when installing the BookKeeper package using this command:

- -
$ dcos package install bookkeeper \
-  --options=./bk-config.json
-
-
- -

Uninstalling BookKeeper

- -

You can shut down and uninstall the bookkeeper package from DC/OS at any time using the package uninstall command:

- -
$ dcos package uninstall bookkeeper
-Uninstalled package [bookkeeper] version [4.5.0]
-Thank you for using bookkeeper.
-
-
- -
- - -
-
- - -
-
- - - -
-

An entry is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Entries are also known as records.

- -
- -
-

A ledger is a sequence of entries written to BookKeeper. Entries are written sequentially to ledgers and at most once, giving ledgers append-only semantics.

- -
- -
-

A bookie is an individual BookKeeper storage server.

- -

Bookies store the content of ledgers and act as a distributed ensemble.

- -
- -
-

A subsystem that runs in the background on bookies to ensure that ledgers are fully replicated even if one bookie from the ensemble is down.

- -
- -
-

Striping is the process of distributing BookKeeper ledgers to sub-groups of bookies rather than to all bookies in a BookKeeper ensemble.

- -

Striping is essential to ensuring fast performance.

- -
- -
-

A journal file stores BookKeeper transaction logs.

- -
- -
-

Fencing occurs when a reader forces a ledger to close, preventing any further entries from being written to the ledger.

- -
- -
-

A record is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Records are also known as entries.

- -
- - - - -
diff --git a/content/docs/deployment/kubernetes/index.html b/content/docs/deployment/kubernetes/index.html
deleted file mode 100644
index 4728aac..0000000
--- a/content/docs/deployment/kubernetes/index.html
+++ /dev/null
@@ -1,517 +0,0 @@
Apache BookKeeper - Deploying BookKeeper on Kubernetes
- - - - -
-
-
-
- - - -
-
- -
-
- - - -
- -
- -
-
- - -
- - -
-
- -
- - -
-

Deploying BookKeeper on Kubernetes

-
    -
-
- - - -
-
-
- - - -
-


- -
- - - - -
diff --git a/content/docs/deployment/manual/index.html b/content/docs/deployment/manual/index.html
deleted file mode 100644
index 3af82ab..0000000
--- a/content/docs/deployment/manual/index.html
+++ /dev/null
@@ -1,595 +0,0 @@
Apache BookKeeper - Manual deployment
- - - - -
-
-
-
- - - -
-
- -
-
- - - -
- -
- -
-
-

The easiest way to deploy BookKeeper is using schedulers like DC/OS, but you can also deploy BookKeeper clusters manually. A BookKeeper cluster consists of two main components:

- -
    -
  • A ZooKeeper cluster that is used for configuration- and coordination-related tasks
  • -
  • An ensemble of bookies
  • -
- -

ZooKeeper setup

- -

We won’t provide a full guide to setting up a ZooKeeper cluster here. We recommend that you consult this guide in the official ZooKeeper documentation.

- -

Starting up bookies

- -

Once your ZooKeeper cluster is up and running, you can start up as many bookies as you’d like to form a cluster. Before starting up each bookie, you need to modify the bookie’s configuration to make sure that it points to the right ZooKeeper cluster.

- -

On each bookie host, you need to download the BookKeeper package as a tarball. Once you’ve done that, you need to configure the bookie by setting values in the bookkeeper-server/conf/bk_server.conf config file. The one parameter that you will absolutely need to change is the zkServers parameter, which you will need [...] - -

zkServers=100.0.0.1:2181,100.0.0.2:2181,100.0.0.3:2181
-
-
- -
-

A full listing of configurable parameters available in bookkeeper-server/conf/bk_server.conf can be found in the Configuration reference manual.

-
- -

Once the bookie’s configuration is set, you can start it up using the bookie command of the bookkeeper CLI tool:

- -
$ bookkeeper-server/bin/bookkeeper bookie
-
-
- -
-

You can also build BookKeeper by cloning it from source or using Maven.

-
- -

System requirements

- -

The number of bookies you should run in a BookKeeper cluster depends on the quorum mode that you’ve chosen, the desired throughput, and the number of clients using the cluster simultaneously.

- - - - - - - - - - - - - - - - - - -
Quorum type | Number of bookies
Self-verifying quorum | 3
Generic | 4
- -

Increasing the number of bookies will enable higher throughput, and there is no upper limit on the number of bookies.

- -

Cluster metadata setup

- -

Once you’ve started up a cluster of bookies, you need to set up cluster metadata for the cluster by running the following command from any bookie in the cluster:

- -
$ bookkeeper-server/bin/bookkeeper shell metaformat
-
-
- -

You can run in the formatting

- -
-

The metaformat command performs all the necessary ZooKeeper cluster metadata tasks and thus only needs to be run once and from any bookie in the BookKeeper cluster.

-
- -

Once cluster metadata formatting has been completed, your BookKeeper cluster is ready to go!

- - - -
- - -
-
- - -
-
- - - -
-


- -
- - - - -
diff --git a/content/docs/development/codebase/index.html b/content/docs/development/codebase/index.html
deleted file mode 100644
index cc047f7..0000000
--- a/content/docs/development/codebase/index.html
+++ /dev/null
@@ -1,511 +0,0 @@
Apache BookKeeper - The BookKeeper codebase
- - - - -
-
-
-
- - - -
-
- -
-
- - - -
- -
- -
-
- - -
- - -
-
- -
- - -
-

The BookKeeper codebase

-
    -
-
- - - -
-
-
- - - -
-


- -
- - - - -
diff --git a/content/docs/development/protocol/index.html b/content/docs/development/protocol/index.html
deleted file mode 100644
index e922d7c..0000000
--- a/content/docs/development/protocol/index.html
+++ /dev/null
@@ -1,751 +0,0 @@
Apache BookKeeper - The BookKeeper protocol
- - - - -
-
-
-
- - - -
-
- -
-
- - - -
- -
- -
-
-

BookKeeper uses a special replication protocol for guaranteeing persistent storage of entries in an ensemble of bookies.

- -
-

This document assumes that you have some knowledge of leader election and log replication and how these can be used in a distributed system. If not, we recommend reading the example application documentation first.

-
- -

Ledgers

- -

Ledgers are the basic building block of BookKeeper and the level at which BookKeeper makes its persistent storage guarantees. A replicated log consists of an ordered list of ledgers. See Ledgers to logs for info on building a replicated log from ledgers.

- -

Ledgers are composed of metadata and entries. The metadata is stored in ZooKeeper, which provides a compare-and-swap (CAS) operation. Entries are stored on storage nodes known as bookies.

- -

A ledger has a single writer and multiple readers (SWMR).

- -

Ledger metadata

- -

A ledger’s metadata contains the following:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Parameter | Name | Meaning
Identifier | | A 64-bit integer, unique within the system
Ensemble size | E | The number of nodes the ledger is stored on
Write quorum size | Qw | The number of nodes each entry is written to. In effect, the max replication for the entry.
Ack quorum size | Qa | The number of nodes an entry must be acknowledged on. In effect, the minimum replication for the entry.
Current state | | The current status of the ledger. One of OPEN, CLOSED, or IN_RECOVERY.
Last entry | | The last entry in the ledger or NULL if the current state is not CLOSED.
- -

In addition, each ledger’s metadata consists of one or more fragments. Each fragment consists of

- -
    -
  • the first entry of the fragment and
  • -
  • a list of bookies for the fragment.
  • -
- -

When creating a ledger, the following invariant must hold:

- -

E >= Qw >= Qa

- -

Thus, the ensemble size (E) must be at least as large as the write quorum size (Qw), which must in turn be at least as large as the ack quorum size (Qa). If that condition does not hold, then the ledger creation operation will fail.
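
The invariant can be checked before a ledger is created. The following is a minimal sketch, not BookKeeper's own validation code: the class name QuorumConfig is hypothetical, and the extra requirement that Qa be at least 1 is an assumption of this example.

```java
/** Hypothetical holder for a ledger's quorum settings; not part of the BookKeeper API. */
final class QuorumConfig {
    final int ensembleSize;    // E
    final int writeQuorumSize; // Qw
    final int ackQuorumSize;   // Qa

    QuorumConfig(int ensembleSize, int writeQuorumSize, int ackQuorumSize) {
        // Assumption of this sketch: at least one acknowledgement is always required.
        if (ackQuorumSize < 1) {
            throw new IllegalArgumentException("Qa must be at least 1");
        }
        // The invariant from the text: E >= Qw >= Qa.
        if (ensembleSize < writeQuorumSize || writeQuorumSize < ackQuorumSize) {
            throw new IllegalArgumentException(
                "need E >= Qw >= Qa, got E=" + ensembleSize
                    + " Qw=" + writeQuorumSize + " Qa=" + ackQuorumSize);
        }
        this.ensembleSize = ensembleSize;
        this.writeQuorumSize = writeQuorumSize;
        this.ackQuorumSize = ackQuorumSize;
    }
}
```

With this sketch, new QuorumConfig(3, 3, 2) is accepted, while new QuorumConfig(2, 3, 2) is rejected because E is smaller than Qw.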

- -

Ensembles

- -

When a ledger is created, E bookies are chosen for the entries of that ledger. The bookies are the initial ensemble of the ledger. A ledger can have multiple ensembles, but an entry has only one ensemble. Changes in the ensemble involve a new fragment being added to the ledger.

- -

Take the following example. In this ledger, with ensemble size of 3, there are two fragments and thus two ensembles, one starting at entry 0, the second at entry 12. The second ensemble differs from the first only by its first element. This could be because bookie1 has failed and therefore had to be replaced.

- - - - - - - - - - - - - - - - - - -
First entry | Bookies
0 | B1, B2, B3
12 | B4, B2, B3
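
Looking up the ensemble for an entry amounts to finding the fragment with the greatest first entry id that does not exceed the entry id. The sketch below illustrates that lookup with a sorted map; the class and method names are hypothetical and the real metadata format is not modelled.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory view of a ledger's fragments. */
final class LedgerFragments {
    // First entry id of each fragment -> ensemble used from that entry onwards.
    private final TreeMap<Long, List<String>> fragments = new TreeMap<>();

    void addFragment(long firstEntryId, List<String> ensemble) {
        fragments.put(firstEntryId, ensemble);
    }

    /** Returns the ensemble of the fragment that contains entryId. */
    List<String> ensembleFor(long entryId) {
        // floorEntry finds the fragment with the largest first entry id <= entryId.
        Map.Entry<Long, List<String>> fragment = fragments.floorEntry(entryId);
        if (fragment == null) {
            throw new IllegalArgumentException("no fragment covers entry " + entryId);
        }
        return fragment.getValue();
    }
}
```

For the two fragments in the table, entry 11 maps to the ensemble starting at entry 0, and entry 12 maps to the ensemble in which B4 has replaced B1.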
- -

Write quorums

- -

Each entry in the log is written to Qw nodes. This is considered the write quorum for that entry. The write quorum is the subsequence of the ensemble, Qw in length, and starting at the bookie at index (entryid % E).

- -

For example, in a ledger of E = 4, Qw = 3, and Qa = 2, with an ensemble consisting of B1, B2, B3, and B4, the write quorums for the first 6 entries will be:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Entry | Write quorum
0 | B1, B2, B3
1 | B2, B3, B4
2 | B3, B4, B1
3 | B4, B1, B2
4 | B1, B2, B3
5 | B2, B3, B4
- -

There are only E distinct write quorums in any ensemble. If Qw = E, then there is only one, as every entry is written to every bookie in the ensemble and no striping occurs.
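
The write quorum selection described above can be sketched in a few lines. This is an illustration of the (entryid % E) rule only; the class and method names are hypothetical and are not taken from the BookKeeper client.

```java
import java.util.ArrayList;
import java.util.List;

final class WriteQuorums {
    /**
     * Returns the write quorum for entryId: writeQuorumSize consecutive bookies
     * from the ensemble, starting at index (entryId % E) and wrapping around.
     */
    static List<String> writeQuorum(List<String> ensemble, int writeQuorumSize, long entryId) {
        int ensembleSize = ensemble.size();
        int start = (int) (entryId % ensembleSize);
        List<String> quorum = new ArrayList<>(writeQuorumSize);
        for (int i = 0; i < writeQuorumSize; i++) {
            quorum.add(ensemble.get((start + i) % ensembleSize));
        }
        return quorum;
    }
}
```

Calling writeQuorum with the ensemble B1, B2, B3, B4, a write quorum size of 3, and entry ids 0 through 5 reproduces the rows of the table above.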

- -

Ack quorums

- -

The ack quorum for an entry is any subset of the write quorum of size Qa. If Qa bookies acknowledge an entry, it means it has been fully replicated.

- -

Guarantees

- -

The system can tolerate Qa – 1 failures without data loss.

- -

BookKeeper guarantees that:

- -
    -
  1. All updates to a ledger will be read in the same order as they were written.
  2. All clients will read the same sequence of updates from the ledger.
- -

Writing to ledgers

- -

Because a ledger has a single writer, ensuring that entry ids are sequential is trivial. A bookie acknowledges a write once it has been persisted to disk and is therefore durable. Once Qa bookies from the write quorum acknowledge the write, the write is acknowledged to the client, but only if all entries with lower entry ids in the ledger have already been acknowledged to the client.

- -

The entry written contains the ledger id, the entry id, the last add confirmed and the payload. The last add confirmed is the last entry which had been acknowledged to the client when this entry was written. Sending this with the entry speeds up recovery of the ledger in the case that the writer crashes.
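
The acknowledgement rule can be illustrated with a small tracker that advances the last add confirmed only when every lower entry id has reached the ack quorum. This is a sketch under simplifying assumptions: the class AckTracker is hypothetical, entry ids start at 0, and each bookie is assumed to acknowledge a given entry at most once.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical writer-side bookkeeping of bookie acknowledgements. */
final class AckTracker {
    private final int ackQuorumSize;                           // Qa
    private final Map<Long, Integer> bookieAcks = new HashMap<>();
    private long lastAddConfirmed = -1;                        // nothing confirmed yet

    AckTracker(int ackQuorumSize) {
        this.ackQuorumSize = ackQuorumSize;
    }

    /** Records one bookie acknowledgement and returns the current last add confirmed. */
    long onBookieAck(long entryId) {
        if (entryId <= lastAddConfirmed) {
            return lastAddConfirmed; // late ack for an entry already confirmed to the client
        }
        bookieAcks.merge(entryId, 1, Integer::sum);
        // Confirm entries strictly in order: stop at the first entry still below Qa acks.
        while (bookieAcks.getOrDefault(lastAddConfirmed + 1, 0) >= ackQuorumSize) {
            lastAddConfirmed++;
            bookieAcks.remove(lastAddConfirmed);
        }
        return lastAddConfirmed;
    }
}
```

With Qa = 2, two acknowledgements for entry 1 do not confirm anything while entry 0 is still pending. Once entry 0 receives its second acknowledgement, both entries are confirmed together.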

- -

Another client can also read entries in the ledger up as far as the last add confirmed, as we guarantee that all entries thus far have been replicated on Qa nodes, and therefore all future readers will be able to also read it. However, to read like this, the ledger should be opened with a non-fencing open. Otherwise, it would kill the writer.

- -

If a node fails to acknowledge a write, the writer will create a new ensemble by replacing the failed node in the current ensemble. It creates a new fragment with this ensemble, starting from the first message that has not been acknowledged to the client. Creating the new fragment involves making a CAS write to the metadata. If the CAS write fails, someone else has modified something in the ledger metadata. This concurrent modification could have been caused by recovery or [...]

Closing a ledger as a writer

- -

Closing a ledger is straightforward for a writer. The writer makes a CAS write to the metadata, changing the state to CLOSED and setting the last entry of the ledger to the last entry which we have acknowledged to the client.

- -

If the CAS write fails, it means someone else has modified the metadata. We reread the metadata, and retry closing as long as the state of the ledger is still OPEN. If the state is IN_RECOVERY we send an error to the client. If the state is CLOSED and the last entry is the same as the last entry we have acknowledged to the client, we complete the close operation success [...] - -

Closing a ledger as a reader

- -

A reader can also force a ledger to close. Forcing the ledger to close will prevent any writer from adding new entries to the ledger. This is called fencing. This can occur when a writer has crashed or become unavailable, and a new writer wants to take over writing to the log. The new writer must ensure that it has seen all updates from the previous writer, and prevent the previous writer from making any new updates before making any updat [...] - -

To recover a ledger, we first update the state in the metadata to IN_RECOVERY. We then send a fence message to all the bookies in the last fragment of the ledger. When a bookie receives a fence message for a ledger, the fenced state of the ledger is persisted to disk. Once we receive a response from at least (Qw - Qa)+1 bookies from each write quorum in the ensemble, the ledger is fenced.

- -

By ensuring we have received a response from at least (Qw - Qa) + 1 bookies in each write quorum, we ensure that, if the old writer is alive and tries to add a new entry, there will be no write quorum in which Qa bookies will accept the write. If the old writer tries to update the ensemble, it will fail on the CAS metadata write, and then see that the ledger is in the IN_RECOVERY state, and that it therefore shouldn’t try to write to it.
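
The threshold follows from simple counting: if (Qw - Qa) + 1 bookies of a write quorum are fenced, at most Qw - ((Qw - Qa) + 1) = Qa - 1 bookies of that quorum can still accept a write, which is one short of the ack quorum. The check below is a sketch of that condition over all E write quorums of an ensemble; the class FenceCheck and its method are hypothetical names, not BookKeeper APIs.

```java
import java.util.List;
import java.util.Set;

final class FenceCheck {
    /**
     * Returns true if, in every write quorum of the ensemble, at least
     * (Qw - Qa) + 1 bookies have responded to the fence message.
     */
    static boolean isFenced(List<String> ensemble, int writeQuorumSize, int ackQuorumSize,
                            Set<String> responded) {
        int needed = (writeQuorumSize - ackQuorumSize) + 1;
        int ensembleSize = ensemble.size();
        // Each start index yields one of the E distinct write quorums.
        for (int start = 0; start < ensembleSize; start++) {
            int count = 0;
            for (int i = 0; i < writeQuorumSize; i++) {
                if (responded.contains(ensemble.get((start + i) % ensembleSize))) {
                    count++;
                }
            }
            if (count < needed) {
                return false; // this write quorum could still gather Qa acks
            }
        }
        return true;
    }
}
```

For E = 4, Qw = 3, and Qa = 2, responses from B1 and B3 alone are not enough, because the write quorum B2, B3, B4 has only one fenced bookie. Responses from B1, B2, and B3 cover every write quorum.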

- -

The old writer will be able to write entries to individual bookies (we can’t guarantee that the fence message reaches all bookies), but as it will not be able to reach ack quorum, it will not be able to send a success response to its client. The client will get a LedgerFenced error instead.

- -

It is important to note that when you get a ledger fenced message for an entry, it doesn’t mean that the entry has not been written. It means that the entry may or may not have been written, and this can only be determined after the ledger is recovered. In effect, LedgerFenced should be treated like a timeout.

- -

Once the ledger is fenced, recovery can begin. Recovery means finding the last entry of the ledger and closing the ledger. To find the last entry of the ledger, the client asks all bookies for the highest last add confirmed value they have seen. It waits until it has received a response from at least (Qw - Qa) + 1 bookies from each write quorum, and takes the highest response as the entry id to start reading forward from. It then star [...] - -

Ledgers to logs

- -

In BookKeeper, ledgers can be used to build a replicated log for your system. All guarantees provided by BookKeeper are at the ledger level. Guarantees on the whole log can be built using the ledger guarantees and any consistent datastore with a compare-and-swap (CAS) primitive. BookKeeper uses ZooKeeper as the datastore but others could theoretically be used.

- -

A log in BookKeeper is built from some number of ledgers, with a fixed order. A ledger represents a single segment of the log. A ledger could be the whole period that one node was the leader, or there could be multiple ledgers for a single period of leadership. However, there can only ever be one leader that adds entries to a single ledger. Ledgers cannot be reopened for writing once they have been closed/recovered.

- -
-

BookKeeper does not provide leader election. You must use a system like ZooKeeper for this.

-
- -

In many cases, leader election is really leader suggestion. Multiple nodes could think that they are leader at any one time. It is the job of the log to guarantee that only one can write changes to the system.

- -

Opening a log

- -

Once a node thinks it is leader for a particular log, it must take the following steps:

- -
    -
  1. Read the list of ledgers for the log
  2. Fence the last two ledgers in the list. Two ledgers are fenced because the writer may be writing to the second-to-last ledger while adding the last ledger to the list.
  3. Create a new ledger
  4. Add the new ledger to the ledger list
  5. Write the new ledger list back to the datastore using a CAS operation
- -

The fencing in step 2 and the CAS operation in step 5 prevent two nodes from thinking that they have leadership at any one time.

- -

The CAS operation will fail if the list of ledgers has changed between reading it and writing back the new list. When the CAS operation fails, the leader must start at step 1 again. Even better, they should check that they are in fact still the leader with the system that is providing leader election. The protocol will work correctly without this step, though it will be able to make very little progress if two nodes think they are leader and are duelling for the log.

- -

The node must not serve any writes until step 5 completes successfully.
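
The five steps for opening a log can be sketched as a retry loop around a compare-and-swap. In this illustration an AtomicReference stands in for the CAS-capable datastore, and the fence and createLedger callbacks are hypothetical placeholders for the real ledger operations; none of these names come from the BookKeeper API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.LongConsumer;
import java.util.function.LongSupplier;

final class LogOpener {
    /**
     * Runs steps 1 to 5 and returns the id of the ledger the new leader may write to.
     * The ledger list is replaced wholesale by an immutable list on every update, so
     * the reference comparison in compareAndSet acts as the version check.
     */
    static long openLog(AtomicReference<List<Long>> ledgerList,
                        LongConsumer fence, LongSupplier createLedger) {
        while (true) {
            List<Long> current = ledgerList.get();                      // step 1
            int size = current.size();
            for (int i = Math.max(0, size - 2); i < size; i++) {
                fence.accept(current.get(i));                           // step 2
            }
            long newLedger = createLedger.getAsLong();                  // step 3
            List<Long> updated = new ArrayList<>(current);              // step 4
            updated.add(newLedger);
            if (ledgerList.compareAndSet(current,
                    Collections.unmodifiableList(updated))) {           // step 5
                return newLedger; // only now may the node serve writes
            }
            // CAS failed: another node changed the list, so start again at step 1.
        }
    }
}
```

A production version would also re-check leadership with the election service before retrying, as the text recommends, and would clean up the ledger created in a failed attempt. Both are omitted here for brevity.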

- -

Rolling ledgers

- -

The leader may wish to close the current ledger and open a new one every so often. Ledgers can only be deleted as a whole. If you don’t roll the log, you won’t be able to clean up old entries in the log without a leader change. By closing the current ledger and adding a new one, the leader allows the log to be truncated whenever that data is no longer needed. The steps for rolling the log are similar to those for creating a new ledger.

- -
    -
  1. Create a new ledger
  2. Add the new ledger to the ledger list
  3. Write the new ledger list to the datastore using CAS
  4. Close the previous ledger
- -

By deferring the closing of the previous ledger until step 4, we can continue writing to the log while we perform metadata update operations to add the new ledger. This is safe as long as you fence the last 2 ledgers when acquiring leadership.

- - -
- - -
-
- - -
-
- - - -
-


- -
- - - - -
diff --git a/content/docs/example/index.html b/content/docs/example/index.html
deleted file mode 100644
index da3d6cc..0000000
--- a/content/docs/example/index.html
+++ /dev/null
@@ -1,511 +0,0 @@
Apache BookKeeper - Example doc
- - - - -
-
-
-
- - - -
-
- -
-
- - -

Just for experimentation purposes.

-
- -
- -
-
-

ledger

- -
- - -
-
- -
- - -
-

Example doc

-
    -
-
- - - -
-
-
- - - -
-


- -
- - - - -
diff --git a/content/docs/getting-started/concepts/index.html b/content/docs/getting-started/concepts/index.html
deleted file mode 100644
index e38a535..0000000
--- a/content/docs/getting-started/concepts/index.html
+++ /dev/null
@@ -1,810 +0,0 @@
Apache BookKeeper - BookKeeper concepts and architecture
- - - - -
-
-
-
- - - -
-
- -
-
- - -

The core components and how they work

-
- -
- -
-
-

BookKeeper is a service that provides persistent storage of streams of log entries—aka records—in sequences called ledgers. BookKeeper replicates stored entries across multiple servers.

- -

Basic terms

- -

In BookKeeper:

- -
    -
  • each unit of a log is an entry (aka record)
  • -
  • streams of log entries are called ledgers
  • -
  • individual servers storing ledgers of entries are called bookies
  • -
- -

BookKeeper is designed to be reliable and resilient to a wide variety of failures. Bookies can crash, corrupt data, or discard data, but as long as there are enough bookies behaving correctly in the ensemble the service as a whole will behave correctly.

- -

Entries

- -
-

Entries contain the actual data written to ledgers, along with some important metadata.

-
- -

BookKeeper entries are sequences of bytes that are written to ledgers. Each entry has the following fields:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Field | Java type | Description
Ledger number | long | The ID of the ledger to which the entry has been written
Entry number | long | The unique ID of the entry
Last confirmed (LC) | long | The ID of the last recorded entry
Data | byte[] | The entry’s data (written by the client application)
Authentication code | byte[] | The message auth code, which includes all other fields in the entry
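
To make the field layout concrete, here is a sketch of an entry whose authentication code covers all other fields. The class EntrySketch is hypothetical, and the choice of HMAC-SHA256 and of the byte layout are assumptions made for illustration only; they are not BookKeeper's actual digest or wire format.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical in-memory representation of the five entry fields. */
final class EntrySketch {
    final long ledgerId;
    final long entryId;
    final long lastConfirmed;
    final byte[] data;
    final byte[] authCode;

    EntrySketch(long ledgerId, long entryId, long lastConfirmed, byte[] data, byte[] key) {
        this.ledgerId = ledgerId;
        this.entryId = entryId;
        this.lastConfirmed = lastConfirmed;
        this.data = data.clone();
        this.authCode = computeAuthCode(key);
    }

    /** Computes a MAC over the three long fields followed by the payload. */
    private byte[] computeAuthCode(byte[] key) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256"); // assumed algorithm, see lead-in
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            ByteBuffer header = ByteBuffer.allocate(3 * Long.BYTES);
            header.putLong(ledgerId).putLong(entryId).putLong(lastConfirmed);
            mac.update(header.array());
            mac.update(data);
            return mac.doFinal();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC computation failed", e);
        }
    }

    /** Recomputes the MAC with the given key and compares it with the stored code. */
    boolean verify(byte[] key) {
        return MessageDigest.isEqual(authCode, computeAuthCode(key));
    }
}
```

Because the code covers every other field, changing the payload, the ids, or the key yields a different authentication code, which is what allows a reader to detect a corrupted or forged entry.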
- -

Ledgers

- -
-

Ledgers are the basic unit of storage in BookKeeper.

-
- -

Ledgers are sequences of entries, while each entry is a sequence of bytes. Entries are written to a ledger:

- -
    -
  • sequentially, and
  • -
  • at most once.
  • -
- -

This means that ledgers have append-only semantics. Entries cannot be modified once they’ve been written to a ledger. Determining the proper write order is the responsibility of client applications.

- -

Clients and APIs

- -
-

BookKeeper clients have two main roles: they create and delete ledgers, and they read entries from and write entries to ledgers.

- -

BookKeeper provides both a lower-level and a higher-level API for ledger interaction.

-
- -

There are currently two APIs that can be used for interacting with BookKeeper:

- -
    -
  • The ledger API is a lower-level API that enables you to interact with ledgers directly.
  • -
  • The DistributedLog API is a higher-level API that enables you to use BookKeeper without directly interacting with ledgers.
  • -
- -

In general, you should choose the API based on how much granular control you need over ledger semantics. The two APIs can also both be used within a single application.

- -

Bookies

- -
-

Bookies are individual BookKeeper servers that handle ledgers (more specifically, fragments of ledgers). Bookies function as part of an ensemble.

-
- -

A bookie is an individual BookKeeper storage server. Individual bookies store fragments of ledgers, not entire ledgers (for the sake of performance). For any given ledger L, an ensemble is the group of bookies storing the entries in L.

- -

Whenever entries are written to a ledger, those entries are striped across the ensemble (written to a sub-group of bookies rather than to all bookies).

- -

Motivation

- -
-

BookKeeper was initially inspired by the NameNode server in HDFS but its uses now extend far beyond this.

-
- -

The initial motivation for BookKeeper comes from the Hadoop ecosystem. In the Hadoop Distributed File System (HDFS), a special node called the NameNode logs all operations in a reliable fashion, which ensures that recovery is possible in case of crashes.

- -

The NameNode, however, served only as initial inspiration for BookKeeper. The applications for BookKeeper extend far beyond this and include essentially any application that requires an append-based storage system. BookKeeper provides a number of advantages for such applications:

- -
    -
  • Highly efficient writes
  • -
  • High fault tolerance via replication of messages within ensembles of bookies
  • -
  • High throughput for write operations via striping (across as many bookies as you wish)
  • -
- -

Metadata storage

- -

BookKeeper requires a metadata storage service to store information related to ledgers and available bookies. BookKeeper currently uses ZooKeeper for this and other tasks.

- -

Data management in bookies

- -

Bookies manage data in a log-structured way, which is implemented using three types of files: journals, entry logs, and index files.

- - - -

Journals

- -

A journal file contains BookKeeper transaction logs. Before any update to a ledger takes place, the bookie ensures that a transaction describing the update is written to non-volatile storage. A new journal file is created once the bookie starts or the older journal file reaches the journal file size threshold.

- -

Entry logs

- -

An entry log file manages the written entries received from BookKeeper clients. Entries from different ledgers are aggregated and written sequentially, while their offsets are kept as pointers in a ledger cache for fast lookup.

- -

A new entry log file is created once the bookie starts or the older entry log file reaches the entry log size threshold. Old entry log files are removed by the Garbage Collector Thread once they are not associated with any active ledger.

- -

Index files

- -

An index file is created for each ledger, which comprises a header and several fixed-length index pages that record the offsets of data stored in entry log files.

- -

Since updating index files would introduce random disk I/O, index files are updated lazily by a sync thread running in the background. This ensures speedy performance for updates. Before index pages are persisted to disk, they are gathered in a ledger cache for lookup.

- -

Ledger cache

- -

Ledger index pages are cached in a memory pool, which allows for more efficient management of disk head scheduling.

- -

Adding entries

- -

When a client instructs a bookie to write an entry to a ledger, the entry will go through the following steps to be persisted on disk:

- -
    -
  1. The entry is appended to an entry log
  2. The index of the entry is updated in the ledger cache
  3. A transaction corresponding to this entry update is appended to the journal
  4. A response is sent to the BookKeeper client
- -
-

For performance reasons, the entry log buffers entries in memory and commits them in batches, while the ledger cache holds index pages in memory and flushes them lazily. This process is described in more detail in the Data flush section below.

-
- -

Data flush

- -

Ledger index pages are flushed to index files in the following two cases:

- -
  • The ledger cache memory limit is reached. There is no more space available to hold newer index pages. Dirty index pages will be evicted from the ledger cache and persisted to index files.
  • A background sync thread is responsible for flushing index pages from the ledger cache to index files periodically.
- -

Besides flushing index pages, the sync thread is also responsible for rolling journal files so that they do not use too much disk space. The data flush flow in the sync thread is as follows:

- -
  • A LastLogMark is recorded in memory. The LastLogMark indicates that those entries before it have been persisted (to both index and entry log files) and contains two parts:
    1. A txnLogId (the file ID of a journal)
    2. A txnLogPos (offset in a journal)
  • Dirty index pages are flushed from the ledger cache to the index file, and entry log files are flushed to ensure that all buffered entries in entry log files are persisted to disk.

    Ideally, a bookie only needs to flush index pages and entry log files that contain entries before LastLogMark. There is, however, no such information in the ledger and entry log mapping to journal files. Consequently, the thread flushes the ledger cache and entry log entirely here, and may flush entries after the LastLogMark. Flushing more is not a problem, though, just redundant.
  • The LastLogMark is persisted to disk, which means that the entry data and index pages of all entries added before LastLogMark have also been persisted to disk. It is now safe to remove journal files created earlier than txnLogId.
- -
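The role of the LastLogMark in the flow above can be illustrated with a small sketch. The class and method names below are hypothetical, and the replay rule (records at or after the mark are replayed) is an assumption drawn from the description that entries before the mark have been persisted.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a log mark and the journal cleanup it permits; names are hypothetical. */
final class LastLogMarkSketch {
    final long txnLogId;   // file ID of a journal
    final long txnLogPos;  // offset within that journal

    LastLogMarkSketch(long txnLogId, long txnLogPos) {
        this.txnLogId = txnLogId;
        this.txnLogPos = txnLogPos;
    }

    /** Journal files created earlier than txnLogId can be removed once the mark is on disk. */
    List<Long> removableJournals(List<Long> journalIds) {
        List<Long> removable = new ArrayList<>();
        for (long id : journalIds) {
            if (id < txnLogId) {
                removable.add(id);
            }
        }
        return removable;
    }

    /** Assumption: on restart, a journal record is replayed if it sits at or after the mark. */
    boolean needsReplay(long journalId, long position) {
        return journalId > txnLogId || (journalId == txnLogId && position >= txnLogPos);
    }
}
```

With a mark of (3, 128), journals 1 and 2 are removable, while a record in journal 3 at offset 128 or later, or any record in journal 4, would be replayed after a crash.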

If the bookie crashes before persisting LastLogMark to disk, it still has journal files containing entries for which index pages may not have been persisted. Consequently, when the bookie restarts, it inspects the journal files and restores those entries, so no data is lost.

- -

Using the above data flush mechanism, it is safe for the sync thread to skip data flushing when the bookie shuts down. However, the entry logger uses a buffered channel to write entries in batches, so data may still be sitting in the buffered channel at shutdown. The bookie needs to ensure that the entry log flushes its buffered data during shutdown. Otherwise, entry log files become corrupted with partial entries.

- -

Data compaction

- -

On bookies, entries of different ledgers are interleaved in entry log files. A bookie runs a garbage collector thread to delete un-associated entry log files to reclaim disk space. If a given entry log file contains entries from a ledger that has not been deleted, then the entry log file would never be removed and the occupied disk space never reclaimed. In order to avoid such a case, a bookie server compacts entry log files in a garbage collector thread to reclaim disk space.

- -

There are two kinds of compaction, which run at different frequencies: minor compaction and major compaction. The differences between minor compaction and major compaction lie in their threshold value and compaction interval.

- -
  • The garbage collection threshold is the size percentage of an entry log file occupied by those undeleted ledgers. The default minor compaction threshold is 0.2, while the major compaction threshold is 0.8.
  • The garbage collection interval is how frequently to run the compaction. The default minor compaction interval is 1 hour, while the major compaction interval is 1 day.
- -
-

If either the threshold or interval is set to less than or equal to zero, compaction is disabled.

-
- -

The data compaction flow in the garbage collector thread is as follows:

- -
  • The thread scans entry log files to get their entry log metadata, which records a list of ledgers comprising an entry log and their corresponding percentages.
  • With the normal garbage collection flow, once the bookie determines that a ledger has been deleted, the ledger will be removed from the entry log metadata and the size of the entry log reduced.
  • If the remaining size of an entry log file reaches a specified threshold, the entries of active ledgers in the entry log will be copied to a new entry log file.
  • Once all valid entries have been copied, the old entry log file is deleted.
- -
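The threshold checks described above might look like the following sketch. All names are hypothetical, and the comparison direction is an assumption: the sketch treats "reaches a specified threshold" as the fraction of the entry log still occupied by undeleted ledgers dropping below the threshold.

```java
/** Sketch of the compaction checks; names and the exact comparison are assumptions. */
final class CompactionSketch {
    /** Compaction is disabled when the threshold or the interval is less than or equal to zero. */
    static boolean isEnabled(double threshold, long intervalSeconds) {
        return threshold > 0 && intervalSeconds > 0;
    }

    /** Fraction of the entry log still occupied by undeleted ledgers. */
    static double usage(long remainingBytes, long totalBytes) {
        return totalBytes == 0 ? 0.0 : (double) remainingBytes / totalBytes;
    }

    /** Assumes an entry log is compacted once its usage drops below the threshold. */
    static boolean shouldCompact(long remainingBytes, long totalBytes, double threshold) {
        return threshold > 0 && usage(remainingBytes, totalBytes) < threshold;
    }
}
```

Under this reading, an entry log that is 50% live data is left alone by minor compaction (threshold 0.2) but picked up by major compaction (threshold 0.8).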

ZooKeeper metadata

- -

BookKeeper requires a ZooKeeper installation for storing ledger metadata. Whenever you construct a BookKeeper client object, you need to pass a list of ZooKeeper servers as a parameter to the constructor, like this:

- -
String zkConnectionString = "127.0.0.1:2181";
-BookKeeper bkClient = new BookKeeper(zkConnectionString);
-
-
- -
-

For more info on using the BookKeeper Java client, see this guide.

-
- -

Ledger manager

- -

A ledger manager handles ledgers’ metadata (which is stored in ZooKeeper). BookKeeper offers two types of ledger managers: the flat ledger manager and the hierarchical ledger manager. Both ledger managers extend the AbstractZkLedgerManager abstract class.

- -
-

Use the flat ledger manager in most cases

-

The flat ledger manager is the default and is recommended for nearly all use cases. The hierarchical ledger manager is better suited only for managing very large numbers of BookKeeper ledgers (> 50,000).

-
- -

Flat ledger manager

- -

The flat ledger manager, implemented in the FlatLedgerManager class, stores all ledgers’ metadata in child nodes of a single ZooKeeper path. The flat ledger manager creates sequential nodes to ensure the uniqueness of the ledger ID and prefixes all nodes wi [...] - -

The flat ledger manager’s garbage collection flow proceeds as follows:

- -
  • All existing ledgers are fetched from ZooKeeper (zkActiveLedgers)
  • All ledgers currently active within the bookie are fetched (bkActiveLedgers)
  • The currently active ledgers are looped through to determine which ledgers don’t currently exist in ZooKeeper. Those are then garbage collected.
  • The hierarchical ledger manager stores ledgers’ metadata in two-level znodes.
- -
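The garbage collection flow above amounts to a set difference between bkActiveLedgers and zkActiveLedgers. The following is a minimal sketch of that comparison only; LedgerGcSketch and ledgersToCollect are hypothetical names, and fetching the two sets is out of scope.

```java
import java.util.Set;
import java.util.TreeSet;

/** Sketch of the ledger comparison in the flat ledger manager GC flow; names are hypothetical. */
final class LedgerGcSketch {
    /** Returns the ledgers active on the bookie that no longer exist in ZooKeeper. */
    static Set<Long> ledgersToCollect(Set<Long> zkActiveLedgers, Set<Long> bkActiveLedgers) {
        Set<Long> garbage = new TreeSet<>();
        for (Long ledgerId : bkActiveLedgers) {
            if (!zkActiveLedgers.contains(ledgerId)) {
                garbage.add(ledgerId);
            }
        }
        return garbage;
    }
}
```

Ledgers that exist only in ZooKeeper are ignored here, since the bookie holds no data for them.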

Hierarchical ledger manager

- -

The hierarchical ledger manager, implemented in the HierarchicalLedgerManager class, first obtains a global unique ID from ZooKeeper using an EPHEMERAL_SEQUENTIAL znode. Since ZooKeeper’s sequence counter has [...] - -

{level1 (2 digits)}{level2 (4 digits)}{level3 (4 digits)}
-
-
- -

These three parts are used to form the actual ledger node path to store ledger metadata:

- -
{ledgers_root_path}/{level1}/{level2}/L{level3}
-
-
- -

For example, ledger 0000000001 is split into three parts, 00, 0000, and 0001, and stored in znode /{ledgers_root_path}/00/0000/L0001. Each znode can hold as many as 10,000 ledgers, which avoids the problem of the child list being larger than the maximum ZooKeeper packet size (the limitation that initially prompted the creation of the hierarchical ledger manager).
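The path scheme can be sketched as follows, assuming a ledger ID that fits in 10 decimal digits. LedgerPathSketch and pathFor are hypothetical names for illustration, not part of HierarchicalLedgerManager.

```java
import java.util.Locale;

/** Sketch of the hierarchical ledger path layout; names are hypothetical. */
final class LedgerPathSketch {
    /** Builds {root}/{level1}/{level2}/L{level3} from a ledger ID padded to 10 digits. */
    static String pathFor(String ledgersRootPath, long ledgerId) {
        if (ledgerId < 0 || ledgerId > 9_999_999_999L) {
            throw new IllegalArgumentException("ledger ID must fit in 10 digits: " + ledgerId);
        }
        String padded = String.format(Locale.ROOT, "%010d", ledgerId);
        String level1 = padded.substring(0, 2);   // 2 digits
        String level2 = padded.substring(2, 6);   // 4 digits
        String level3 = padded.substring(6, 10);  // 4 digits
        return ledgersRootPath + "/" + level1 + "/" + level2 + "/L" + level3;
    }
}
```

For a root path of /ledgers, ledger 1 maps to /ledgers/00/0000/L0001, matching the example in the text.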

- -
- - - - -
-
- - -
-
- - - -
-

An entry is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Entries are also known as records.

- -
- -
-

A ledger is a sequence of entries written to BookKeeper. Entries are written sequentially to ledgers and at most once, giving ledgers append-only semantics.

- -
- -
-

A bookie is an individual BookKeeper storage server.

- -

Bookies store the content of ledgers and act as a distributed ensemble.

- -
- -
-

A subsystem that runs in the background on bookies to ensure that ledgers are fully replicated even if one bookie from the ensemble is down.

- -
- -
-

Striping is the process of distributing BookKeeper ledgers to sub-groups of bookies rather than to all bookies in a BookKeeper ensemble.

- -

Striping is essential to ensuring fast performance.

- -
- -
-

A journal file stores BookKeeper transaction logs.

- -
- -
-

When a reader forces a ledger to close, preventing any further entries from being written to the ledger.

- -
- -
-

A record is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Records are also known as entries.

- -
- - - - -
diff --git a/content/docs/getting-started/installation/index.html b/content/docs/getting-started/installation/index.html
deleted file mode 100644
index a04127e..0000000
--- a/content/docs/getting-started/installation/index.html
+++ /dev/null
@@ -1,650 +0,0 @@
- Apache BookKeeper - BookKeeper installation
- - - - -
-
-
-
- - - -
-
- -
-
- - -

Download or clone BookKeeper and build it locally

-
- -
- -
-
- -

You can install BookKeeper either by downloading a GZipped tarball package or cloning the BookKeeper repository.

- -

Requirements

- - - -

Download

- -

You can download Apache BookKeeper releases from one of many Apache mirrors. Here’s an example for the apache.claz.org mirror:

- -
$ curl -O http://apache.claz.org/bookkeeper/bookkeeper-4.5.0/bookkeeper-4.5.0-src.tar.gz
-$ tar xvf bookkeeper-4.5.0-src.tar.gz
-$ cd bookkeeper-4.5.0
-
-
- -

Clone

- -

To build BookKeeper from source, clone the repository, either from the GitHub mirror or from the Apache repository:

- -
# From the GitHub mirror
-$ git clone https://github.com/apache/bookkeeper
-
-# From Apache directly
-$ git clone git://git.apache.org/bookkeeper.git/
-
-
- -

Build using Maven

- -

Once you have BookKeeper on your local machine, either by downloading or cloning it, you can build it from source using Maven:

- -
$ mvn package
-
-
- -
-

You can skip tests by adding the -DskipTests flag when running mvn package.

-
- -

Useful Maven commands

- -

Some other useful Maven commands beyond mvn package:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Command                          Action
mvn clean                        Removes build artifacts
mvn compile                      Compiles JAR files from Java sources
mvn compile findbugs:findbugs    Compile using the Maven FindBugs plugin
mvn install                      Install the BookKeeper JAR locally in your local Maven cache (usually in the ~/.m2 directory)
mvn deploy                       Deploy the BookKeeper JAR to the Maven repo (if you have the proper credentials)
mvn verify                       Performs a wide variety of verification and validation tasks
mvn apache-rat:check             Run Maven using the Apache Rat plugin
mvn compile javadoc:aggregate    Build Javadocs locally
mvn package assembly:single      Build a complete distribution using the Maven Assembly plugin
- -

Package directory

- -

The BookKeeper project contains several subfolders that you should be aware of:

- - - - - - - - - - - - - - - - - - - - - - - - - - -
Subfolder                     Contains
bookkeeper-server             The BookKeeper server and client
bookkeeper-benchmark          A benchmarking suite for measuring BookKeeper performance
bookkeeper-stats              A BookKeeper stats library
bookkeeper-stats-providers    BookKeeper stats providers
- -
- - - - -
-
- -
- - - - - - -
-
-
- - - -
-

An entry is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Entries are also known as records.

- -
- -
-

A ledger is a sequence of entries written to BookKeeper. Entries are written sequentially to ledgers and at most once, giving ledgers append-only semantics.

- -
- -
-

A bookie is an individual BookKeeper storage server.

- -

Bookies store the content of ledgers and act as a distributed ensemble.

- -
- -
-

A subsystem that runs in the background on bookies to ensure that ledgers are fully replicated even if one bookie from the ensemble is down.

- -
- -
-

Striping is the process of distributing BookKeeper ledgers to sub-groups of bookies rather than to all bookies in a BookKeeper ensemble.

- -

Striping is essential to ensuring fast performance.

- -
- -
-

A journal file stores BookKeeper transaction logs.

- -
- -
-

When a reader forces a ledger to close, preventing any further entries from being written to the ledger.

- -
- -
-

A record is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Records are also known as entries.

- -
- - - - -
diff --git a/content/docs/getting-started/run-locally/index.html b/content/docs/getting-started/run-locally/index.html
deleted file mode 100644
index 0f564fb..0000000
--- a/content/docs/getting-started/run-locally/index.html
+++ /dev/null
@@ -1,522 +0,0 @@
- Apache BookKeeper - Run bookies locally
- - - - -
-
-
-
- - - -
-
- -
-
- - - -
- -
- -
-
-

Bookies are individual BookKeeper servers. You can run an ensemble of bookies locally on a single machine using the localbookie command of the bookkeeper CLI tool and specifying the number of bookies you’d like to include in the ensemble.

- -

This would start up an ensemble with 10 bookies:

- -
$ bookkeeper-server/bin/bookkeeper localbookie 10
-
-
- -
-

When you start up an ensemble using localbookie, all bookies run in a single JVM process.

-
- -
- - - - -
-
- -
- -
-
-
- - - -
-

An entry is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Entries are also known as records.

- -
- -
-

A ledger is a sequence of entries written to BookKeeper. Entries are written sequentially to ledgers and at most once, giving ledgers append-only semantics.

- -
- -
-

A bookie is an individual BookKeeper storage server.

- -

Bookies store the content of ledgers and act as a distributed ensemble.

- -
- -
-

A subsystem that runs in the background on bookies to ensure that ledgers are fully replicated even if one bookie from the ensemble is down.

- -
- -
-

Striping is the process of distributing BookKeeper ledgers to sub-groups of bookies rather than to all bookies in a BookKeeper ensemble.

- -

Striping is essential to ensuring fast performance.

- -
- -
-

A journal file stores BookKeeper transaction logs.

- -
- -
-

When a reader forces a ledger to close, preventing any further entries from being written to the ledger.

- -
- -
-

A record is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Records are also known as entries.

- -
- - - - -
diff --git a/content/docs/reference/cli/index.html b/content/docs/reference/cli/index.html
deleted file mode 100644
index c6afbb0..0000000
--- a/content/docs/reference/cli/index.html
+++ /dev/null
@@ -1,1516 +0,0 @@
- Apache BookKeeper - BookKeeper CLI tool reference
- - - - -
-
-
-
- - - -
-
- -
-
- - -

A reference guide to the command-line tools that you can use to administer BookKeeper

-
- -
- -
-
- -

bookkeeper

- -

Manages bookies.

- -

Environment variables

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Environment variable  Description  Default
BOOKIE_LOG_CONF

The Log4j configuration file.

-
${bookkeeperHome}/bookkeeper-server/conf/log4j.properties
BOOKIE_CONF

The configuration file for the bookie.

-
${bookkeeperHome}/bookkeeper-server/conf/bk_server.conf
BOOKIE_EXTRA_CLASSPATH

Extra paths to add to BookKeeper’s classpath.

-
ENTRY_FORMATTER_CLASS

The entry formatter class used to format entries.

-
BOOKIE_PID_DIR

The directory where the bookie server PID file is stored.

-
BOOKIE_STOP_TIMEOUT

The wait time before forcefully killing the bookie server instance if stopping it is not successful.

-
- -

Commands

- -

bookie

- -

Starts up a bookie.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper bookie 
-
-
-
- -

localbookie

- -

Starts up an ensemble of N bookies in a single JVM process. Typically used for local experimentation and development.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper localbookie N
-
-
-
- -

autorecovery

- -

Runs the autorecovery service daemon.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper autorecovery 
-
-
-
- -

upgrade

- -

Upgrades the bookie’s filesystem.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper upgrade 
-
-
-
- -

shell

- -

Runs the bookie’s shell for admin commands.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell 
-
-
-
- -

help

- -

Displays the help message for the bookkeeper tool.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper help 
-
-
- -

The BookKeeper shell

- -

autorecovery

- -

Enable or disable autorecovery in the cluster.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell autorecovery \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - - - - - - -
Flag        Description
-enable     Enable autorecovery of underreplicated ledgers
-disable    Disable autorecovery of underreplicated ledgers
-

-
- -

bookieFormat

- -

Format the current server contents.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell bookieFormat \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - - - - - - - - - - - -
Flag               Description
-nonInteractive    Whether to confirm if old data exists
-force             If -nonInteractive is specified, whether to force delete the old data without prompting
-deleteCookie      Delete the bookie's cookie in ZooKeeper
-

-
- -

bookieinfo

- -

Retrieve bookie info such as free and total disk space.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell bookieinfo
-
-
- -

-
- -

bookiesanity

- -

Sanity test for local bookie. Create ledger and write/read entries on the local bookie.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell bookiesanity \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - - - - - - -
Flag          Description
-entries N    Total entries to be added for the test (default 10)
-timeout N    Timeout for write/read operations in seconds (default 1)
-

-
- -

decommissionbookie

- -

Force-triggers the audit task and makes sure that all ledgers stored on the decommissioning bookie are replicated.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell decommissionbookie
-
-
- -

-
- -

deleteledger

- -

Delete a ledger

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell deleteledger \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - - - - - - -
Flag           Description
-ledgerid N    Ledger ID
-force         Whether to force delete the ledger without prompting
-

-
- -

expandstorage

- -

Add new empty ledger/index directories. Update the directories info in the conf file before running the command.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell expandstorage
-
-
- -

-
- -

help

- -

Displays the help message.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell help
-
-
- -

-
- -

lastmark

- -

Print last log marker.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell lastmark
-
-
- -

-
- -

ledger

- -

Dump ledger index entries into readable format.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell ledger \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - -
Flag            Description
-m LEDGER_ID    Print meta information
-

-
- -

ledgermetadata

- -

Print the metadata for a ledger.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell ledgermetadata \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - -
Flag           Description
-ledgerid N    Ledger ID
-

-
- -

listbookies

- -

List the bookies that are running in either readwrite or readonly mode.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell listbookies \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - - - - - - - - - - - -
Flag          Description
-readwrite    Print readwrite bookies
-readonly     Print readonly bookies
-hostnames    Also print hostname of the bookie
-

-
- -

listfilesondisc

- -

List the files in JournalDirectory/LedgerDirectories/IndexDirectories.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell listfilesondisc \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - - - - - - - - - - - -
Flag         Description
-journal     Print list of journal files
-entrylog    Print list of entryLog files
-index       Print list of index files
-

-
- -

listledgers

- -

List all ledgers in the cluster (this may take a long time).

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell listledgers \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - -
Flag     Description
-meta    Print metadata
-

-
- -

listunderreplicated

- -

List ledgers marked as underreplicated, with options to specify a missing replica (BookieId) and to exclude a missing replica.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell listunderreplicated \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - - - - - - -
Flag                         Description
-missingreplica N            Bookie Id of missing replica
-excludingmissingreplica N   Bookie Id of missing replica to ignore
-

-
- -

metaformat

- -

Format BookKeeper metadata in ZooKeeper.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell metaformat \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - - - - - - -
Flag               Description
-nonInteractive    Whether to confirm if old data exists
-force             If -nonInteractive is specified, whether to force delete the old data without prompting
-

-
- -

lostbookierecoverydelay

- -

Gets or sets the LostBookieRecoveryDelay value (in seconds) in ZooKeeper.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell lostbookierecoverydelay \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - - - - - - -
Flag      Description
-get      Get LostBookieRecoveryDelay value (in seconds)
-set N    Set LostBookieRecoveryDelay value (in seconds)
-

-
- -

readjournal

- -

Scan a journal file and format the entries into readable format.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell readjournal \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - - - - - - -
Flag                                Description
-msg JOURNAL_ID|JOURNAL_FILENAME    Print message body
-dir                                Journal directory (needed if more than one journal configured)
-

-
- -

readledger

- -

Read a range of entries from a ledger.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell readledger \
-  <ledger_id> [<start_entry_id> [<end_entry_id>]]
-
-
- -

-
- -

readlog

- -

Scan an entry file and format the entries into readable format.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell readlog \
-  <entry_log_id | entry_log_file_name> \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Flag           Description
-msg           Print message body
-ledgerid N    Ledger ID
-entryid N     Entry ID
-startpos N    Start Position
-endpos        End Position
-

-
- -

recover

- -

Recover the ledger data for a failed bookie.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell recover \
-  <bookieSrc> [<bookieDest>] \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - -
Flag             Description
-deleteCookie    Delete cookie node for the bookie.
-

-
- -

simpletest

- -

Simple test to create a ledger and write entries to it.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell simpletest \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Flag              Description
-ensemble N       Ensemble size (default 3)
-writeQuorum N    Write quorum size (default 2)
-ackQuorum N      Ack quorum size (default 2)
-numEntries N     Entries to write (default 1000)
-

-
- -

triggeraudit

- -

Force trigger the Audit by resetting the lostBookieRecoveryDelay.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell triggeraudit
-
-
- -

-
- -

updatecookie

- -

Update bookie id in cookie.

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell updatecookie \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - -
Flag                      Description
-bookieId <hostname|ip>   Bookie Id
-

-
- -

updateledgers

- -

Update bookie id in ledgers (this may take a long time).

- -
Usage
- -
$ bookkeeper-server/bin/bookkeeper shell updateledgers \
-  <options>
-
-
- -
Options
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Flag                      Description
-bookieId <hostname|ip>   Bookie Id
-updatespersec N          Number of ledgers updated per second (default 5 per sec)
-limit N                  Maximum number of ledgers to update (default no limit)
-verbose                  Print status of the ledger update (default false)
-printprogress N          Print progress messages every N seconds if verbose is turned on (default 10 secs)
-

- -

- - -
- - -
-
- - -
-
- - - -
-

An entry is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Entries are also known as records.

- -
- -
-

A ledger is a sequence of entries written to BookKeeper. Entries are written sequentially to ledgers and at most once, giving ledgers append-only semantics.

- -
- -
-

A bookie is an individual BookKeeper storage server.

- -

Bookies store the content of ledgers and act as a distributed ensemble.

- -
- -
-

A subsystem that runs in the background on bookies to ensure that ledgers are fully replicated even if one bookie from the ensemble is down.

- -
- -
-

Striping is the process of distributing BookKeeper ledgers to sub-groups of bookies rather than to all bookies in a BookKeeper ensemble.

- -

Striping is essential to ensuring fast performance.

- -
- -
-

A journal file stores BookKeeper transaction logs.

- -
- -
-

When a reader forces a ledger to close, preventing any further entries from being written to the ledger.

- -
- -
-

A record is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Records are also known as entries.

- -
- - - - -
diff --git a/content/docs/reference/config/index.html b/content/docs/reference/config/index.html
deleted file mode 100644
index d0157ee..0000000
--- a/content/docs/reference/config/index.html
+++ /dev/null
@@ -1,1482 +0,0 @@
- Apache BookKeeper - BookKeeper configuration
- - - - -
-
-
-
- - - -
-
- -
-
- - -

A reference guide to all of BookKeeper's configurable parameters

-
- -
- -
-
-

The table below lists parameters that you can set to configure bookies. All configuration takes place in the bk_server.conf file in the bookkeeper-server/conf directory of your BookKeeper installation.

- -

Server parameters

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Parameter  Description  Default
bookiePort

The port that the bookie server listens on.

-
3181
journalDirectories

The directories to which BookKeeper outputs its write-ahead log (WAL). Multiple directories can be defined, separated by commas, for example: journalDirectories=/tmp/bk-journal1,/tmp/bk-journal2. If journalDirectories is set, bookies ignore journalDirectory and use these directories instead.

-
/tmp/bk-journal
journalDirectory

The directory to which BookKeeper outputs its write-ahead log (WAL).

-
/tmp/bk-txn
allowMultipleDirsUnderSameDiskPartition

Configure the bookie to allow/disallow multiple ledger/index/journal directories in the same filesystem disk partition

-
indexDirectories

The directories in which index files are stored. If not specified, the value of ledgerDirectories will be used.

-
/tmp/bk-data
minUsableSizeForIndexFileCreation

The minimum usable size, in bytes, that must be available in the index directory for the bookie to create index files while replaying the journal when it starts in readonly mode.

-
1073741824
listeningInterface

The network interface that the bookie should listen on. If not set, the bookie will listen on all interfaces.

-
eth0
advertisedAddress

Configure a specific hostname or IP address that the bookie should use to advertise itself to clients. If not set, the bookie will advertise its own IP address or hostname, depending on the listeningInterface and useHostNameAsBookieID settings.

-
eth0
allowLoopback

Whether the bookie is allowed to use a loopback interface as its primary interface (the interface it uses to establish its identity). By default, loopback interfaces are not allowed as the primary interface.

- -

Using a loopback interface as the primary interface usually indicates a configuration error. It’s fairly common in some VPS setups, for example, to not configure a hostname or to have the hostname resolve to 127.0.0.1. If this is the case, then all bookies in the cluster will establish their identities as 127.0.0.1:3181, and only one will be able to join the cluster. For VPSs configured like this, you should explicitly set the listening interface.

-
false
bookieDeathWatchInterval

Interval to watch whether bookie is dead or not, in milliseconds.

-
1000
flushInterval

How often to flush ledger index pages to disk, in milliseconds. Flushing index files introduces a lot of random disk I/O. If the journal directory and ledger directories are on different devices, flushing does not affect performance. But if they are on the same device, performance degrades significantly when flushing is too frequent. You can consider increasing the flush interval to get better performance, but you need to pay more time on bookie server re [...] -

100
allowStorageExpansion

Allow the expansion of bookie storage capacity. Newly added ledger and index directories must be empty.

-
false
useHostNameAsBookieID

Whether the bookie should use its hostname to register with the ZooKeeper coordination service. When false, the bookie will use its IP address for the registration.

-
false
allowEphemeralPorts

Whether the bookie is allowed to use an ephemeral port (port 0) as its server port. By default, an ephemeral port is not allowed. Using an ephemeral port as the service port usually indicates a configuration error. However, in unit tests, using an ephemeral port will address port conflict problems and allow running tests in parallel.

-
false
enableLocalTransport

Whether to allow the bookie to listen for BookKeeper clients running in the local JVM.

-
false
disableServerSocketBind

Whether to disable binding to network interfaces. If binding is disabled, the bookie is available only to BookKeeper clients running in the local JVM.

-
false
skipListArenaChunkSize

The number of bytes we should use as chunk allocation for org.apache.bookkeeper.bookie.SkipListArena

-
4194304
skipListArenaMaxAllocSize

The max size we should allocate from the skiplist arena. Allocations larger than this should be allocated directly by the VM to avoid fragmentation.

-
131072
bookieAuthProviderFactoryClass

The bookie authentication provider factory class name. If this is null, no authentication will take place.

-
- -

Garbage collection settings

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Parameter  Description  Default
gcWaitTime

The interval between garbage collection runs, in milliseconds. Since garbage collection runs in the background, running it too frequently will hurt performance. It is better to use a longer interval if there is enough disk capacity.

-
1000
gcOverreplicatedLedgerWaitTime

The interval between garbage collection runs for overreplicated ledgers, in milliseconds. This should not run very frequently, since it reads the metadata for all ledgers on the bookie from ZooKeeper.

-
86400000
numAddWorkerThreads

The number of threads that handle write requests. If 0, writes are handled by Netty threads directly.

-
1
numReadWorkerThreads

The number of threads that handle read requests. If 0, reads are handled by Netty threads directly.

-
1
isForceGCAllowWhenNoSpace

Whether force compaction is allowed when the disk is full or almost full. Forcing GC may get some space back, but may also fill up disk space more quickly. This is because new log files are created before GC, while old garbage log files are deleted after GC.

-
false
- -

TLS settings

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Parameter  Description  Default
tslProvider

TLS provider (JDK or OpenSSL)

-
OpenSSL
tslProviderFactoryClass

The path to the class that provides security.

-
org.apache.bookkeeper.security.SSLContextFactory
tslClientAuthentication

Type of security used by server.

-
true
tslKeyStoreType

Bookie Keystore type.

-
JKS
tslKeyStore

Bookie Keystore location (path).

-
tslKeyStorePasswordPath

Bookie Keystore password path, if the keystore is protected by a password.

-
tslTrustStoreType

Bookie Truststore type.

-
tslTrustStore

Bookie Truststore location (path).

-
tslTrustStorePasswordPath

Bookie Truststore password path, if the truststore is protected by a password.

-
- -

Long poll request parameter settings

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Parameter | Description | Default
numLongPollWorkerThreads

The number of threads that should handle long poll requests.

-
10
requestTimerTickDurationMs

The tick duration in milliseconds for long poll requests.

-
10
requestTimerNumTicks

The number of ticks per wheel for the long poll request timer.

-
1024
- -

AutoRecovery settings

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Parameter | Description | Default
auditorPeriodicBookieCheckInterval

The time interval between auditor bookie checks, in seconds. The auditor bookie check checks ledger metadata to see which bookies should contain entries for each ledger. If a bookie that should contain entries is unavailable, then the ledger containing that entry is marked for recovery. Setting this to 0 disables the periodic check. Bookie checks will still run when a bookie fails. The default is once per day.

-
86400
rereplicationEntryBatchSize

The number of entries that a replication will rereplicate in parallel.

-
10
openLedgerRereplicationGracePeriod

The grace period, in seconds, that the replication worker waits before fencing and replicating a ledger fragment that’s still being written to upon bookie failure.

-
30
autoRecoveryDaemonEnabled

Whether the bookie itself should also start the autorecovery service.

-
lostBookieRecoveryDelay

How long to wait, in seconds, before starting autorecovery of a lost bookie.

-
0
- -

Netty server settings

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Parameter | Description | Default
serverTcpNoDelay

This setting is used to enable or disable Nagle’s algorithm, which improves the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network.

- -

If you are sending many small messages, such that more than one can fit in a single IP packet, setting server.tcpnodelay to false to enable Nagle’s algorithm can provide better performance.

-
true
serverSockKeepalive

This setting is used to send keep-alive messages on connection-oriented sockets.

-
true
serverTcpLinger

The socket linger timeout on close. When enabled, a close or shutdown will not return until all queued messages for the socket have been successfully sent or the linger timeout has been reached. Otherwise, the call returns immediately and the closing is done in the background.

-
0
byteBufAllocatorSizeInitial

The Recv ByteBuf allocator initial buf size.

-
65536
byteBufAllocatorSizeMin

The Recv ByteBuf allocator min buf size.

-
65536
byteBufAllocatorSizeMax

The Recv ByteBuf allocator max buf size.

-
1048576
- -

Journal settings

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Parameter | Description | Default
journalFormatVersionToWrite

The journal format version to write. Available formats are 1-5: 1: no header; 2: a header section was added; 3: ledger key was introduced; 4: fencing key was introduced; 5: expanding header to 512 and padding writes to align to the sector size configured by journalAlignmentSize.

- -

By default, it is 4. To enable the padding-writes feature, set the journal version to 5. You can disable padding-writes by setting the journal version back to 4. This feature is available in version 4.5.0 and later.

-
4
journalMaxSizeMB

The maximum size of a journal file, in megabytes. A new journal file is created when the old one reaches the size limit.

-
2048
journalMaxBackups

The maximum number of old journal files to keep. Keeping a number of old journal files can help data recovery in special cases.

-
5
journalPreAllocSizeMB

How much space should we pre-allocate at a time in the journal.

-
16
journalWriteBufferSizeKB

Size of the write buffers used for the journal.

-
64
journalRemoveFromPageCache

Whether to remove pages from the page cache after a force write.

-
false
journalAdaptiveGroupWrites

Whether to group journal force writes, which optimizes group commit for higher throughput.

-
true
journalMaxGroupWaitMSec

Maximum latency to impose on a journal write to achieve grouping.

-
200
journalBufferedWritesThreshold

Maximum writes to buffer to achieve grouping.

-
524288
journalFlushWhenQueueEmpty

Whether to flush the journal when the journal queue is empty.

-
false
numJournalCallbackThreads

The number of threads that should handle journal callbacks.

-
1
journalAlignmentSize

All the journal writes and commits should be aligned to given size. If not, zeros will be padded to align to given size.

-
512
journalBufferedEntriesThreshold

Maximum number of entries to buffer on a journal write to achieve grouping.

-
0
- -
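The padding described for journalAlignmentSize comes down to simple modular arithmetic. The sketch below is only an illustration of that arithmetic, not BookKeeper's implementation: the function names are hypothetical, and any framing the real journal format adds around padded writes is ignored.

```python
def padding_needed(position: int, alignment: int = 512) -> int:
    """Number of zero bytes required to bring `position` up to a multiple of `alignment`."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    remainder = position % alignment
    return 0 if remainder == 0 else alignment - remainder


def pad_to_alignment(buffer: bytes, alignment: int = 512) -> bytes:
    """Append zero bytes so the buffer length is a multiple of `alignment`."""
    return buffer + b"\x00" * padding_needed(len(buffer), alignment)
```

With the default alignment of 512, a 1000-byte write would be padded with 24 zero bytes to reach 1024.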

Ledger storage settings

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Parameter | Description | Default
ledgerStorageClass

Ledger storage implementation class

-
org.apache.bookkeeper.bookie.SortedLedgerStorage
ledgerDirectories

The directory to which BookKeeper outputs ledger snapshots. You can define multiple directories to store snapshots, separated by commas, for example /tmp/data-dir1,/tmp/data-dir2.

-
/tmp/bk1-data,/tmp/bk2-data
auditorPeriodicCheckInterval

The time interval, in seconds, at which the auditor will check all ledgers in the cluster. By default this runs once a week.

- -

Set this to 0 to disable the periodic check completely. Note that periodic checking will put extra load on the cluster, so it should not be run more frequently than once a day.

-
604800
sortedLedgerStorageEnabled

Whether sorted-ledger storage is enabled (default true).

-
true
skipListSizeLimit

The skip list data size limitation (default 64MB) in EntryMemTable

-
67108864L
- -

Ledger cache settings

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Parameter | Description | Default
openFileLimit

The maximum number of ledger index files that can be open in the bookie server. If the number of open ledger index files reaches this limit, the bookie server starts to swap some ledgers from memory to disk. Swapping too frequently will affect performance. You can tune this number according to your requirements.

-
900
pageSize

Size of an index page in the ledger cache, in bytes. A larger index page can improve performance when writing pages to disk, which is efficient when you have a small number of ledgers with a similar number of entries each. If you have a large number of ledgers and each ledger has fewer entries, a smaller index page improves memory usage.

-
8192
pageLimit

The number of index pages provided in the ledger cache. If the number of index pages reaches this limit, the bookie server starts to swap some ledgers from memory to disk. You can increase this value if swapping becomes more frequent, but make sure that pageLimit*pageSize does not exceed the JVM’s maximum memory, otherwise you will get an OutOfMemoryException. In general, increasing pageLimit and using a smaller index page gives better performance with a larger number of ledgers wi [...] -

1
- -
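The pageLimit*pageSize constraint above can be checked before a bookie is started. The following is a minimal sketch of that check; the function names and the idea of passing the heap size in as a plain number are assumptions made for illustration, not part of BookKeeper.

```python
def index_cache_bytes(page_limit: int, page_size: int = 8192) -> int:
    """Memory the ledger index cache may use: pageLimit * pageSize."""
    return page_limit * page_size


def fits_in_heap(page_limit: int, page_size: int, max_heap_bytes: int) -> bool:
    """True if the index cache budget stays within the given JVM heap size."""
    return index_cache_bytes(page_limit, page_size) <= max_heap_bytes
```

For example, 200000 pages of 8192 bytes need about 1.6 GB, which does not fit in a 1 GiB heap, while 1000 pages need roughly 8 MB.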

Ledger manager settings

- - - - - - - - - - - - - - - - - - - - - - - - -
Parameter | Description | Default
ledgerManagerType

The ledger manager type, which defines how ledgers are stored, managed, and garbage collected. See the Ledger Manager guide for more details.

-
flat
zkLedgersRootPath

Root Zookeeper path to store ledger metadata. This parameter is used by zookeeper-based ledger manager as a root znode to store all ledgers.

-
/ledgers
- -

Entry log settings

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Parameter | Description | Default
logSizeLimit

Max file size of entry logger, in bytes. A new entry log file will be created when the old one reaches the file size limitation.

-
2147483648
entryLogFilePreallocationEnabled

Enable/Disable entry logger preallocation

-
true
flushEntrylogBytes

Entry log flush interval, in bytes. Setting this to 0 or less disables this feature and makes flush happen on log rotation. Flushing in smaller chunks but more frequently reduces spikes in disk I/O. Flushing too frequently may negatively affect performance.

-
0
readBufferSizeBytes

The capacity allocated for BufferedReadChannels, in bytes.

-
512
writeBufferSizeBytes

The number of bytes used as capacity for the write buffer.

-
65536
- -

Entry log compaction settings

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Parameter | Description | Default
compactionRate

The rate at which compaction will read entries. The unit is adds per second.

-
1000
minorCompactionThreshold

Threshold of minor compaction. Entry log files whose remaining size percentage falls below this threshold will be compacted in a minor compaction. If it is set to less than zero, minor compaction is disabled.

-
0.2
minorCompactionInterval

Interval to run minor compaction, in seconds. If it is set to less than zero, the minor compaction is disabled.

-
compactionMaxOutstandingRequests

Set the maximum number of entries which can be compacted without flushing. When compacting, the entries are written to the entrylog and the new offsets are cached in memory. Once the entrylog is flushed the index is updated with the new offsets. This parameter controls the number of entries added to the entrylog before a flush is forced. A higher value for this parameter means more memory will be used for offsets. Each offset consists of 3 longs. This parameter should n [...] -

100000
majorCompactionThreshold

Threshold of major compaction. Entry log files whose remaining size percentage falls below this threshold will be compacted in a major compaction. Entry log files whose remaining size percentage is still higher than the threshold will never be compacted. If it is set to less than zero, major compaction is disabled.

-
0.8
majorCompactionInterval

Interval to run major compaction, in seconds. If it is set to less than zero, the major compaction is disabled.

-
86400
isThrottleByBytes

Throttle compaction by bytes or by entries.

-
false
compactionRateByEntries

Set the rate at which compaction will read entries. The unit is adds per second.

-
1000
compactionRateByBytes

Set the rate at which compaction will read entries. The unit is bytes added per second.

-
1000000
- -
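To make the two compaction thresholds concrete, the sketch below picks the entry logs that a compaction pass would consider, given the fraction of each log that is still live. It is an assumption-laden illustration: the function name and the dictionary of usage ratios are hypothetical, and the real compactor also applies the intervals and rate limits listed above.

```python
from typing import Dict, List


def select_for_compaction(usage_by_log: Dict[int, float], threshold: float) -> List[int]:
    """Return ids of entry logs whose remaining-size ratio is below `threshold`.

    A threshold below zero disables the compaction pass, as described above.
    """
    if threshold < 0:
        return []
    return sorted(log_id for log_id, usage in usage_by_log.items() if usage < threshold)
```

With usage ratios of 0.1, 0.5 and 0.9 for logs 1, 2 and 3, a minor pass at 0.2 selects only log 1, and a major pass at 0.8 selects logs 1 and 2. Log 3 is never compacted.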

Statistics

- - - - - - - - - - - - - - - - - - - - - - - - -
Parameter | Description | Default
enableStatistics

Whether statistics are enabled for the bookie.

-
true
statsProviderClass

Stats provider class.

-
org.apache.bookkeeper.stats.CodahaleMetricsProvider
- -

Read-only mode support

- - - - - - - - - - - - - - - - - - - - - - - - -
Parameter | Description | Default
readOnlyModeEnabled

If all configured ledger directories are full, then support only read requests from clients. If “readOnlyModeEnabled=true”, then when all ledger disks are full the bookie is converted to read-only mode and serves only read requests. Otherwise the bookie is shut down. By default this is disabled.

-
false
forceReadOnlyBookie

Whether the bookie is force started in read only mode or not.

-
false
- -

Disk utilization

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Parameter | Description | Default
diskUsageThreshold

For each ledger dir, the maximum disk space which can be used. The default is 0.95f, i.e. at most 95% of the disk can be used, after which nothing will be written to that partition. If all ledger dir partitions are full, the bookie will turn to read-only mode if ‘readOnlyModeEnabled=true’ is set; otherwise it will shut down. Valid values are between 0 and 1 (exclusive).

-
0.95
diskUsageLwmThreshold

Set the disk free space low water mark threshold. The disk is considered full when the usage threshold is exceeded, and returns to the non-full state when usage drops below the low water mark threshold. This prevents it from going back and forth between these states frequently when concurrent writes and compaction are happening. It also prevents the bookie from switching frequently between read-only and read-write states in the same cases.

-
0.9
diskUsageWarnThreshold

The disk free space low water mark threshold. The disk is considered full when the usage threshold is exceeded, and returns to the non-full state when usage drops below the low water mark threshold. This prevents it from going back and forth between these states frequently when concurrent writes and compaction are happening. It also prevents the bookie from switching frequently between read-only and read-write states in the same cases.

-
0.95
diskCheckInterval

Disk check interval in milliseconds. Interval to check the ledger dirs usage.

-
10000
- -
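The interplay between diskUsageThreshold and diskUsageLwmThreshold is a hysteresis: the disk becomes "full" above the upper threshold and only becomes "non-full" again below the low water mark. The sketch below models that behaviour under the defaults listed above. The class and method names are hypothetical, and this is not the bookie's actual disk checker.

```python
class DiskState:
    """Tracks full/non-full state with a high threshold and a low water mark."""

    def __init__(self, full_threshold: float = 0.95, lwm_threshold: float = 0.90):
        self.full_threshold = full_threshold
        self.lwm_threshold = lwm_threshold
        self.full = False

    def update(self, usage: float) -> bool:
        """Feed one usage sample (0..1) and return whether the disk counts as full."""
        if not self.full and usage > self.full_threshold:
            self.full = True
        elif self.full and usage < self.lwm_threshold:
            self.full = False
        return self.full
```

A disk that climbs to 96% and then hovers at 93% stays in the full state, so the bookie would not flip between read-only and read-write on every check. It leaves the full state only once usage falls under 90%.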

ZooKeeper parameters

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Parameter | Description | Default
zkServers

A list of one or more servers on which ZooKeeper is running. The server list is a set of comma-separated values, for example zkServers=zk1:2181,zk2:2181,zk3:2181.

-
localhost:2181
zkTimeout

ZooKeeper client session timeout in milliseconds. The bookie server will exit if it receives SESSION_EXPIRED because it was partitioned off from ZooKeeper for longer than the session timeout. JVM garbage collection and disk I/O can also cause SESSION_EXPIRED; increasing this value can help avoid the issue.

-
10
zkRetryBackoffStartMs

The Zookeeper client backoff retry start time in millis.

-
1000
zkRetryBackoffMaxMs

The Zookeeper client backoff retry max time in millis.

-
10000
zkEnableSecurity

Set ACLs on every node written to ZooKeeper, so that only allowed users can read and write the BookKeeper metadata stored in ZooKeeper. For ACLs to work, you need to set up ZooKeeper JAAS authentication. All the bookies and clients need to share the same user, which is usually done using Kerberos authentication. See the ZooKeeper documentation.

-
false
-
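As a rough illustration of how zkRetryBackoffStartMs and zkRetryBackoffMaxMs bound the retry delays, the sketch below assumes a plain doubling schedule capped at the maximum. That schedule is an assumption for illustration only: the actual ZooKeeper client retry policy used by BookKeeper may add randomisation or differ in other ways, and the function name is hypothetical.

```python
from typing import List


def backoff_delays(start_ms: int = 1000, max_ms: int = 10000, retries: int = 6) -> List[int]:
    """Delay before each retry: doubles from start_ms and never exceeds max_ms."""
    return [min(start_ms * (2 ** attempt), max_ms) for attempt in range(retries)]
```

With the defaults listed above, the first delays would be 1000, 2000, 4000 and 8000 ms, after which every retry waits the 10000 ms maximum.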

- -
- - -
-
- - -
-
- - - -
-

An entry is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Entries are also known as records.

- -
- -
-

A ledger is a sequence of entries written to BookKeeper. Entries are written sequentially to ledgers and at most once, giving ledgers append-only semantics.

- -
- -
-

A bookie is an individual BookKeeper storage server.

- -

Bookies store the content of ledgers and act as a distributed ensemble.

- -
- -
-

A subsystem that runs in the background on bookies to ensure that ledgers are fully replicated even if one bookie from the ensemble is down.

- -
- -
-

Striping is the process of distributing BookKeeper ledgers to sub-groups of bookies rather than to all bookies in a BookKeeper ensemble.

- -

Striping is essential to ensuring fast performance.

- -
- -
-

A journal file stores BookKeeper transaction logs.

- -
- -
-

When a reader forces a ledger to close, preventing any further entries from being written to the ledger.

- -
- -
-

A record is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Records are also known as entries.

- -
- - - - -
- - - - - - - - diff --git a/content/docs/reference/metrics/index.html b/content/docs/reference/metrics/index.html deleted file mode 100644 index 8ad79ae..0000000 --- a/content/docs/reference/metrics/index.html +++ /dev/null @@ -1,511 +0,0 @@ - - - - Apache BookKeeper - BookKeeper metrics reference - - - - - - - - - - - - - - - - - -
- - - - -
-
-
-
- - - -
-
- -
-
- - - -
- -
- -
-
- - -
- - -
-
- -
- - -
-

BookKeeper metrics reference

-
    -
-
- - - -
-
-
- - - -
-

- - - - -
- - - - - - - - diff --git a/content/docs/security/index.html b/content/docs/security/index.html deleted file mode 100644 index 30c48d5..0000000 --- a/content/docs/security/index.html +++ /dev/null @@ -1,588 +0,0 @@ - - - - Apache BookKeeper - BookKeeper Security - - - - - - - - - - - - - - - - - -
- - - - -
-
-
-
- - - - -
-
- -
-
- - - -
- -
- -
-
-

In the 4.5.0 release, the BookKeeper community added a number of features that can be used, together or separately, to secure a BookKeeper cluster. The following security measures are currently supported:

- -
    -
  1. Authentication of connections to bookies from clients, using either TLS or SASL (Kerberos).
  2. Authentication of connections from clients, bookies, and autorecovery daemons to ZooKeeper, when using ZooKeeper-based ledger managers.
  3. Encryption of data transferred between bookies and clients, and between bookies and autorecovery daemons, using TLS.
- -

It’s worth noting that security is optional - non-secured clusters are supported, as well as a mix of authenticated, unauthenticated, encrypted and non-encrypted clients.

- -

NOTE: authorization is not yet available in 4.5.0. The Apache BookKeeper community is looking to add this feature in subsequent releases.

- -

Next Steps

- - - -
- - - - -
-
- -
- - -
-

BookKeeper Security

- -
- - - -
-
-
- - - -
-

- - - - -
- - - - - - - - diff --git a/content/docs/security/sasl/index.html b/content/docs/security/sasl/index.html deleted file mode 100644 index 24b278e..0000000 --- a/content/docs/security/sasl/index.html +++ /dev/null @@ -1,799 +0,0 @@ - - - - Apache BookKeeper - Authentication using SASL - - - - - - - - - - - - - - - - - -
- - - - -
-
-
-
- - - - -
-
- -
-
- - - -
- -
- -
-
-

Bookies support client authentication via SASL. Currently we only support GSSAPI (Kerberos). We will start with a general description of how to configure SASL for bookies, clients and autorecovery daemons, followed by mechanism-specific details, and wrap up with some operational details.

- -

SASL configuration for Bookies

- -
    -
  1. Select the mechanisms to enable in the bookies. GSSAPI is the only mechanism currently supported by BookKeeper.
  2. Add a JAAS config file for the selected mechanisms as described in the examples for setting up GSSAPI (Kerberos).
  3. Pass the JAAS config file location as a JVM parameter to each bookie. For example:

     -Djava.security.auth.login.config=/etc/bookkeeper/bookie_jaas.conf

  4. Enable the SASL auth plugin in bookies by setting bookieAuthProviderFactoryClass to org.apache.bookkeeper.sasl.SASLBookieAuthProviderFactory.

     bookieAuthProviderFactoryClass=org.apache.bookkeeper.sasl.SASLBookieAuthProviderFactory

  5. If you are running autorecovery along with bookies, also enable the SASL auth plugin for autorecovery by setting clientAuthProviderFactoryClass to org.apache.bookkeeper.sasl.SASLClientProviderFactory.

     clientAuthProviderFactoryClass=org.apache.bookkeeper.sasl.SASLClientProviderFactory

  6. Follow the steps in GSSAPI (Kerberos) to configure SASL.
- -

Important Notes

- -
    -
  1. Bookie is a section name in the JAAS file used by each bookie. This section tells the bookie which principal to use and the location of the keytab where the principal is stored. It allows the bookie to log in using the keytab specified in this section.
  2. Auditor is a section name in the JAAS file used by the autorecovery daemon (which can be co-run with bookies). This section tells the autorecovery daemon which principal to use and the location of the keytab where the principal is stored. It allows the autorecovery daemon to log in using the keytab specified in this section.
  3. The Client section is used to authenticate a SASL connection with ZooKeeper. It also allows the bookies to set ACLs on ZooKeeper nodes, which locks these nodes down so that only the bookies can modify them. It is necessary to have the same primary name across all bookies. If you want to use a section name other than Client, set the system property zookeeper.sasl.client to the appropriate name (e.g. -Dzookeeper.sasl.client=ZKClient).
  4. ZooKeeper uses zookeeper as the service name by default. If you want to change this, set the system property zookeeper.sasl.client.username to the appropriate name (e.g. -Dzookeeper.sasl.client.username=zk).
- -

SASL configuration for Clients

- -

To configure SASL authentication on the clients:

- -
    -
  1. Select a SASL mechanism for authentication and add a JAAS config file for the selected mechanism as described in the examples for setting up GSSAPI (Kerberos).
  2. Pass the JAAS config file location as a JVM parameter to each client JVM. For example:

     -Djava.security.auth.login.config=/etc/bookkeeper/bookkeeper_jaas.conf

  3. Configure the following properties in the BookKeeper ClientConfiguration:

     clientAuthProviderFactoryClass=org.apache.bookkeeper.sasl.SASLClientProviderFactory
- -

Follow the steps in GSSAPI (Kerberos) to configure SASL for the selected mechanism.

- -

Authentication using SASL/Kerberos

- -

Prerequisites

- -

Kerberos

- -

If your organization is already using a Kerberos server (for example, by using Active Directory), there is no need to install a new server just for BookKeeper. Otherwise you will need to install one. Your Linux vendor likely has packages for Kerberos and a short guide on how to install and configure it (Ubuntu, Redhat). Note that if you are using Oracle Java, you will need to download the JCE policy files for your Java version and copy them to $JAVA_HOME/jre/lib/security.

- -

Kerberos Principals

- -

If you are using the organization’s Kerberos or Active Directory server, ask your Kerberos administrator for a principal for each bookie in your cluster and for every operating system user that will access BookKeeper with Kerberos authentication (via clients and tools).

- -

If you have installed your own Kerberos, you will need to create these principals yourself using the following commands:

- -
sudo /usr/sbin/kadmin.local -q 'addprinc -randkey bookkeeper/{hostname}@{REALM}'
sudo /usr/sbin/kadmin.local -q "ktadd -k /etc/security/keytabs/{keytabname}.keytab bookkeeper/{hostname}@{REALM}"
-
-
- -
All hosts must be reachable using hostnames
- -

It is a Kerberos requirement that all your hosts can be resolved with their FQDNs.

- -

Configuring Bookies

- -
    -
  1. Add a suitably modified JAAS file similar to the one below to each bookie’s config directory. Let’s call it bookie_jaas.conf for this example (note that each bookie should have its own keytab):

     Bookie {
         com.sun.security.auth.module.Krb5LoginModule required
         useKeyTab=true
         storeKey=true
         keyTab="/etc/security/keytabs/bookie.keytab"
         principal="bookkeeper/bk1.hostname.com@EXAMPLE.COM";
     };
     // ZooKeeper client authentication
     Client {
         com.sun.security.auth.module.Krb5LoginModule required
         useKeyTab=true
         storeKey=true
         keyTab="/etc/security/keytabs/bookie.keytab"
         principal="bookkeeper/bk1.hostname.com@EXAMPLE.COM";
     };
     // If you are running `autorecovery` along with bookies
     Auditor {
         com.sun.security.auth.module.Krb5LoginModule required
         useKeyTab=true
         storeKey=true
         keyTab="/etc/security/keytabs/bookie.keytab"
         principal="bookkeeper/bk1.hostname.com@EXAMPLE.COM";
     };

     The Bookie section in the JAAS file tells the bookie which principal to use and the location of the keytab where this principal is stored. It allows the bookie to log in using the keytab specified in this section. See notes for more details on ZooKeeper’s SASL configuration.

  2. Pass the name of the JAAS file as a JVM parameter to each bookie:

     -Djava.security.auth.login.config=/etc/bookkeeper/bookie_jaas.conf

     You may also wish to specify the path to the krb5.conf file (see JDK’s Kerberos Requirements for more details):

     -Djava.security.krb5.conf=/etc/bookkeeper/krb5.conf

  3. Make sure the keytabs configured in the JAAS file are readable by the operating system user who is starting the bookies.
  4. Enable the SASL authentication plugin in the bookies by setting the following parameters.

     bookieAuthProviderFactoryClass=org.apache.bookkeeper.sasl.SASLBookieAuthProviderFactory
     # if you run `autorecovery` along with bookies
     clientAuthProviderFactoryClass=org.apache.bookkeeper.sasl.SASLClientProviderFactory
- -
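Because each bookie needs its own principal and keytab, the JAAS file is a natural candidate for templating. The sketch below renders the Bookie, Client and (optionally) Auditor sections from a hostname and realm. It is a hypothetical helper written for illustration, not a tool shipped with BookKeeper; the function names are assumptions, and the output mirrors the example layout above.

```python
def jaas_section(name: str, keytab: str, principal: str) -> str:
    """Render one JAAS login section that uses a Kerberos keytab."""
    return (
        f"{name} {{\n"
        "    com.sun.security.auth.module.Krb5LoginModule required\n"
        "    useKeyTab=true\n"
        "    storeKey=true\n"
        f'    keyTab="{keytab}"\n'
        f'    principal="{principal}";\n'
        "};\n"
    )


def bookie_jaas(hostname: str, realm: str, keytab: str, with_autorecovery: bool = False) -> str:
    """Render the sections for one bookie; Auditor is added only for autorecovery."""
    principal = f"bookkeeper/{hostname}@{realm}"
    names = ["Bookie", "Client"] + (["Auditor"] if with_autorecovery else [])
    return "\n".join(jaas_section(name, keytab, principal) for name in names)
```

Calling `bookie_jaas("bk1.hostname.com", "EXAMPLE.COM", "/etc/security/keytabs/bookie.keytab", True)` yields three sections that share one principal, which can then be written to bookie_jaas.conf.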

Configuring Clients

- -

To configure SASL authentication on the clients:

- -
    -
  1. Clients will authenticate to the cluster with their own principal (usually with the same name as the user running the client), so obtain or create these principals as needed. Then create a JAAS file for each principal. The BookKeeper section describes how clients such as writers and readers can connect to the bookies. The following is an example configuration for a client using a keytab (recommended for long-running processes):

     BookKeeper {
         com.sun.security.auth.module.Krb5LoginModule required
         useKeyTab=true
         storeKey=true
         keyTab="/etc/security/keytabs/bookkeeper.keytab"
         principal="bookkeeper-client-1@EXAMPLE.COM";
     };

  2. Pass the name of the JAAS file as a JVM parameter to the client JVM:

     -Djava.security.auth.login.config=/etc/bookkeeper/bookkeeper_jaas.conf

     You may also wish to specify the path to the krb5.conf file (see JDK’s Kerberos Requirements for more details).

     -Djava.security.krb5.conf=/etc/bookkeeper/krb5.conf

  3. Make sure the keytabs configured in bookkeeper_jaas.conf are readable by the operating system user who is starting the BookKeeper client.
  4. Enable the SASL authentication plugin in the client by setting the following parameters.

     clientAuthProviderFactoryClass=org.apache.bookkeeper.sasl.SASLClientProviderFactory
- -

Enabling Logging for SASL

- -

To enable SASL debug output, set the sun.security.krb5.debug system property to true.

- - -
- - - - -
-
- - -
-
- - - -
-

- - - - -
- - - - - - - - diff --git a/content/docs/security/tls/index.html b/content/docs/security/tls/index.html deleted file mode 100644 index a1c233a..0000000 --- a/content/docs/security/tls/index.html +++ /dev/null @@ -1,790 +0,0 @@ - - - - Apache BookKeeper - Encryption and Authentication using TLS - - - - - - - - - - - - - - - - - -
- - - - -
-
-
-
- - - - -
-
- -
-
- - - -
- -
- -
-
-

Apache BookKeeper allows clients and autorecovery daemons to communicate over TLS, although this is not enabled by default.

- -

Overview

- -

The bookies need their own key and certificate in order to use TLS. Clients can optionally provide a key and a certificate for mutual authentication. Each bookie or client can also be configured with a truststore, which is used to determine which certificates (bookie or client identities) to trust (authenticate).

- -

The truststore can be configured in many ways. To understand the truststore, consider the following two examples:

- -
    -
  1. the truststore contains one or many certificates;
  2. it contains a certificate authority (CA).
- -

In (1), with a list of certificates, the bookie or client will trust any certificate listed in the truststore. In (2), with a CA, the bookie or client will trust any certificate that was signed by the CA in the truststore.

- -
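The difference between the two truststore setups can be shown with a toy model. The sketch below is deliberately simplified and rests on stated assumptions: certificates are reduced to a subject and an issuer name, no signatures are verified, and the `Cert` class and `is_trusted` function are hypothetical names. A real TLS stack performs full chain and signature validation.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str  # equals subject for a self-signed certificate


def is_trusted(cert: Cert, truststore: Set[Cert]) -> bool:
    """Trusted if listed directly (case 1) or issued by a listed certificate (case 2)."""
    if cert in truststore:
        return True
    return any(cert.issuer == entry.subject for entry in truststore)
```

With a truststore holding only a CA certificate, every certificate issued by that CA is accepted without being listed individually. With a truststore holding individual certificates, only those exact certificates are accepted.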

(TBD: benefits)

- -

Generate TLS key and certificate

- -

The first step of deploying TLS is to generate the key and the certificate for each machine in the cluster. You can use Java’s keytool utility to accomplish this task. We will generate the key into a temporary keystore initially so that we can export and sign it later with the CA.

- -
keytool -keystore bookie.keystore.jks -alias localhost -validity {validity} -genkey
-
-
- -

You need to specify two parameters in the above command:

- -
  1. keystore: the keystore file that stores the certificate. The keystore file contains the private key of the certificate; hence, it needs to be kept safely.
  2. validity: the valid time of the certificate in days.
- -
Ensure that the common name (CN) exactly matches the fully qualified domain name (FQDN) of the server. The client compares the CN with the DNS domain name to ensure that it is indeed connecting to the desired server, not a malicious one.
- -
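The keytool command above prompts for the distinguished name and passwords. A non-interactive variant can be sketched as follows; the function names, the example FQDN, and the way the password is passed are illustrative assumptions, not part of BookKeeper. The `-dname` option is what pins the CN to the bookie's FQDN.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and arguments are hypothetical.

# Build an X.500 distinguished name whose CN is the bookie's FQDN.
bookie_dname() {
  printf 'CN=%s' "$1"
}

# Generate a key pair and self-signed certificate without interactive prompts.
# Arguments: <fqdn> <validity-in-days> <keystore-password>
generate_bookie_keystore() {
  local fqdn="$1" validity_days="$2" store_pass="$3"
  keytool -genkeypair \
    -keystore bookie.keystore.jks -storetype JKS \
    -alias localhost \
    -keyalg RSA -keysize 2048 \
    -validity "$validity_days" \
    -dname "$(bookie_dname "$fqdn")" \
    -storepass "$store_pass" -keypass "$store_pass"
}
```

A call such as `generate_bookie_keystore "$(hostname -f)" 365 "$KEYSTORE_PASSWORD"` would then create `bookie.keystore.jks` in the current directory. Passing the password as an argument exposes it to other local users through the process list, so this form suits throwaway test setups better than production hosts.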

Creating your own CA

- -

After the first step, each machine in the cluster has a public-private key pair, and a certificate to identify the machine. The certificate, however, is unsigned, which means that an attacker can create such a certificate to pretend to be any machine.

- -

Therefore, it is important to prevent forged certificates by signing them for each machine in the cluster. A certificate authority (CA) is responsible for signing certificates. A CA works like a government that issues passports: the government stamps (signs) each passport so that the passport becomes difficult to forge. Other governments verify the stamps to ensure the passport is authentic. Similarly, the CA signs the certificates, and the cryptography guarantees that a signed certificate is computationally difficult to forge. Thus, as long as the CA is a genuine and trusted authority, the clients have high assurance that they are connecting to the authentic machines.

- -
openssl req -new -x509 -keyout ca-key -out ca-cert -days 365
-
-
- -

The generated CA is simply a public-private key pair and certificate, and it is intended to sign other certificates.

- -

The next step is to add the generated CA to the clients’ truststore so that the clients can trust this CA:

- -
keytool -keystore client.truststore.jks -alias CARoot -import -file ca-cert
-
-
- -

NOTE: If you configure the bookies to require client authentication by setting tlsClientAuthentication to true in the bookie config, then you must also provide a truststore for the bookies, and it should contain all the CA certificates that the clients’ keys were signed by.

- -
keytool -keystore bookie.truststore.jks -alias CARoot -import -file ca-cert
-
-
- -

In contrast to the keystore, which stores each machine’s own identity, the truststore of a client stores all the certificates that the client should trust. Importing a certificate into one’s truststore also means trusting all certificates that are signed by that certificate. As in the analogy above, trusting the government (CA) also means trusting all passports (certificates) that it has issued. This attribute is called the chain of trust, and it is particularly useful when deploying TLS on a large BookKeeper cluster. You can sign all certificates in the cluster with a single CA, and have all machines share the same truststore that trusts the CA. That way all machines can authenticate all other machines.

- -

Signing the certificate

- -

The next step is to sign all certificates in the keystore with the CA we generated. First, you need to export the certificate from the keystore:

- -
keytool -keystore bookie.keystore.jks -alias localhost -certreq -file cert-file
-
-
- -

Then sign it with the CA:

- -
openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-file -out cert-signed -days {validity} -CAcreateserial -passin pass:{ca-password}
-
-
- -

Finally, you need to import both the certificate of the CA and the signed certificate into the keystore:

- -
keytool -keystore bookie.keystore.jks -alias CARoot -import -file ca-cert
keytool -keystore bookie.keystore.jks -alias localhost -import -file cert-signed
-
-
- -

The definitions of the parameters are the following:

- -
  1. keystore: the location of the keystore
  2. ca-cert: the certificate of the CA
  3. ca-key: the private key of the CA
  4. ca-password: the passphrase of the CA
  5. cert-file: the exported, unsigned certificate of the bookie
  6. cert-signed: the signed certificate of the bookie
- -
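To see the signing step end to end without a keystore at hand, the sketch below runs it in a scratch directory and then checks the result with `openssl verify`. It is an illustration under stated assumptions: the CA subject, the bookie host name, the passphrase, and the validity are made-up values, and an openssl-generated signing request stands in for the file that `keytool -certreq` would export.

```shell
#!/usr/bin/env bash
set -eu

work="$(mktemp -d)"        # scratch directory; file names mirror the parameters above
ca_password="change-me"    # hypothetical CA passphrase
validity=365               # hypothetical validity in days

# Throwaway CA, created non-interactively via -subj and -passout.
openssl req -new -x509 -keyout "$work/ca-key" -out "$work/ca-cert" -days 365 \
  -subj "/CN=Example BookKeeper CA" -passout "pass:${ca_password}"

# Stand-in for the keytool -certreq export: a key pair plus signing request.
openssl req -new -newkey rsa:2048 -nodes \
  -keyout "$work/bookie-key" -out "$work/cert-file" \
  -subj "/CN=bookie1.example.com"

# The signing command from the text, with the placeholders filled in.
openssl x509 -req -CA "$work/ca-cert" -CAkey "$work/ca-key" \
  -in "$work/cert-file" -out "$work/cert-signed" \
  -days "$validity" -CAcreateserial -passin "pass:${ca_password}"

# Confirm that the signed certificate chains back to the CA.
if openssl verify -CAfile "$work/ca-cert" "$work/cert-signed" >/dev/null 2>&1; then
  chain_ok=yes
else
  chain_ok=no
fi
echo "chain verified: $chain_ok"
```

Running the same `openssl verify` check against a real `cert-signed` file before importing it into the keystore is a cheap way to catch a certificate signed with the wrong CA.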

(TBD: add a script to automatically generate truststores and keystores.)

- -

Configuring Bookies

- -

Bookies support TLS for connections on the same service port. To enable TLS, you need to set tlsProvider to either JDK or OpenSSL. If OpenSSL is configured, it will use netty-tcnative-boringssl-static, which loads the binding that corresponds to the platform the bookies run on.

- -
-

The current OpenSSL implementation doesn’t depend on the OpenSSL library installed on the system. If you want to leverage the OpenSSL installed on the system, you can check this example on how to replace the JARs on the classpath with netty bindings that use the installed OpenSSL.

-
- -

The following TLS configs are needed on the bookie side:

- -
tlsProvider=OpenSSL
# key store
tlsKeyStoreType=JKS
tlsKeyStore=/var/private/tls/bookie.keystore.jks
tlsKeyStorePasswordPath=/var/private/tls/bookie.keystore.passwd
# trust store
tlsTrustStoreType=JKS
tlsTrustStore=/var/private/tls/bookie.truststore.jks
tlsTrustStorePasswordPath=/var/private/tls/bookie.truststore.passwd
-
-
- -

NOTE: it is important to restrict access to the store files and corresponding password files via filesystem permissions.
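One way to apply such restrictions is sketched below. A temporary directory stands in for `/var/private/tls`, and the password value is a placeholder; the file is written without a trailing newline on the assumption that its content is used verbatim as the password.

```shell
#!/usr/bin/env bash
set -eu

tls_dir="$(mktemp -d)"                          # stand-in for /var/private/tls
passwd_file="$tls_dir/bookie.keystore.passwd"

# Create the password file inside a subshell so the tighter umask does not leak out.
( umask 077; printf '%s' "change-me" > "$passwd_file" )   # hypothetical password

chmod 700 "$tls_dir"       # only the owner may list or enter the directory
chmod 600 "$passwd_file"   # only the owner may read or write the file

ls -ld "$tls_dir" "$passwd_file"
```

On a real host the directory and files would additionally be owned by the account that runs the bookie, and the same modes would apply to the keystore and truststore files themselves.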

- -

Optional settings that are worth considering:

- -
  1. tlsClientAuthentication=false: Enable/disable using TLS for authentication. When enabled, this config authenticates the other end of the communication channel. It should be enabled on both bookies and clients for mutual TLS.
  2. tlsEnabledCipherSuites=: A cipher suite is a named combination of authentication, encryption, MAC and key exchange algorithms used to negotiate the security settings for a network connection using the TLS network protocol. By default, it is null. See OpenSSL Ciphers and JDK Ciphers.
  3. tlsEnabledProtocols=TLSv1.2,TLSv1.1,TLSv1: the TLS protocols that you are going to accept from clients. By default, it is not set.
- -

To verify that the bookie’s keystore and truststore are set up correctly, you can run the following command:

- -
openssl s_client -debug -connect localhost:3181 -tls1
-
-
- -

NOTE: TLSv1 should be listed under tlsEnabledProtocols.

- -

In the output of this command you should see the server’s certificate:

- -
-----BEGIN CERTIFICATE-----
{variable sized random bytes}
-----END CERTIFICATE-----
-
-
- -

If the certificate does not show up, or if there are any other error messages, then your keystore is not set up correctly.

- -
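The manual check can be wrapped in a small script, sketched here with hypothetical helper names. It omits `-tls1` so that the protocol version is negotiated, and it only tests whether a PEM certificate block appears in the `s_client` output, not whether the certificate is the expected one.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# Succeeds if standard input contains the start of a PEM certificate block.
has_pem_certificate() {
  grep -q -e '-----BEGIN CERTIFICATE-----'
}

# Connects to a bookie and reports whether it presented a certificate.
# Arguments: <host> <port>
check_bookie_tls() {
  local host="$1" port="$2"
  # Reading from /dev/null makes s_client exit after the handshake.
  openssl s_client -connect "${host}:${port}" </dev/null 2>/dev/null | has_pem_certificate
}
```

For example, `check_bookie_tls localhost 3181 && echo "certificate presented"` mirrors the command shown above.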

Configuring Clients

- -

TLS is supported only for the new BookKeeper client (BookKeeper versions 4.5.0 and higher); older clients are not supported. The configs for TLS are the same as for bookies.

- -

If client authentication is not required by the bookies, the following is a minimal configuration example:

- -
tlsProvider=OpenSSL
clientTrustStore=/var/private/tls/client.truststore.jks
clientTrustStorePasswordPath=/var/private/tls/client.truststore.passwd
-
-
- -

If client authentication is required, then a keystore must be created for each client, and the bookies’ truststores must trust the certificate in the client’s keystore. This may be done using commands that are similar to what we used for the bookie keystore.

- -

And the following must also be configured:

- -
tlsClientAuthentication=true
clientKeyStore=/var/private/tls/client.keystore.jks
clientKeyStorePasswordPath=/var/private/tls/client.keystore.passwd
-
-
- -

NOTE: it is important to restrict access to the store files and corresponding password files via filesystem permissions.

- -

(TBD: add example to use tls in bin/bookkeeper script?)

- -

Enabling TLS Logging

- -

You can enable TLS debug logging at the JVM level by starting the bookies and/or clients with the javax.net.debug system property set. For example:

- -
-Djavax.net.debug=all
-
-
- -

You can find more details on this in the Oracle documentation on debugging SSL/TLS connections.

- -
- - - - -
-
- - -
-
- - - -
-

An entry is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Entries are also known as records.

- -
- -
-

A ledger is a sequence of entries written to BookKeeper. Entries are written sequentially to ledgers and at most once, giving ledgers append-only semantics.

- -
- -
-

A bookie is an individual BookKeeper storage server.

- -

Bookies store the content of ledgers and act as a distributed ensemble.

- -
- -
-

A subsystem that runs in the background on bookies to ensure that ledgers are fully replicated even if one bookie from the ensemble is down.

- -
- -
-

Striping is the process of distributing BookKeeper ledgers to sub-groups of bookies rather than to all bookies in a BookKeeper ensemble.

- -

Striping is essential to ensuring fast performance.

- -
- -
-

A journal file stores BookKeeper transaction logs.

- -
- -
-

When a reader forces a ledger to close, preventing any further entries from being written to the ledger.

- -
- -
-

A record is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Records are also known as entries.

- -
- - - - -
- - - - - - - - diff --git a/content/docs/security/zookeeper/index.html b/content/docs/security/zookeeper/index.html deleted file mode 100644 index e2a86f1..0000000 --- a/content/docs/security/zookeeper/index.html +++ /dev/null @@ -1,614 +0,0 @@ - - - - Apache BookKeeper - ZooKeeper Authentication - - - - - - - - - - - - - - - - - -
- - - - -
-
-
-
- - - - -
-
- -
-
- - - -
- -
- -
-
-

New Clusters

- -

To enable ZooKeeper authentication on Bookies or Clients, there are two necessary steps:

- -
  1. Create a JAAS login file and set the appropriate system property to point to it as described in GSSAPI (Kerberos).
  2. Set the configuration property zkEnableSecurity in each bookie to true.
- -
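The two steps above might look roughly like this for a Kerberos setup. The keytab path, principal, and file locations are placeholders, and temporary files stand in for the real JAAS file and bookie config. `Client` is the JAAS section name that the ZooKeeper client looks up by default, and `java.security.auth.login.config` is the standard JVM property for locating a JAAS file; how the JVM option reaches the bookie process depends on the deployment and is not shown.

```shell
#!/usr/bin/env bash
set -eu

jaas_file="$(mktemp)"     # stand-in for the real JAAS login file
bookie_conf="$(mktemp)"   # stand-in for the bookie configuration file

# Step 1: a JAAS login file with a section for the ZooKeeper client.
cat > "$jaas_file" <<'EOF'
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  storeKey=true
  useTicketCache=false
  keyTab="/etc/security/keytabs/bookie.keytab"
  principal="bookkeeper/bookie1.example.com@EXAMPLE.COM";
};
EOF

# JVM option that points the bookie (or client) JVM at the login file.
jvm_opts="-Djava.security.auth.login.config=${jaas_file}"

# Step 2: enable secure ACLs in the bookie configuration.
echo "zkEnableSecurity=true" >> "$bookie_conf"

echo "$jvm_opts"
```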

The metadata stored in ZooKeeper is such that only certain clients will be able to modify and read the corresponding znodes. The rationale behind this decision is that the data stored in ZooKeeper is not sensitive, but inappropriate manipulation of znodes can cause cluster disruption.

- -

Migrating Clusters

- -

If you are running a version of BookKeeper that does not support security or simply with security disabled, and you want to make the cluster secure, then you need to execute the following steps to enable ZooKeeper authentication with minimal disruption to your operations.

- -
  1. Perform a rolling restart setting the JAAS login file, which enables bookies or clients to authenticate. At the end of the rolling restart, bookies (or clients) are able to manipulate znodes with strict ACLs, but they will not create znodes with those ACLs.
  2. Perform a second rolling restart of bookies, this time setting the configuration parameter zkEnableSecurity to true, which enables the use of secure ACLs when creating znodes.
  3. Currently we do not provide a tool to set ACLs on old znodes. We recommend setting them manually using ZooKeeper tools.
- -

It is also possible to turn off authentication in a secured cluster. To do it, follow these steps:

- -
  1. Perform a rolling restart of bookies setting the JAAS login file, which enables bookies to authenticate, but setting zkEnableSecurity to false. At the end of the rolling restart, bookies stop creating znodes with secure ACLs, but are still able to authenticate and manipulate all znodes.
  2. You can use ZooKeeper tools to manually reset all ACLs under the znode set in zkLedgersRootPath, which defaults to /ledgers.
  3. Perform a second rolling restart of bookies, this time omitting the system property that sets the JAAS login file.
- -
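For the manual ACL reset in the second step, a dry-run helper along these lines can prepare the commands for review. The function name is hypothetical; `setAcl` and the `world:anyone:cdrwa` ACL (all permissions for everyone) are standard ZooKeeper CLI syntax. The list of znode paths has to be supplied by the operator, since a plain `setAcl` changes only the znode it names.

```shell
#!/usr/bin/env bash
# Sketch only: the helper name is hypothetical.

# Emit one ZooKeeper CLI "setAcl" command per znode path read from stdin.
open_acl_commands() {
  local path
  while IFS= read -r path; do
    # Skip blank lines so they do not produce malformed commands.
    [ -n "$path" ] && printf 'setAcl %s world:anyone:cdrwa\n' "$path"
  done
  return 0
}

# Example: print the command for the default ledgers root path.
printf '%s\n' /ledgers | open_acl_commands
```

Each printed line can then be run through ZooKeeper's `zkCli.sh` against the ensemble, for example `zkCli.sh -server zk1.example.com:2181 setAcl /ledgers world:anyone:cdrwa`, where the server address is a placeholder.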

Migrating the ZooKeeper ensemble

- -

It is also necessary to enable authentication on the ZooKeeper ensemble. To do it, we need to perform a rolling restart of the ensemble and set a few properties. Please refer to the ZooKeeper documentation for more details.

- -
    -
  1. Apache ZooKeeper Documentation
  2. -
  3. Apache ZooKeeper Wiki
  4. -
- -
- - - - -
-
- -
- - - - - - -
-
-
- - - -
-

An entry is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Entries are also known as records.

- -
- -
-

A ledger is a sequence of entries written to BookKeeper. Entries are written sequentially to ledgers and at most once, giving ledgers append-only semantics.

- -
- -
-

A bookie is an individual BookKeeper storage server.

- -

Bookies store the content of ledgers and act as a distributed ensemble.

- -
- -
-

A subsystem that runs in the background on bookies to ensure that ledgers are fully replicated even if one bookie from the ensemble is down.

- -
- -
-

Striping is the process of distributing BookKeeper ledgers to sub-groups of bookies rather than to all bookies in a BookKeeper ensemble.

- -

Striping is essential to ensuring fast performance.

- -
- -
-

A journal file stores BookKeeper transaction logs.

- -
- -
-

When a reader forces a ledger to close, preventing any further entries from being written to the ledger.

- -
- -
-

A record is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Records are also known as entries.

- -
- - - - -
- - - - - - - - -- To stop receiving notification emails like this one, please contact ['"commits@bookkeeper.apache.org" '].