Subject: svn commit: r821505 [1/2] - in /websites/staging/zookeeper/trunk/content: ./ bookkeeper/ bookkeeper/docs/r4.0.0/ bookkeeper/docs/r4.1.0/ bookkeeper/docs/trunk/
Date: Tue, 12 Jun 2012 20:47:33 -0000
To: commits@zookeeper.apache.org
From: buildbot@apache.org
Message-Id: <20120612204736.56E5F23889BB@eris.apache.org>

Author: buildbot
Date: Tue Jun 12 20:47:31 2012
New Revision: 821505

Log:
Staging update by buildbot for zookeeper

Modified: websites/staging/zookeeper/trunk/content/
(props changed)
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookieRecovery.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperConfig.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperConfigParams.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperInternals.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperOverview.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperProgrammer.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperStarted.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperStream.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/doc.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/hedwigBuild.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/hedwigDesign.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/hedwigUser.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/index.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/releaseNotes.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookieConfigParams.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookieRecovery.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperConfig.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperConfigParams.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperInternals.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperJMX.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperOverview.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperProgrammer.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperStarted.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperStream.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/doc.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigBuild.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigConsole.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigDesign.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigJMX.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigUser.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/index.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/releaseNotes.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookieConfigParams.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookieRecovery.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperConfig.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperConfigParams.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperInternals.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperJMX.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperOverview.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperProgrammer.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperStarted.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperStream.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/doc.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/hedwigBuild.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/hedwigConsole.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/hedwigDesign.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/hedwigJMX.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/hedwigUser.html
websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/index.html
websites/staging/zookeeper/trunk/content/bookkeeper/lists.html
websites/staging/zookeeper/trunk/content/bylaws.html
websites/staging/zookeeper/trunk/content/credits.html
websites/staging/zookeeper/trunk/content/lists.html

Propchange: websites/staging/zookeeper/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Tue Jun 12 20:47:31 2012
@@ -1 +1 @@
-1349449
+1349515

Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookieRecovery.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookieRecovery.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookieRecovery.html Tue Jun 12 20:47:31 2012 @@ -115,6 +115,7 @@

Documentation

Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperConfig.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperConfig.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperConfig.html Tue Jun 12 20:47:31 2012 @@ -129,6 +129,7 @@

Documentation

Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperConfigParams.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperConfigParams.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperConfigParams.html Tue Jun 12 20:47:31 2012 @@ -63,7 +63,7 @@

NIO server settings

-
serverTcpNoDelayThis settings is used to enabled/disabled Nagle's algorithm, which is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. If you are sending many small messages, such that more than one can fit in a single IP packet, setting server.tcpnodelay to false to enable Nagle algorithm can provide better performance. Default value is true.
+
serverTcpNoDelayThis settings is used to enabled/disabled Nagle's algorithm, which is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. If you are sending many small messages, such that more than one can fit in a single IP packet, setting server.tcpnodelay to false to enable Nagle algorithm can provide better performance. Default value is true.
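Editorial illustration, not part of the committed page: the relationship between a "no delay" flag and Nagle's algorithm is easy to read backwards, so here is a minimal sketch using the standard `java.net.Socket` API. It does not use BookKeeper code; the class and method names are hypothetical. Setting the flag to true turns Nagle's algorithm off, and setting it to false turns it on.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

final class TcpNoDelaySketch {
    /**
     * Creates an unconnected socket, applies the flag, and reports whether
     * TCP_NODELAY is set on it (true means Nagle's algorithm is disabled).
     */
    static boolean nagleDisabled(boolean tcpNoDelay) {
        try (Socket socket = new Socket()) {       // not connected to anything
            socket.setTcpNoDelay(tcpNoDelay);      // true: send small writes immediately
            return socket.getTcpNoDelay();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flag=true,  Nagle disabled: " + nagleDisabled(true));
        System.out.println("flag=false, Nagle disabled: " + nagleDisabled(false));
    }
}
```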

Ledger cache settings

@@ -105,6 +105,7 @@

Documentation

Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperInternals.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperInternals.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperInternals.html Tue Jun 12 20:47:31 2012 @@ -78,7 +78,7 @@

You may use following settings to further fine tune the behavior of journalling on bookies:

-
journalMaxSizeMBjournal file size limitation. when a journal reaches this limitation, it will be closed and new journal file be created.
journalMaxBackupshow many old journal files whose id is less than LastLogMark 's journal id.
+
journalMaxSizeMBjournal file size limitation. when a journal reaches this limitation, it will be closed and new journal file be created.
journalMaxBackupshow many old journal files whose id is less than LastLogMark 's journal id.

NOTE: keeping number of old journal files would be useful for manually recovery in special case.
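Editorial illustration, not part of the committed page: one plausible reading of the journalMaxBackups description is that journal files older than the one referenced by LastLogMark are candidates for removal, and the newest journalMaxBackups of those candidates are retained. The sketch below encodes that reading only; the class and method names are hypothetical and this is not the bookie's actual code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class JournalGcSketch {
    /**
     * Returns the journal ids that may be deleted: ids lower than the journal id
     * recorded in LastLogMark, minus the newest journalMaxBackups of them.
     */
    static List<Long> journalsToDelete(List<Long> journalIds,
                                       long lastLogMarkJournalId,
                                       int journalMaxBackups) {
        List<Long> old = new ArrayList<>();
        for (long id : journalIds) {
            if (id < lastLogMarkJournalId) {
                old.add(id);                       // fully covered by LastLogMark
            }
        }
        Collections.sort(old);                     // oldest first
        int deletable = Math.max(0, old.size() - journalMaxBackups);
        return new ArrayList<>(old.subList(0, deletable));
    }

    public static void main(String[] args) {
        List<Long> ids = List.of(1L, 2L, 3L, 4L, 5L, 6L);
        // Journals 1..4 are older than journal 5; keep the two newest of them.
        System.out.println(journalsToDelete(ids, 5L, 2));
    }
}
```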

@@ -92,7 +92,7 @@

Flat Ledger Manager

-

All ledgers' metadata are put in a single zookeeper path, created using zookeeper sequential node, which can ensure uniqueness of ledger id. Each ledger node is prefixed with 'L'.

+

All ledgers' metadata are put in a single zookeeper path, created using zookeeper sequential node, which can ensure uniqueness of ledger id. Each ledger node is prefixed with 'L'.

Bookie server manages its owned active ledgers in a hash map. So it is easy for bookie server to find what ledgers are deleted from zookeeper and garbage collect them. And its garbage collection flow is described as below:

@@ -165,6 +165,7 @@

Documentation

Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperOverview.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperOverview.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperOverview.html Tue Jun 12 20:47:31 2012 @@ -109,7 +109,7 @@

p. A simple use of BooKeeper is to implement a write-ahead transaction log. A server maintains an in-memory data structure (with periodic snapshots for example) and logs changes to that structure before it applies the change. The application server creates a ledger at startup and store the ledger id and password in a well known place (ZooKeeper maybe). When it needs to make a change, the server adds an entry with the change information to a ledger and apply the change when BookKeeper adds the entry successfully. The server can even use asyncAddEntry to queue up many changes for high change throughput. BooKeeper meticulously logs the changes in order and call the completion functions in order.

-

When the application server dies, a backup server will come online, get the last snapshot and then it will open the ledger of the old server and read all the entries from the time the snapshot was taken. (Since it doesn't know the last entry number it will use MAX_INTEGER). Once all the entries have been processed, it will close the ledger and start a new one for its use.

+

When the application server dies, a backup server will come online, get the last snapshot and then it will open the ledger of the old server and read all the entries from the time the snapshot was taken. (Since it doesn't know the last entry number it will use MAX_INTEGER). Once all the entries have been processed, it will close the ledger and start a new one for its use.
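Editorial illustration, not part of the committed page: the write-ahead pattern described above (log the change, apply it, and let a backup replay everything after the last snapshot) can be sketched without any BookKeeper dependency. `InMemoryLedger` is a hypothetical in-memory stand-in for a ledger, and every other name here is likewise an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WriteAheadLogSketch {
    /** Hypothetical stand-in for a ledger: an append-only list of entries. */
    static final class InMemoryLedger {
        private final List<String[]> entries = new ArrayList<>();

        void addEntry(String key, String value) {
            entries.add(new String[] {key, value});
        }

        int size() {
            return entries.size();
        }

        /** Returns every entry whose id is at least firstEntryId. */
        List<String[]> readFrom(int firstEntryId) {
            return new ArrayList<>(entries.subList(firstEntryId, entries.size()));
        }
    }

    /** A copy of the state plus the id of the first entry it does not cover. */
    static final class Snapshot {
        final Map<String, String> state;
        final int nextEntryId;

        Snapshot(Map<String, String> state, int nextEntryId) {
            this.state = state;
            this.nextEntryId = nextEntryId;
        }
    }

    static final class Server {
        final Map<String, String> state = new HashMap<>();
        final InMemoryLedger ledger;

        Server(InMemoryLedger ledger) {
            this.ledger = ledger;
        }

        void change(String key, String value) {
            ledger.addEntry(key, value);   // log the change first
            state.put(key, value);         // apply it only after it is logged
        }

        Snapshot snapshot() {
            return new Snapshot(new HashMap<>(state), ledger.size());
        }
    }

    /** What a backup does: start from the snapshot, then replay later entries in order. */
    static Map<String, String> recover(Snapshot snapshot, InMemoryLedger ledger) {
        Map<String, String> state = new HashMap<>(snapshot.state);
        for (String[] entry : ledger.readFrom(snapshot.nextEntryId)) {
            state.put(entry[0], entry[1]);
        }
        return state;
    }

    static Map<String, String> demo() {
        InMemoryLedger ledger = new InMemoryLedger();
        Server primary = new Server(ledger);
        primary.change("a", "1");
        Snapshot snapshot = primary.snapshot();
        primary.change("b", "2");          // these two changes exist only in the log
        primary.change("a", "3");
        return recover(snapshot, ledger);  // the primary is assumed to have died here
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```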

A client library takes care of communicating with bookies and managing entry numbers. An entry has the following fields:

@@ -158,7 +158,7 @@ p. A simple use of BooKeeper is to imple

If the ledger was closed gracefully, ZooKeeper will have the last entry and everything will work well. But, if the BookKeeper client that was writing the ledger dies, there is some recovery that needs to take place.

-

The problematic entries are the ones at the end of the ledger. There can be entries in flight when a BookKeeper client dies. If the entry only gets to one bookie, the entry should not be readable since the entry will disappear if that bookie fails. If the entry is only on one bookie, that doesn't mean that the entry has not been recorded successfully; the other bookies that recorded the entry might have failed.

+

The problematic entries are the ones at the end of the ledger. There can be entries in flight when a BookKeeper client dies. If the entry only gets to one bookie, the entry should not be readable since the entry will disappear if that bookie fails. If the entry is only on one bookie, that doesn't mean that the entry has not been recorded successfully; the other bookies that recorded the entry might have failed.

The trick to making everything work is to have a correct idea of a last entry. We do it in roughly three steps:

@@ -200,7 +200,7 @@ p. A simple use of BooKeeper is to imple
  -
  • For performance reasons, Entry Log buffers entries in memory and commit them in batches, while Ledger Cache holds index pages in memory and flushes them lazily. We will discuss data flush and how to ensure data integrity in the following section 'Data Flush'.
  +
  • For performance reasons, Entry Log buffers entries in memory and commit them in batches, while Ledger Cache holds index pages in memory and flushes them lazily. We will discuss data flush and how to ensure data integrity in the following section 'Data Flush'.

Data Flush

@@ -223,7 +223,7 @@ p. A simple use of BooKeeper is to imple
  • Persists LastLogMark to disk, which means entries added before LastLogMark whose entry data and index page were also persisted to disk. It is the time to safely remove journal files created earlier than txnLogId.
    -
    1. If the bookie has crashed before persisting LastLogMark to disk, it still has journal files containing entries for which index pages may not have been persisted. Consequently, when this bookie restarts, it inspects journal files to restore those entries; data isn't lost.
    +
    1. If the bookie has crashed before persisting LastLogMark to disk, it still has journal files containing entries for which index pages may not have been persisted. Consequently, when this bookie restarts, it inspects journal files to restore those entries; data isn't lost.
@@ -265,6 +265,7 @@ p. A simple use of BooKeeper is to imple

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperProgrammer.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperProgrammer.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperProgrammer.html Tue Jun 12 20:47:31 2012 @@ -203,6 +203,7 @@ while (entries.hasMoreElements()) {

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperStarted.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperStarted.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperStarted.html Tue Jun 12 20:47:31 2012 @@ -59,7 +59,7 @@

    Getting Started: Setting up BookKeeper to write logs.

    -

    This document contains information to get you started quickly with BookKeeper. It is aimed primarily at developers willing to try it out, and contains simple installation instructions for a simple BookKeeper installation and a simple programming example. For further programming detail, please refer to BookKeeper Programmer's Guide.

    +

    This document contains information to get you started quickly with BookKeeper. It is aimed primarily at developers willing to try it out, and contains simple installation instructions for a simple BookKeeper installation and a simple programming example. For further programming detail, please refer to BookKeeper Programmer's Guide.

    Pre-requisites

    @@ -79,7 +79,7 @@

    Setting up bookies

    -

    If you're bold and you want more than just running things locally, then you'll need to run bookies in different servers. You'll need at least three bookies to start with.

    +

    If you're bold and you want more than just running things locally, then you'll need to run bookies in different servers. You'll need at least three bookies to start with.

    For each bookie, we need to execute a command like the following:

    @@ -180,6 +180,7 @@ bkc.close();

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperStream.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperStream.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/bookkeeperStream.html Tue Jun 12 20:47:31 2012 @@ -215,6 +215,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/doc.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/doc.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/doc.html Tue Jun 12 20:47:31 2012 @@ -53,12 +53,12 @@
    -

    In the documentation directory, you'll find:

    +

    In the documentation directory, you'll find:

    • build.txt: Building Hedwig, or how to set up Hedwig
    -
    • user.txt: User's Guide, or how to program against the Hedwig API and how to run it
    -
    • dev.txt: Developer's Guide, or Hedwig internals and hacking details
    +
    • user.txt: User's Guide, or how to program against the Hedwig API and how to run it
    +
    • dev.txt: Developer's Guide, or Hedwig internals and hacking details

    These documents are all written in the Pandoc dialect of Markdown. This makes them readable as plain text files, but also capable of generating HTML or LaTeX documentation.

    @@ -97,6 +97,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/hedwigBuild.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/hedwigBuild.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/hedwigBuild.html Tue Jun 12 20:47:31 2012 @@ -68,7 +68,7 @@

    From the top level bookkeeper directory, run mvn package. This will compile and package the jars necessary for running hedwig.

    -

    See the User's Guide for instructions on running and usage.

    +

    See the User's Guide for instructions on running and usage.

    Eclipse Instructions

    @@ -76,13 +76,13 @@
    1. Install the Subclipse plugin. Update site: http://subclipse.tigris.org/update_1.4.x.
    -
    2. Install the Maven plugin. Update site: http://m2eclipse.sonatype.org/update. From the list of packages available from this site, select everything under the "Maven Integration" category, and from the optional components select the ones with the word "SCM" in them.
    -
    3. Go to Preferences > Team > SVN. For the SVN interface, choose "Pure Java".
    +
    2. Install the Maven plugin. Update site: http://m2eclipse.sonatype.org/update. From the list of packages available from this site, select everything under the "Maven Integration" category, and from the optional components select the ones with the word "SCM" in them.
    +
    3. Go to Preferences > Team > SVN. For the SVN interface, choose "Pure Java".
    4. Choose File > New > Project... > Maven > Checkout Maven Projects from SCM.
    5. For the SCM URL type, choose SVN. For the URL, enter SVN URL. Maven will automatically create a top-level Eclipse project for each of the 4 Maven modules (recommended). If you want fewer top-level projects, uncheck the option of having a project for each module (under Advanced).
    -

    You are now ready to run and debug the client and server code. See the User's Guide for instructions on running and usage.

    +

    You are now ready to run and debug the client and server code. See the User's Guide for instructions on running and usage.

    @@ -116,6 +116,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/hedwigDesign.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/hedwigDesign.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/hedwigDesign.html Tue Jun 12 20:47:31 2012 @@ -63,11 +63,11 @@

    Netty Notes

    -

    The asynchronous network IO infrastructure that Hedwig uses is Netty. Here are some notes on Netty's concurrency architecture and its filter pipeline design.

    +

    The asynchronous network IO infrastructure that Hedwig uses is Netty. Here are some notes on Netty's concurrency architecture and its filter pipeline design.

    Concurrency Architecture

    -

    After calling ServerBootstrap.bind(), Netty starts a boss thread (NioServerSocketPipelineSink.Boss) that just accepts new connections and registers them with one of the workers from the NioWorker pool in round-robin fashion (pool size defaults to CPU count). Each worker runs its own select loop over just the set of keys that have been registered with it. Workers start lazily on demand and run only so long as there are interested fd's/keys. All selected events are handled in the same thread and sent up the pipeline attached to the channel (this association is established by the boss as soon as a new connection is accepted).

    +

    After calling ServerBootstrap.bind(), Netty starts a boss thread (NioServerSocketPipelineSink.Boss) that just accepts new connections and registers them with one of the workers from the NioWorker pool in round-robin fashion (pool size defaults to CPU count). Each worker runs its own select loop over just the set of keys that have been registered with it. Workers start lazily on demand and run only so long as there are interested fd's/keys. All selected events are handled in the same thread and sent up the pipeline attached to the channel (this association is established by the boss as soon as a new connection is accepted).

    All workers, and the boss, run via the executor thread pool; hence, the executor must support at least two simultaneous threads.

    @@ -90,9 +90,9 @@

    ReadAhead Cache

    -

    The delivery manager class is responsible for pushing published messages from the hubs to the subscribers. The most common case is that all subscribers are connected and either caught up, or close to the tail end of the topic. In this case, we don't want the delivery manager to be polling bookkeeper for any newly arrived messages on the topic; new messages should just be pushed to the delivery manager. However, there is also the uncommon case when a subscriber is behind, and messages must be pulled from Bookkeeper.

    +

    The delivery manager class is responsible for pushing published messages from the hubs to the subscribers. The most common case is that all subscribers are connected and either caught up, or close to the tail end of the topic. In this case, we don't want the delivery manager to be polling bookkeeper for any newly arrived messages on the topic; new messages should just be pushed to the delivery manager. However, there is also the uncommon case when a subscriber is behind, and messages must be pulled from Bookkeeper.

    -

    Since all publishes go through the hub, it is possible to cache the recently published messages in the hub, and then the delivery manager won't have to make the trip to bookkeeper to get the messages but instead get them from local process memory.

    +

    Since all publishes go through the hub, it is possible to cache the recently published messages in the hub, and then the delivery manager won't have to make the trip to bookkeeper to get the messages but instead get them from local process memory.
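Editorial illustration, not part of the committed page: the idea of serving recently published messages from hub memory and falling back to persistent storage only for lagging subscribers can be sketched as a small bounded cache. This is not Hedwig's implementation; the class name, the eviction policy (oldest insertion first), and the `persistentStore` function that stands in for a BookKeeper read are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongFunction;

final class ReadAheadCacheSketch {
    private final Map<Long, String> recent;
    private final LongFunction<String> persistentStore;  // stand-in for a BookKeeper read
    private int storeReads = 0;

    ReadAheadCacheSketch(final int maxEntries, LongFunction<String> persistentStore) {
        this.persistentStore = persistentStore;
        // Insertion-ordered map that drops its oldest entry once it grows past maxEntries.
        this.recent = new LinkedHashMap<Long, String>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Called on the publish path: every publish goes through the hub, so cache it. */
    void onPublish(long seqId, String message) {
        recent.put(seqId, message);
    }

    /** Called on the delivery path: memory first, persistent store only on a miss. */
    String read(long seqId) {
        String cached = recent.get(seqId);
        if (cached != null) {
            return cached;
        }
        storeReads++;
        return persistentStore.apply(seqId);
    }

    int storeReads() {
        return storeReads;
    }

    public static void main(String[] args) {
        ReadAheadCacheSketch cache = new ReadAheadCacheSketch(2, id -> "stored-" + id);
        cache.onPublish(1, "m1");
        cache.onPublish(2, "m2");
        cache.onPublish(3, "m3");                 // evicts message 1 from memory
        System.out.println(cache.read(3));        // caught-up subscriber: served from memory
        System.out.println(cache.read(1));        // lagging subscriber: served from the store
        System.out.println("store reads: " + cache.storeReads());
    }
}
```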

    These ideas of push, pull, and caching are unified in the following way: - A hub has a cache of messages

    @@ -150,6 +150,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/hedwigUser.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/hedwigUser.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/hedwigUser.html Tue Jun 12 20:47:31 2012 @@ -55,7 +55,7 @@

    Design

    -

    In Hedwig, clients publish messages associated with a topic, and they subscribe to a topic to receive all messages published with that topic. Clients are associated with (publish to and subscribe from) a Hedwig instance (also referred to as a region), which consists of a number of servers called hubs. The hubs partition up topic ownership among themselves, and all publishes and subscribes to a topic must be done to its owning hub. When a client doesn't know the owning hub, it tries a default hub, which may redirect the client.

    +

    In Hedwig, clients publish messages associated with a topic, and they subscribe to a topic to receive all messages published with that topic. Clients are associated with (publish to and subscribe from) a Hedwig instance (also referred to as a region), which consists of a number of servers called hubs. The hubs partition up topic ownership among themselves, and all publishes and subscribes to a topic must be done to its owning hub. When a client doesn't know the owning hub, it tries a default hub, which may redirect the client.

    Running a Hedwig instance requires a Zookeeper server and at least three Bookkeeper servers.

    @@ -65,7 +65,7 @@

    Topics are independent; Hedwig provides no ordering across different topics.

    -

    Version vectors are associated with each topic and serve as the identifiers for each message. Vectors consist of one component per region. A component value is the region's local sequence number on the topic, and is incremented each time a hub persists a message (published either locally or remotely) to BK.

    +

    Version vectors are associated with each topic and serve as the identifiers for each message. Vectors consist of one component per region. A component value is the region's local sequence number on the topic, and is incremented each time a hub persists a message (published either locally or remotely) to BK.

    TODO: More on how version vectors are to be used, and on maintaining vector-maxes.

    @@ -79,7 +79,7 @@

    Limits

    -

    Because the current implementation uses a single socket per subscription, the Hedwig requires a high ulimit on the number of open file descriptors. Non-root users can only use up to the limit specified in /etc/security/limits.conf; to raise this to 1024^2, as root, modify the "nofile" line in /etc/security/limits.conf on all hubs.

    +

    Because the current implementation uses a single socket per subscription, the Hedwig requires a high ulimit on the number of open file descriptors. Non-root users can only use up to the limit specified in /etc/security/limits.conf; to raise this to 1024^2, as root, modify the "nofile" line in /etc/security/limits.conf on all hubs.

    Running Servers

    @@ -136,6 +136,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/index.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/index.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/index.html Tue Jun 12 20:47:31 2012 @@ -63,7 +63,7 @@
    @@ -111,6 +111,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/releaseNotes.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/releaseNotes.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.0.0/releaseNotes.html Tue Jun 12 20:47:31 2012 @@ -258,6 +258,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookieConfigParams.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookieConfigParams.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookieConfigParams.html Tue Jun 12 20:47:31 2012 @@ -63,7 +63,7 @@

    NIO server settings

    -
    serverTcpNoDelayThis settings is used to enabled/disabled Nagle's algorithm, which is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. If you are sending many small messages, such that more than one can fit in a single IP packet, setting server.tcpnodelay to false to enable Nagle algorithm can provide better performance. Default value is true.
    +
    serverTcpNoDelayThis settings is used to enabled/disabled Nagle's algorithm, which is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. If you are sending many small messages, such that more than one can fit in a single IP packet, setting server.tcpnodelay to false to enable Nagle algorithm can provide better performance. Default value is true.

    Ledger cache settings

    @@ -109,6 +109,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookieRecovery.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookieRecovery.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookieRecovery.html Tue Jun 12 20:47:31 2012 @@ -115,6 +115,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperConfig.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperConfig.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperConfig.html Tue Jun 12 20:47:31 2012 @@ -115,7 +115,7 @@

    Missing disks or directories

    -

    Replacing disks or removing directories accidentally can cause a bookie to fail while trying to read a ledger fragment which the ledger metadata has claimed exists on the bookie. For this reason, when a bookie is started for the first time, its disk configuration is fixed for the lifetime of that bookie. Any change to the disk configuration of the bookie, such as a crashed disk or an accidental configuration change, will result in the bookie being unable to start with the following error:

    +

    Replacing disks or removing directories accidentally can cause a bookie to fail while trying to read a ledger fragment which the ledger metadata has claimed exists on the bookie. For this reason, when a bookie is started for the first time, its disk configuration is fixed for the lifetime of that bookie. Any change to the disk configuration of the bookie, such as a crashed disk or an accidental configuration change, will result in the bookie being unable to start with the following error:

    2012-05-29 18:19:13,790 - ERROR - [main:BookieServer@314] - Exception running bookie server :
    org.apache.bookkeeper.bookie.BookieException$InvalidCookieException
    @@ -177,6 +177,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperConfigParams.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperConfigParams.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperConfigParams.html Tue Jun 12 20:47:31 2012 @@ -63,7 +63,7 @@

    NIO server settings

    -
    clientTcpNoDelay: This setting is used to enable or disable Nagle's algorithm, which is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. If you are sending many small messages, such that more than one can fit in a single IP packet, setting server.tcpnodelay to false to enable Nagle's algorithm can provide better performance. The default value is true.
    +
    clientTcpNoDelay: This setting is used to enable or disable Nagle's algorithm, which is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. If you are sending many small messages, such that more than one can fit in a single IP packet, setting server.tcpnodelay to false to enable Nagle's algorithm can provide better performance. The default value is true.

    Ledger manager settings

    @@ -107,6 +107,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperInternals.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperInternals.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperInternals.html Tue Jun 12 20:47:31 2012 @@ -78,7 +78,7 @@

    You may use the following settings to further fine-tune the behavior of journalling on bookies:

    -
    journalMaxSizeMB: Journal file size limit. When a journal file reaches this limit, it is closed and a new journal file is created.
    journalMaxBackups: How many old journal files whose id is less than LastLogMark's journal id to keep.
    +
    journalMaxSizeMB: Journal file size limit. When a journal file reaches this limit, it is closed and a new journal file is created.
    journalMaxBackups: How many old journal files whose id is less than LastLogMark's journal id to keep.

    NOTE: Keeping a number of old journal files can be useful for manual recovery in special cases.
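One way to read the backup setting is sketched below in Python. It assumes journal files are identified by increasing integer ids and that the newest old journals are the ones retained; the function name and that retention order are assumptions made for illustration, not BookKeeper's actual code.

```python
def journals_to_delete(journal_ids, last_log_mark_id, journal_max_backups):
    """Return the ids of old journal files that may be deleted.

    "Old" means the id is less than the journal id recorded in LastLogMark.
    The newest `journal_max_backups` old journals are kept as backups
    (assumed retention order).
    """
    old = sorted(j for j in journal_ids if j < last_log_mark_id)
    keep = min(max(journal_max_backups, 0), len(old))
    return old[:len(old) - keep]
```

For example, with journals 1 to 5, a LastLogMark journal id of 4 and two backups, only journal 1 would be deleted under these assumptions.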

    @@ -92,7 +92,7 @@

    Flat Ledger Manager

    -

    All ledger metadata is stored under a single ZooKeeper path, in nodes created as ZooKeeper sequential nodes, which ensures the uniqueness of ledger ids. Each ledger node is prefixed with 'L'.

    +

    All ledger metadata is stored under a single ZooKeeper path, in nodes created as ZooKeeper sequential nodes, which ensures the uniqueness of ledger ids. Each ledger node is prefixed with 'L'.
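A small sketch of this naming scheme follows. It assumes the ledger id is the counter that ZooKeeper appends to a sequential node (a zero-padded 10-digit number); the helper names are hypothetical and only illustrate how a node name and a ledger id could map to each other.

```python
LEDGER_PREFIX = "L"  # prefix of every ledger node, as stated above


def ledger_node_name(sequence: int) -> str:
    """Build a node name the way a sequential create would: prefix + 10-digit counter."""
    return f"{LEDGER_PREFIX}{sequence:010d}"


def ledger_id_from_node(name: str) -> int:
    """Recover the ledger id from a ledger node name (assumed mapping)."""
    if not name.startswith(LEDGER_PREFIX):
        raise ValueError(f"not a ledger node: {name!r}")
    return int(name[len(LEDGER_PREFIX):])
```

Because ZooKeeper hands out the counter, two clients creating ledgers at the same time cannot end up with the same node name.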

    The bookie server keeps the active ledgers it owns in a hash map, so it is easy for the bookie server to find which ledgers have been deleted from ZooKeeper and garbage collect them. Its garbage collection flow is described below:

    @@ -165,6 +165,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperJMX.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperJMX.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperJMX.html Tue Jun 12 20:47:31 2012 @@ -101,6 +101,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperOverview.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperOverview.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperOverview.html Tue Jun 12 20:47:31 2012 @@ -109,7 +109,7 @@

    A simple use of BookKeeper is to implement a write-ahead transaction log. A server maintains an in-memory data structure (with periodic snapshots, for example) and logs changes to that structure before it applies them. The application server creates a ledger at startup and stores the ledger id and password in a well-known place (ZooKeeper, maybe). When it needs to make a change, the server adds an entry with the change information to a ledger and applies the change once BookKeeper has added the entry successfully. The server can even use asyncAddEntry to queue up many changes for high change throughput. BookKeeper meticulously logs the changes in order and calls the completion functions in order.

    -

    When the application server dies, a backup server will come online, get the last snapshot and then it will open the ledger of the old server and read all the entries from the time the snapshot was taken. (Since it doesn't know the last entry number it will use MAX_INTEGER). Once all the entries have been processed, it will close the ledger and start a new one for its use.

    +

    When the application server dies, a backup server will come online, get the last snapshot and then it will open the ledger of the old server and read all the entries from the time the snapshot was taken. (Since it doesn't know the last entry number it will use MAX_INTEGER). Once all the entries have been processed, it will close the ledger and start a new one for its use.
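The write-ahead pattern and the takeover by a backup server can be sketched as follows. This is a toy model in Python, not the BookKeeper client API: InMemoryLedger stands in for a real ledger, and CounterServer, recover and LAST_ENTRY_UNKNOWN are hypothetical names chosen for the example.

```python
LAST_ENTRY_UNKNOWN = 2**31 - 1  # stands in for "read up to MAX_INTEGER"


class InMemoryLedger:
    """Hypothetical stand-in for a ledger: an append-only list of entries."""

    def __init__(self):
        self.entries = []
        self.closed = False

    def add_entry(self, data: bytes) -> int:
        if self.closed:
            raise RuntimeError("ledger is closed")
        self.entries.append(data)
        return len(self.entries) - 1  # entry ids are assigned in order

    def read_entries(self, first: int, last: int):
        last = min(last, len(self.entries) - 1)  # clamp an unknown last entry
        return self.entries[first:last + 1]

    def close(self):
        self.closed = True


class CounterServer:
    """Application server whose in-memory state is a single counter."""

    def __init__(self, ledger):
        self.ledger = ledger
        self.value = 0
        self.snapshot = (0, -1)  # (value, id of last entry covered)

    def increment(self, delta: int) -> int:
        entry_id = self.ledger.add_entry(str(delta).encode())  # log first
        self.value += delta  # apply only after the entry was added
        return entry_id

    def take_snapshot(self):
        self.snapshot = (self.value, len(self.ledger.entries) - 1)


def recover(old_ledger, snapshot):
    """Backup server: start from the snapshot, replay later entries, take over."""
    value, last_covered = snapshot
    for data in old_ledger.read_entries(last_covered + 1, LAST_ENTRY_UNKNOWN):
        value += int(data.decode())
    old_ledger.close()  # done with the old server's ledger
    backup = CounterServer(InMemoryLedger())  # new ledger for its own use
    backup.value = value
    return backup
```

The ordering inside increment is the essential part: the change reaches the log before it reaches the in-memory state, so a backup that replays the log never misses an applied change.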

    A client library takes care of communicating with bookies and managing entry numbers. An entry has the following fields:

    @@ -158,7 +158,7 @@ p. A simple use of BooKeeper is to imple

    If the ledger was closed gracefully, ZooKeeper will have the last entry and everything will work well. But, if the BookKeeper client that was writing the ledger dies, there is some recovery that needs to take place.

    -

    The problematic entries are the ones at the end of the ledger. There can be entries in flight when a BookKeeper client dies. If the entry only gets to one bookie, the entry should not be readable since the entry will disappear if that bookie fails. If the entry is only on one bookie, that doesn't mean that the entry has not been recorded successfully; the other bookies that recorded the entry might have failed.

    +

    The problematic entries are the ones at the end of the ledger. There can be entries in flight when a BookKeeper client dies. If the entry only gets to one bookie, the entry should not be readable since the entry will disappear if that bookie fails. If the entry is only on one bookie, that doesn't mean that the entry has not been recorded successfully; the other bookies that recorded the entry might have failed.

    The trick to making everything work is to have a correct idea of a last entry. We do it in roughly three steps:

    @@ -200,7 +200,7 @@ p. A simple use of BooKeeper is to imple
    -
    • For performance reasons, the Entry Log buffers entries in memory and commits them in batches, while the Ledger Cache holds index pages in memory and flushes them lazily. We will discuss data flush and how to ensure data integrity in the following section 'Data Flush'.
    +
    • For performance reasons, the Entry Log buffers entries in memory and commits them in batches, while the Ledger Cache holds index pages in memory and flushes them lazily. We will discuss data flush and how to ensure data integrity in the following section 'Data Flush'.

    Data Flush

    @@ -223,7 +223,7 @@ p. A simple use of BooKeeper is to imple
  • Persists LastLogMark to disk, which means that the entry data and index pages of all entries added before LastLogMark have also been persisted to disk. At this point it is safe to remove journal files created earlier than txnLogId.
    -
    1. If the bookie has crashed before persisting LastLogMark to disk, it still has journal files containing entries for which index pages may not have been persisted. Consequently, when this bookie restarts, it inspects journal files to restore those entries; data isn't lost.
    +
    1. If the bookie has crashed before persisting LastLogMark to disk, it still has journal files containing entries for which index pages may not have been persisted. Consequently, when this bookie restarts, it inspects journal files to restore those entries; data isn't lost.
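The restart behaviour described above can be pictured with a short Python sketch. The data layout is an assumption made for the example (journals as a mapping from journal id to a list of (ledger id, entry id) pairs, and the persisted index as a set of such pairs); replay_journals is a hypothetical name, not a BookKeeper function.

```python
def replay_journals(journals, last_log_mark_id, persisted_index):
    """Restore entries whose index pages may not have reached disk.

    journals: dict mapping journal id -> list of (ledger_id, entry_id)
    last_log_mark_id: journal id recorded in the last persisted LastLogMark
    persisted_index: set of (ledger_id, entry_id) already indexed on disk
    Returns the entries that were restored, in journal order.
    """
    restored = []
    for journal_id in sorted(journals):
        if journal_id < last_log_mark_id:
            continue  # everything in older journals is already persisted
        for key in journals[journal_id]:
            if key not in persisted_index:
                persisted_index.add(key)
                restored.append(key)
    return restored
```

Journals older than the mark are skipped because persisting LastLogMark is the point at which their contents are known to be safely on disk.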
    @@ -287,6 +287,7 @@ p. A simple use of BooKeeper is to imple

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperProgrammer.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperProgrammer.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperProgrammer.html Tue Jun 12 20:47:31 2012 @@ -203,6 +203,7 @@ while (entries.hasMoreElements()) {

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperStarted.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperStarted.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperStarted.html Tue Jun 12 20:47:31 2012 @@ -59,7 +59,7 @@

    Getting Started: Setting up BookKeeper to write logs.

    -

    This document contains information to get you started quickly with BookKeeper. It is aimed primarily at developers willing to try it out, and contains instructions for a simple BookKeeper installation and a simple programming example. For further programming detail, please refer to the BookKeeper Programmer's Guide.

    +

    This document contains information to get you started quickly with BookKeeper. It is aimed primarily at developers willing to try it out, and contains instructions for a simple BookKeeper installation and a simple programming example. For further programming detail, please refer to the BookKeeper Programmer's Guide.

    Pre-requisites

    @@ -79,7 +79,7 @@

    Setting up bookies

    -

    If you're bold and you want more than just running things locally, then you'll need to run bookies on different servers. You'll need at least three bookies to start with.

    +

    If you're bold and you want more than just running things locally, then you'll need to run bookies on different servers. You'll need at least three bookies to start with.

    For each bookie, we need to execute a command like the following:

    @@ -180,6 +180,7 @@ bkc.close();

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperStream.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperStream.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/bookkeeperStream.html Tue Jun 12 20:47:31 2012 @@ -215,6 +215,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/doc.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/doc.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/doc.html Tue Jun 12 20:47:31 2012 @@ -53,12 +53,12 @@
    -

    In the documentation directory, you'll find:

    +

    In the documentation directory, you'll find:

    • build.txt: Building Hedwig, or how to set up Hedwig
    -
    • user.txt: User's Guide, or how to program against the Hedwig API and how to run it
    -
    • dev.txt: Developer's Guide, or Hedwig internals and hacking details
    +
    • user.txt: User's Guide, or how to program against the Hedwig API and how to run it
    +
    • dev.txt: Developer's Guide, or Hedwig internals and hacking details

    These documents are all written in the Pandoc dialect of Markdown. This makes them readable as plain text files, but also capable of generating HTML or LaTeX documentation.

    @@ -97,6 +97,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigBuild.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigBuild.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigBuild.html Tue Jun 12 20:47:31 2012 @@ -68,7 +68,7 @@

    From the top level bookkeeper directory, run mvn package. This will compile and package the jars necessary for running hedwig.

    -

    See the User's Guide for instructions on running and usage.

    +

    See the User's Guide for instructions on running and usage.

    Eclipse Instructions

    @@ -76,13 +76,13 @@
    1. Install the Subclipse plugin. Update site: http://subclipse.tigris.org/update_1.4.x.
    -
    2. Install the Maven plugin. Update site: http://m2eclipse.sonatype.org/update. From the list of packages available from this site, select everything under the "Maven Integration" category, and from the optional components select the ones with the word "SCM" in them.
    -
    3. Go to Preferences > Team > SVN. For the SVN interface, choose "Pure Java".
    +
    2. Install the Maven plugin. Update site: http://m2eclipse.sonatype.org/update. From the list of packages available from this site, select everything under the "Maven Integration" category, and from the optional components select the ones with the word "SCM" in them.
    +
    3. Go to Preferences > Team > SVN. For the SVN interface, choose "Pure Java".
    4. Choose File > New > Project... > Maven > Checkout Maven Projects from SCM.
    5. For the SCM URL type, choose SVN. For the URL, enter the SVN URL. Maven will automatically create a top-level Eclipse project for each of the 4 Maven modules (recommended). If you want fewer top-level projects, uncheck the option of having a project for each module (under Advanced).
    -

    You are now ready to run and debug the client and server code. See the User's Guide for instructions on running and usage.

    +

    You are now ready to run and debug the client and server code. See the User's Guide for instructions on running and usage.

    @@ -116,6 +116,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigConsole.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigConsole.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigConsole.html Tue Jun 12 20:47:31 2012 @@ -290,6 +290,7 @@ Finished 0.388 s.

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigDesign.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigDesign.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigDesign.html Tue Jun 12 20:47:31 2012 @@ -63,11 +63,11 @@

    Netty Notes

    -

    The asynchronous network IO infrastructure that Hedwig uses is Netty. Here are some notes on Netty's concurrency architecture and its filter pipeline design.

    +

    The asynchronous network IO infrastructure that Hedwig uses is Netty. Here are some notes on Netty's concurrency architecture and its filter pipeline design.

    Concurrency Architecture

    -

    After calling ServerBootstrap.bind(), Netty starts a boss thread (NioServerSocketPipelineSink.Boss) that just accepts new connections and registers them with one of the workers from the NioWorker pool in round-robin fashion (pool size defaults to CPU count). Each worker runs its own select loop over just the set of keys that have been registered with it. Workers start lazily on demand and run only so long as there are interested fd's/keys. All selected events are handled in the same thread and sent up the pipeline attached to the channel (this association is established by the boss as soon as a new connection is accepted).

    +

    After calling ServerBootstrap.bind(), Netty starts a boss thread (NioServerSocketPipelineSink.Boss) that just accepts new connections and registers them with one of the workers from the NioWorker pool in round-robin fashion (pool size defaults to CPU count). Each worker runs its own select loop over just the set of keys that have been registered with it. Workers start lazily on demand and run only so long as there are interested fd's/keys. All selected events are handled in the same thread and sent up the pipeline attached to the channel (this association is established by the boss as soon as a new connection is accepted).

    All workers, and the boss, run via the executor thread pool; hence, the executor must support at least two simultaneous threads.
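The round-robin hand-off from the boss to the workers can be pictured with a toy model. This is not Netty code: the Boss class below is a hypothetical Python sketch in which each worker is just a list of the connections registered with it, and the default pool size follows the CPU count as described above.

```python
import itertools
import os


class Boss:
    """Toy model of a boss that assigns accepted connections to workers."""

    def __init__(self, worker_count=None):
        count = worker_count or os.cpu_count() or 1  # pool size defaults to CPU count
        self.workers = [[] for _ in range(count)]
        self._next_worker = itertools.cycle(range(count))

    def accept(self, connection) -> int:
        """Register a new connection with the next worker in round-robin order."""
        index = next(self._next_worker)
        self.workers[index].append(connection)
        return index
```

Once a connection is registered, it stays with the same worker, which is why all events for one channel are handled in a single thread.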

    @@ -90,9 +90,9 @@

    ReadAhead Cache

    -

    The delivery manager class is responsible for pushing published messages from the hubs to the subscribers. The most common case is that all subscribers are connected and either caught up, or close to the tail end of the topic. In this case, we don't want the delivery manager to be polling bookkeeper for any newly arrived messages on the topic; new messages should just be pushed to the delivery manager. However, there is also the uncommon case when a subscriber is behind, and messages must be pulled from Bookkeeper.

    +

    The delivery manager class is responsible for pushing published messages from the hubs to the subscribers. The most common case is that all subscribers are connected and either caught up, or close to the tail end of the topic. In this case, we don't want the delivery manager to be polling bookkeeper for any newly arrived messages on the topic; new messages should just be pushed to the delivery manager. However, there is also the uncommon case when a subscriber is behind, and messages must be pulled from Bookkeeper.

    -

    Since all publishes go through the hub, it is possible to cache the recently published messages in the hub, and then the delivery manager won't have to make the trip to bookkeeper to get the messages but instead get them from local process memory.

    +

    Since all publishes go through the hub, it is possible to cache the recently published messages in the hub, and then the delivery manager won't have to make the trip to bookkeeper to get the messages but instead get them from local process memory.
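A minimal sketch of that idea, in Python: recently published messages are kept in a bounded in-memory cache, and a lookup falls back to the persistent store only on a miss. The class name, the capacity-based eviction and the fetch_from_store callback are assumptions for illustration and do not describe Hedwig's actual cache implementation.

```python
from collections import OrderedDict


class RecentMessageCache:
    """Bounded cache of recently published messages, keyed by sequence number."""

    def __init__(self, capacity, fetch_from_store):
        self.capacity = capacity
        self.fetch_from_store = fetch_from_store  # e.g. a read from BookKeeper
        self.messages = OrderedDict()

    def on_publish(self, seq, message):
        """Every publish goes through the hub, so the hub can cache it here."""
        self.messages[seq] = message
        while len(self.messages) > self.capacity:
            self.messages.popitem(last=False)  # evict the oldest message

    def get(self, seq):
        """Serve from local memory when possible; otherwise pull from the store."""
        if seq in self.messages:
            return self.messages[seq]
        return self.fetch_from_store(seq)
```

Subscribers near the tail of the topic hit the cache, while a subscriber that has fallen behind triggers the slower pull path.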

    These ideas of push, pull, and caching are unified in the following way: - A hub has a cache of messages

    @@ -150,6 +150,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigJMX.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigJMX.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigJMX.html Tue Jun 12 20:47:31 2012 @@ -101,6 +101,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigUser.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigUser.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/hedwigUser.html Tue Jun 12 20:47:31 2012 @@ -55,7 +55,7 @@

    Design

    -

    In Hedwig, clients publish messages associated with a topic, and they subscribe to a topic to receive all messages published with that topic. Clients are associated with (publish to and subscribe from) a Hedwig instance (also referred to as a region), which consists of a number of servers called hubs. The hubs partition up topic ownership among themselves, and all publishes and subscribes to a topic must be done to its owning hub. When a client doesn't know the owning hub, it tries a default hub, which may redirect the client.

    +

    In Hedwig, clients publish messages associated with a topic, and they subscribe to a topic to receive all messages published with that topic. Clients are associated with (publish to and subscribe from) a Hedwig instance (also referred to as a region), which consists of a number of servers called hubs. The hubs partition up topic ownership among themselves, and all publishes and subscribes to a topic must be done to its owning hub. When a client doesn't know the owning hub, it tries a default hub, which may redirect the client.

    Running a Hedwig instance requires a Zookeeper server and at least three Bookkeeper servers.

    @@ -65,7 +65,7 @@

    Topics are independent; Hedwig provides no ordering across different topics.

    -

    Version vectors are associated with each topic and serve as the identifiers for each message. Vectors consist of one component per region. A component value is the region's local sequence number on the topic, and is incremented each time a hub persists a message (published either locally or remotely) to BK.

    +

    Version vectors are associated with each topic and serve as the identifiers for each message. Vectors consist of one component per region. A component value is the region's local sequence number on the topic, and is incremented each time a hub persists a message (published either locally or remotely) to BK.
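As a very small illustration of the increment rule only, the Python sketch below represents a version vector as a mapping from region name to sequence number. The representation and the function name are assumptions for this example; how components from other regions are merged is not covered here.

```python
def persist_message(vector, region):
    """Return the vector after a hub in `region` persists one more message.

    Only the persisting region's component is incremented; the input
    vector is left unchanged.
    """
    updated = dict(vector)
    updated[region] = updated.get(region, 0) + 1
    return updated
```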

    TODO: More on how version vectors are to be used, and on maintaining vector-maxes.

    @@ -79,7 +79,7 @@

    Limits

    -

    Because the current implementation uses a single socket per subscription, Hedwig requires a high ulimit on the number of open file descriptors. Non-root users can only use up to the limit specified in /etc/security/limits.conf; to raise this to 1024^2, as root, modify the "nofile" line in /etc/security/limits.conf on all hubs.

    +

    Because the current implementation uses a single socket per subscription, Hedwig requires a high ulimit on the number of open file descriptors. Non-root users can only use up to the limit specified in /etc/security/limits.conf; to raise this to 1024^2, as root, modify the "nofile" line in /etc/security/limits.conf on all hubs.
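To make the number concrete, the Python sketch below renders the two "nofile" lines for a given account in the usual limits.conf layout (domain, type, item, value). The account name hedwig is a placeholder, and whether to set both the soft and hard limits is a choice made for this example.

```python
NOFILE_LIMIT = 1024 ** 2  # 1048576 open file descriptors


def nofile_lines(user, limit=NOFILE_LIMIT):
    """Render soft and hard nofile entries for /etc/security/limits.conf."""
    return [f"{user} {kind} nofile {limit}" for kind in ("soft", "hard")]
```

Calling nofile_lines("hedwig") yields the lines "hedwig soft nofile 1048576" and "hedwig hard nofile 1048576".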

    Running Servers

    @@ -136,6 +136,7 @@

    Documentation

    Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/index.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/index.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/index.html Tue Jun 12 20:47:31 2012 @@ -63,7 +63,7 @@
    • Overview
    • Getting started
    -
    • Programmer's Guide
    +
    • Programmer's Guide
    • Bookie Server Configuration Parameters
    • BookKeeper Configuration Parameters
    • BookKeeper Internals
    @@ -82,8 +82,8 @@

      Hedwig Admin & Ops

      Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/releaseNotes.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/releaseNotes.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/r4.1.0/releaseNotes.html Tue Jun 12 20:47:31 2012 @@ -163,7 +163,7 @@
    • [BOOKKEEPER-194] - Get correct latency for addEntry operations for JMX.
    -
    • [BOOKKEEPER-195] - HierarchicalLedgerManager doesn't consider idgen as a "specialNode"
    +
    • [BOOKKEEPER-195] - HierarchicalLedgerManager doesn't consider idgen as a "specialNode"
    • [BOOKKEEPER-197] - HedwigConsole uses the same file to load bookkeeper client config and hub server config
    @@ -317,6 +317,7 @@

      Documentation

      Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookieConfigParams.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookieConfigParams.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookieConfigParams.html Tue Jun 12 20:47:31 2012 @@ -63,7 +63,7 @@

      NIO server settings

      -
      serverTcpNoDelay: This setting is used to enable or disable Nagle's algorithm, which is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. If you are sending many small messages, such that more than one can fit in a single IP packet, setting server.tcpnodelay to false to enable Nagle's algorithm can provide better performance. The default value is true.
      +
      serverTcpNoDelay: This setting is used to enable or disable Nagle's algorithm, which is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. If you are sending many small messages, such that more than one can fit in a single IP packet, setting server.tcpnodelay to false to enable Nagle's algorithm can provide better performance. The default value is true.

      Ledger cache settings

      @@ -109,6 +109,7 @@

      Documentation

      Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookieRecovery.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookieRecovery.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookieRecovery.html Tue Jun 12 20:47:31 2012 @@ -115,6 +115,7 @@

      Documentation

      Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperConfig.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperConfig.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperConfig.html Tue Jun 12 20:47:31 2012 @@ -115,7 +115,7 @@

      Missing disks or directories

      -

      Replacing disks or removing directories accidentally can cause a bookie to fail while trying to read a ledger fragment which the ledger metadata has claimed exists on the bookie. For this reason, when a bookie is started for the first time, its disk configuration is fixed for the lifetime of that bookie. Any change to the disk configuration of the bookie, such as a crashed disk or an accidental configuration change, will result in the bookie being unable to start with the following error:

      +

      Replacing disks or removing directories accidentally can cause a bookie to fail while trying to read a ledger fragment which the ledger metadata has claimed exists on the bookie. For this reason, when a bookie is started for the first time, its disk configuration is fixed for the lifetime of that bookie. Any change to the disk configuration of the bookie, such as a crashed disk or an accidental configuration change, will result in the bookie being unable to start with the following error:

      2012-05-29 18:19:13,790 - ERROR - [main:BookieServer@314] - Exception running bookie server :
      org.apache.bookkeeper.bookie.BookieException$InvalidCookieException
      @@ -177,6 +177,7 @@

      Documentation

      Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperConfigParams.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperConfigParams.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperConfigParams.html Tue Jun 12 20:47:31 2012 @@ -63,7 +63,7 @@

      NIO server settings

      -
      clientTcpNoDelay: This setting is used to enable or disable Nagle's algorithm, which is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. If you are sending many small messages, such that more than one can fit in a single IP packet, setting server.tcpnodelay to false to enable Nagle's algorithm can provide better performance. The default value is true.
      +
      clientTcpNoDelay: This setting is used to enable or disable Nagle's algorithm, which is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. If you are sending many small messages, such that more than one can fit in a single IP packet, setting server.tcpnodelay to false to enable Nagle's algorithm can provide better performance. The default value is true.

      Ledger manager settings

      @@ -107,6 +107,7 @@

      Documentation

      Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperInternals.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperInternals.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperInternals.html Tue Jun 12 20:47:31 2012 @@ -78,7 +78,7 @@

      You may use the following settings to further fine-tune the behavior of journalling on bookies:

      -
      journalMaxSizeMB: Journal file size limit. When a journal file reaches this limit, it is closed and a new journal file is created.
      journalMaxBackups: How many old journal files whose id is less than LastLogMark's journal id to keep.
      +
      journalMaxSizeMB: Journal file size limit. When a journal file reaches this limit, it is closed and a new journal file is created.
      journalMaxBackups: How many old journal files whose id is less than LastLogMark's journal id to keep.

      NOTE: Keeping a number of old journal files can be useful for manual recovery in special cases.

      @@ -92,7 +92,7 @@

      Flat Ledger Manager

      -

      All ledger metadata is stored under a single ZooKeeper path, in nodes created as ZooKeeper sequential nodes, which ensures the uniqueness of ledger ids. Each ledger node is prefixed with 'L'.

      +

      All ledger metadata is stored under a single ZooKeeper path, in nodes created as ZooKeeper sequential nodes, which ensures the uniqueness of ledger ids. Each ledger node is prefixed with 'L'.

      The bookie server keeps the active ledgers it owns in a hash map, so it is easy for the bookie server to find which ledgers have been deleted from ZooKeeper and garbage collect them. Its garbage collection flow is described below:

      @@ -165,6 +165,7 @@

      Documentation

      Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperJMX.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperJMX.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperJMX.html Tue Jun 12 20:47:31 2012 @@ -101,6 +101,7 @@

      Documentation

      Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperOverview.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperOverview.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperOverview.html Tue Jun 12 20:47:31 2012 @@ -109,7 +109,7 @@

      A simple use of BookKeeper is to implement a write-ahead transaction log. A server maintains an in-memory data structure (with periodic snapshots, for example) and logs changes to that structure before it applies them. The application server creates a ledger at startup and stores the ledger id and password in a well-known place (ZooKeeper, maybe). When it needs to make a change, the server adds an entry with the change information to the ledger and applies the change once BookKeeper has added the entry successfully. The server can even use asyncAddEntry to queue up many changes for high change throughput. BookKeeper meticulously logs the changes in order and calls the completion functions in order.

      -

      When the application server dies, a backup server will come online, get the last snapshot and then it will open the ledger of the old server and read all the entries from the time the snapshot was taken. (Since it doesn't know the last entry number it will use MAX_INTEGER). Once all the entries have been processed, it will close the ledger and start a new one for its use.

      +

      When the application server dies, a backup server will come online, get the last snapshot and then it will open the ledger of the old server and read all the entries from the time the snapshot was taken. (Since it doesn't know the last entry number it will use MAX_INTEGER). Once all the entries have been processed, it will close the ledger and start a new one for its use.
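      The log-then-apply pattern and the backup's snapshot-plus-replay recovery can be sketched with an in-memory stand-in for the ledger. This is only an illustration under stated assumptions: `WalServer`, `put` and `recover` are hypothetical names, the "ledger" is a plain list rather than a BookKeeper ledger, and no BookKeeper client API is used.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the pattern; not BookKeeper client code.
class WalServer {
    // Stand-in for a ledger: an append-only list of {key, value} entries.
    final List<String[]> ledger;
    // The in-memory data structure protected by the log.
    final Map<String, String> state = new HashMap<>();

    WalServer(List<String[]> ledger) {
        this.ledger = ledger;
    }

    /** Logs the change first, then applies it to the in-memory state. */
    void put(String key, String value) {
        ledger.add(new String[] {key, value}); // stands in for adding an entry
        state.put(key, value);                 // applied only after the add succeeded
    }

    /**
     * Backup path: load the last snapshot, replay every entry written after
     * the snapshot was taken, then continue with a fresh ledger of its own.
     */
    static WalServer recover(Map<String, String> snapshot,
                             int entriesInSnapshot,
                             List<String[]> oldLedger) {
        WalServer backup = new WalServer(new ArrayList<>());
        backup.state.putAll(snapshot);
        for (int i = entriesInSnapshot; i < oldLedger.size(); i++) {
            String[] entry = oldLedger.get(i);
            backup.state.put(entry[0], entry[1]);
        }
        return backup;
    }
}
```

      Because every change reaches the ledger before it reaches the in-memory state, replaying the entries written after the snapshot reproduces the state the failed server had at its last successful add.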

      A client library takes care of communicating with bookies and managing entry numbers. An entry has the following fields:

      @@ -158,7 +158,7 @@ p. A simple use of BooKeeper is to imple

      If the ledger was closed gracefully, ZooKeeper will have the last entry and everything will work well. But, if the BookKeeper client that was writing the ledger dies, there is some recovery that needs to take place.

      -

      The problematic entries are the ones at the end of the ledger. There can be entries in flight when a BookKeeper client dies. If the entry only gets to one bookie, the entry should not be readable since the entry will disappear if that bookie fails. If the entry is only on one bookie, that doesn't mean that the entry has not been recorded successfully; the other bookies that recorded the entry might have failed.

      +

      The problematic entries are the ones at the end of the ledger. There can be entries in flight when a BookKeeper client dies. If the entry only gets to one bookie, the entry should not be readable since the entry will disappear if that bookie fails. If the entry is only on one bookie, that doesn't mean that the entry has not been recorded successfully; the other bookies that recorded the entry might have failed.

      The trick to making everything work is to have a correct idea of a last entry. We do it in roughly three steps:

      @@ -200,7 +200,7 @@ p. A simple use of BooKeeper is to imple
      -
      • For performance reasons, Entry Log buffers entries in memory and commits them in batches, while Ledger Cache holds index pages in memory and flushes them lazily. We will discuss data flush and how to ensure data integrity in the following section 'Data Flush'.
      +
      • For performance reasons, Entry Log buffers entries in memory and commits them in batches, while Ledger Cache holds index pages in memory and flushes them lazily. We will discuss data flush and how to ensure data integrity in the following section 'Data Flush'.

      Data Flush

      @@ -223,7 +223,7 @@ p. A simple use of BooKeeper is to imple
    • Persists LastLogMark to disk, which means that the entry data and index pages of all entries added before LastLogMark have also been persisted to disk. At this point it is safe to remove journal files created earlier than txnLogId.
      -
      1. If the bookie has crashed before persisting LastLogMark to disk, it still has journal files containing entries for which index pages may not have been persisted. Consequently, when this bookie restarts, it inspects journal files to restore those entries; data isn't lost.
      +
      1. If the bookie has crashed before persisting LastLogMark to disk, it still has journal files containing entries for which index pages may not have been persisted. Consequently, when this bookie restarts, it inspects journal files to restore those entries; data isn't lost.
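      The restart behavior described here can be sketched as a filter over journal entries. This is a simplified model under stated assumptions, not the bookie's implementation: `JournalReplaySketch`, `Entry` and `entriesToReplay` are hypothetical names, and the persisted mark is modeled as a (journal id, position) pair from which replay starts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of journal replay on restart; not BookKeeper code.
class JournalReplaySketch {

    /** One journal record: which journal file it is in, where, and its payload. */
    static final class Entry {
        final long journalId;
        final long position;
        final String data;

        Entry(long journalId, long position, String data) {
            this.journalId = journalId;
            this.position = position;
            this.data = data;
        }
    }

    /**
     * Returns the payloads that must be re-applied after a crash: everything
     * at or after the last persisted mark. Entries before the mark are
     * assumed to be already safe on disk.
     */
    static List<String> entriesToReplay(List<Entry> journal,
                                        long markJournalId,
                                        long markPosition) {
        List<String> replay = new ArrayList<>();
        for (Entry e : journal) {
            boolean laterJournal = e.journalId > markJournalId;
            boolean sameJournalAtOrAfterMark =
                    e.journalId == markJournalId && e.position >= markPosition;
            if (laterJournal || sameJournalAtOrAfterMark) {
                replay.add(e.data);
            }
        }
        return replay;
    }
}
```

      If the crash happens before a newer mark is persisted, the older mark simply causes more entries to be replayed, which is why no data is lost in that case.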
      @@ -287,6 +287,7 @@ p. A simple use of BooKeeper is to imple

      Documentation

      Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperProgrammer.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperProgrammer.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperProgrammer.html Tue Jun 12 20:47:31 2012 @@ -203,6 +203,7 @@ while (entries.hasMoreElements()) {

      Documentation

      Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperStarted.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperStarted.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperStarted.html Tue Jun 12 20:47:31 2012 @@ -59,7 +59,7 @@

      Getting Started: Setting up BookKeeper to write logs.

      -

      This document contains information to get you started quickly with BookKeeper. It is aimed primarily at developers who want to try it out, and contains instructions for a simple BookKeeper installation and a simple programming example. For further programming details, please refer to the BookKeeper Programmer's Guide.

      +

      This document contains information to get you started quickly with BookKeeper. It is aimed primarily at developers who want to try it out, and contains instructions for a simple BookKeeper installation and a simple programming example. For further programming details, please refer to the BookKeeper Programmer's Guide.

      Pre-requisites

      @@ -79,7 +79,7 @@

      Setting up bookies

      -

      If you're bold and you want more than just running things locally, then you'll need to run bookies on different servers. You'll need at least three bookies to start with.

      +

      If you're bold and you want more than just running things locally, then you'll need to run bookies on different servers. You'll need at least three bookies to start with.

      For each bookie, we need to execute a command like the following:

      @@ -180,6 +180,7 @@ bkc.close();

      Documentation

      Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperStream.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperStream.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/bookkeeperStream.html Tue Jun 12 20:47:31 2012 @@ -215,6 +215,7 @@

      Documentation

      Modified: websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/doc.html ============================================================================== --- websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/doc.html (original) +++ websites/staging/zookeeper/trunk/content/bookkeeper/docs/trunk/doc.html Tue Jun 12 20:47:31 2012 @@ -53,12 +53,12 @@
      -

      In the documentation directory, you'll find:

      +

      In the documentation directory, you'll find:

      • build.txt: Building Hedwig, or how to set up Hedwig
      -
      • user.txt: User's Guide, or how to program against the Hedwig API and how to run it
      -
      • dev.txt: Developer's Guide, or Hedwig internals and hacking details
      +
      • user.txt: User's Guide, or how to program against the Hedwig API and how to run it
      +
      • dev.txt: Developer's Guide, or Hedwig internals and hacking details

      These documents are all written in the Pandoc dialect of Markdown. This makes them readable as plain text files, but also capable of generating HTML or LaTeX documentation.

      @@ -97,6 +97,7 @@

      Documentation