hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "BookieRecoveryPage" by FlavioJunqueira
Date Fri, 02 Oct 2009 11:35:27 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "BookieRecoveryPage" page has been changed by FlavioJunqueira:

  = Bookie Recovery Design =
+ == Problem statement and trade-offs ==
  The essential idea of the bookie recovery feature is to enable an application to heal its
bookie ensemble once a bookie has crashed. The bookie recovery task is essentially that
of reconstructing the ledger fragments that the crashed bookie stored, or should have stored had it
not crashed.
- == Requirements ==
+ By design, a bookie can store fragments of multiple ledgers. To recover a bookie, we hence
need to create new copies of each of the fragments that were present in the faulty bookie.
There are two choices: we recover all ledgers of the faulty bookie at once, or we recover one
ledger at a time. To decide which one is more appropriate, we have to think about how such a
recovery tool will be used. If applications are to run the tool, then it is probably best to
recover one ledger at a time or in small batches. If an operations team performs recovery on
behalf of applications, then it will probably prefer to recover the whole set of fragments of
the faulty bookie in one run.
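The two granularities above can be sketched as a single planning step. This is a minimal illustration in Python, not BookKeeper code: the `Fragments` mapping, the `plan_recovery` function, and the `batch_size` parameter are hypothetical names, and the sketch assumes the tool already knows which ledgers had fragments on the faulty bookie.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical metadata: ledger id -> entry ranges (first, last) that the
# faulty bookie stored or should have stored. Not a BookKeeper structure.
Fragments = Dict[int, List[Tuple[int, int]]]


def plan_recovery(fragments: Fragments,
                  batch_size: Optional[int] = None) -> Iterator[List[int]]:
    """Yield batches of ledger ids to recover.

    batch_size=None puts every ledger of the faulty bookie in one batch
    (the operations-team case); batch_size=1 recovers one ledger at a
    time (the application case); larger values give small batches.
    """
    ledger_ids = sorted(fragments)
    if batch_size is None:
        if ledger_ids:
            yield ledger_ids
        return
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(ledger_ids), batch_size):
        yield ledger_ids[start:start + batch_size]
```

With three affected ledgers, `plan_recovery(fragments)` produces one batch holding all of them, while `plan_recovery(fragments, 1)` produces three single-ledger batches, so the same tool could serve both kinds of users.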
- == Design ==
+ Such a recovery tool can run either as a separate client or directly in a bookie. The advantage
of implementing recovery on the client side is simplicity: we can just leverage the client
implementation to read entries and write them to the new bookie. Performing such a task in a client,
however, may lead to an inefficient utilization of network bandwidth, since every entry is first
sent to the client and then sent again to the new bookie. For an efficient utilization
of network bandwidth, it is best to copy entries directly from one bookie to another.
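The bandwidth trade-off can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: `InMemoryBookie`, `recover_via_client`, and `recover_direct` are hypothetical names rather than BookKeeper APIs, a surviving bookie is assumed to hold a replica of every entry, and each function merely counts the bytes that would cross the network.

```python
from typing import Dict, Tuple


class InMemoryBookie:
    """Toy stand-in for a bookie: maps (ledger id, entry id) to entry bytes."""

    def __init__(self) -> None:
        self.entries: Dict[Tuple[int, int], bytes] = {}

    def add_entry(self, ledger_id: int, entry_id: int, data: bytes) -> None:
        self.entries[(ledger_id, entry_id)] = data

    def read_entry(self, ledger_id: int, entry_id: int) -> bytes:
        return self.entries[(ledger_id, entry_id)]


def recover_via_client(source: InMemoryBookie, target: InMemoryBookie,
                       ledger_id: int, first: int, last: int) -> int:
    """Copy entries first..last through a client; return bytes on the network."""
    transferred = 0
    for entry_id in range(first, last + 1):
        data = source.read_entry(ledger_id, entry_id)  # hop 1: bookie -> client
        transferred += len(data)
        target.add_entry(ledger_id, entry_id, data)    # hop 2: client -> new bookie
        transferred += len(data)
    return transferred


def recover_direct(source: InMemoryBookie, target: InMemoryBookie,
                   ledger_id: int, first: int, last: int) -> int:
    """Copy entries first..last bookie to bookie; return bytes on the network."""
    transferred = 0
    for entry_id in range(first, last + 1):
        data = source.read_entry(ledger_id, entry_id)
        target.add_entry(ledger_id, entry_id, data)    # single hop: bookie -> new bookie
        transferred += len(data)
    return transferred
```

In this model the client-side path moves each entry twice and the direct path moves it once, which is the inefficiency the paragraph above refers to. The client-side path remains the simpler one to build because it reuses ordinary read and write operations.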
+ == Design choices ==
