hadoop-hdfs-dev mailing list archives

From Ivan Kelly <iv...@yahoo-inc.com>
Subject BookKeeper Journal Manager for Namenode
Date Tue, 15 Nov 2011 16:29:46 GMT
Hi guys,

I've just uploaded a patch to HDFS-234 which contains an implementation of JournalManager
for BookKeeper. The code is ready for review, though I plan to add some more tests. The code
relies on HDFS-1580 which isn't in trunk yet. The code is on github if you want to avoid faffing
about with multiple patches. (https://github.com/ivankelly/hadoop-common/tree/HDFS-234)

To configure the namenode to use BK with this code, put the following in hdfs-site.xml:

<property>
  <name>dfs.namenode.edits.dir</name>
  <value>bookkeeper://[zkEnsemble]/[zkPath]</value>
</property>
 
<property>
  <name>dfs.namenode.edits.journalPlugin.bookkeeper</name>
  <value>org.apache.hadoop.hdfs.server.namenode.bkjournal.BookKeeperJournalManager</value>
</property>

Here zkEnsemble is a semicolon-separated[1] list of ZooKeeper servers, and zkPath is the
znode path under which the editlog metadata should be stored. For example, if you have 3 servers,
zk1 to zk3, with ZooKeeper listening on port 2181, and you want to store the metadata under /hdfsnn,
the URI would be bookkeeper://zk1:2181;zk2:2181;zk3:2181/hdfsnn.
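
To make the URI layout concrete, here is a minimal Java sketch of how such a URI could be
split into its ZooKeeper servers and znode path. It is not taken from the patch: the class
and method names (BkJournalUriSketch, zkServers, zkPath, zkConnectString) are made up for
illustration, and turning the semicolons into the comma-separated connect string that the
ZooKeeper client expects is my assumption about what a journal manager would want to do.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical helper, not part of the HDFS-234 patch.
public class BkJournalUriSketch {

    // The authority part holds the host:port pairs, separated by semicolons.
    // java.net.URI cannot parse it as a single host, so getHost() is null
    // and the whole ensemble is read from getAuthority() instead.
    public static List<String> zkServers(URI uri) {
        return Arrays.asList(uri.getAuthority().split(";"));
    }

    // The path part is the znode under which editlog metadata is stored.
    public static String zkPath(URI uri) {
        return uri.getPath();
    }

    // Assumption: the ZooKeeper client wants a comma-separated list,
    // so the servers are re-joined with commas here.
    public static String zkConnectString(URI uri) {
        return String.join(",", zkServers(uri));
    }

    public static void main(String[] args) {
        URI uri = URI.create("bookkeeper://zk1:2181;zk2:2181;zk3:2181/hdfsnn");
        System.out.println("servers: " + zkServers(uri));
        System.out.println("znode:   " + zkPath(uri));
        System.out.println("connect: " + zkConnectString(uri));
    }
}
```

Running main with the example URI above prints the three servers, the znode path and the
re-joined connect string, which is a quick way to sanity check a value before putting it
in dfs.namenode.edits.dir.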

I benchmarked this code against an NFS filer, local storage, and a NoPersist implementation
of JournalManager, which simply discards edits, to get a theoretical max. I ran the benchmark
using NNThroughputBenchmark to create 100000 ops. I've attached the generated graph. The graph
shows that BookKeeper sees similar throughput to NFS and local file (very slightly lower).
BK's latency is a little higher, but once the disk cache for the local disk saturates, BK's
latency is lower. The NFS filer has a big chunk of NVRAM, so it maintains low latency until the
client saturates.

