hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Enis Söztutar <e...@hortonworks.com>
Subject Re: Hadoop Summit EU
Date Mon, 07 Apr 2014 10:11:56 GMT
Me and Devaraj attended their talk on their solution for paxos based
namenode and HBase replication.

They have two solutions, one for single datacenter, and the other multi DC
geo replication.

For the namenode, there is a wrapper, called ConsistencyNode, that
basically gets the requests, replicate it via their consistency protocol to
other CNodes within the DC (paxos based) in the edit log. If the proposal
for this is accepted, the changes are made durable. However, from my
understanding, on the read side the client chooses only one replica to
read. The client decides to connect to one of the replica namenodes, which
means that it is not doing a paxos read. I think they also wrapped the
client, so that if it gets a FileNotFoundException or something similar, it
will retry on a different server. Also they track the last seen proposal id
as a transaction id for this as well from my understanding (so
read-what-you-write consistency maybe?). The full details of the
consistency was not clear to me from the presentation.
For their multi-DC replication, they are doing a similar thing, but the
data replication is not handled by paxos, only the namenode metadata. For
each datacenter, they have a target replication factor (can be set
differently for each DC, like 0 because of regulatory reasons). The
metadata of NN is replicated via a similar mechanism. The data replication
is async to the metadata replication though. When a block is finalized, the
CNode quorum on that particular DC, schedules a remote copy to one of the
datacenters. That copy job, copies the block with directly writing the
block from the datanode to a remote datanode. Then that remote DC block is
replicated to the target replication by that DC's CNode quorum. When the
target is reached, that DC will create another proposal about the data
replication being complete. So the state machine probably contains where
each data is replicated, but they were still mentioning the client getting
DataNotReplicatedException or something.

Their work on HBase is still WIP. I do not remember much details on the
protocol, except it uses the same replication protocol (their "patented"
paxos based replication).

Of course the devil is in the details. I did not get that from the

As a side note, Doug when asked, was saying that they are cooking something
for backups, so maybe their "secret project" also contains multi-DC
consistent state?


On Sat, Apr 5, 2014 at 1:55 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> Enis:
> There was a talk by Konstantin Boudnik<http://hadoopsummit.org/amsterdam/speakers/#konstantin-boudnik>
> .
> Any interesting material from his presentation ?
> Cheers

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message