hadoop-hdfs-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: Hadoop Summit EU
Date Tue, 08 Apr 2014 16:34:47 GMT
I'm curious: what part of the communication below should restrict it to
private lists?

Anyway, use of dev@hbase for this was fine in my opinion. We try to
discourage discussion on HBase private lists that really belongs on the
dev lists, per the Apache Way.


On Mon, Apr 7, 2014 at 11:36 AM, Enis Söztutar <enis@hortonworks.com> wrote:

> Oops, sorry, this was intended for internal lists. Apologies for any
> confusion.
>
> Enis
>
> On Monday, April 7, 2014, Enis Söztutar <enis@hortonworks.com> wrote:
>
> > Devaraj and I attended their talk on their solution for a
> > Paxos-based namenode and HBase replication.
> >
> > They have two solutions: one for a single datacenter, and the other
> > for multi-DC geo-replication.
> >
> > For the namenode, there is a wrapper, called ConsistencyNode, that
> > basically receives the requests and replicates each one as an
> > edit-log entry, via their Paxos-based consistency protocol, to the
> > other CNodes within the DC. If the proposal is accepted, the change
> > is made durable. However, from my understanding, on the read side
> > the client reads from only one replica: it connects to a single
> > replica namenode, which means it is not doing a Paxos read. I think
> > they also wrapped the client, so that if it gets a
> > FileNotFoundException or something similar, it retries on a
> > different server. From my understanding they also track the last
> > seen proposal id as a transaction id (so read-your-writes
> > consistency, maybe?). The full details of the consistency protocol
> > were not clear to me from the presentation.
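> >
> > To make the read side concrete, here is a rough sketch of what I
> > imagine the wrapped client does. To be clear, this is only my
> > reconstruction; every class, method, and field name below is my own
> > invention, not anything they showed:
> >
> >   import java.io.FileNotFoundException;
> >   import java.io.IOException;
> >   import java.util.List;
> >
> >   // Hypothetical client wrapper: reads from a single replica NN,
> >   // retries another replica on failure, and tracks the last seen
> >   // proposal id (possibly for read-your-writes consistency).
> >   class ReplicaAwareClient {
> >     // hypothetical per-replica proxy; not their actual interface
> >     interface ReplicaNN {
> >       ReadResult getFileInfo(String path, long minProposalId)
> >           throws IOException;
> >     }
> >     static class ReadResult {
> >       final Object status;
> >       final long proposalId; // proposal id doubling as a txid
> >       ReadResult(Object status, long proposalId) {
> >         this.status = status;
> >         this.proposalId = proposalId;
> >       }
> >     }
> >
> >     private final List<ReplicaNN> replicas;
> >     private long lastSeenProposalId = -1;
> >
> >     ReplicaAwareClient(List<ReplicaNN> replicas) {
> >       this.replicas = replicas;
> >     }
> >
> >     Object getFileInfo(String path) throws IOException {
> >       IOException last = new FileNotFoundException(path);
> >       for (ReplicaNN nn : replicas) { // one replica, not a Paxos read
> >         try {
> >           ReadResult r = nn.getFileInfo(path, lastSeenProposalId);
> >           lastSeenProposalId =
> >               Math.max(lastSeenProposalId, r.proposalId);
> >           return r.status;
> >         } catch (FileNotFoundException e) {
> >           last = e; // this replica may be behind; try the next one
> >         }
> >       }
> >       throw last;
> >     }
> >   }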
> > For their multi-DC replication, they do a similar thing, but the
> > data replication is not handled by Paxos, only the namenode
> > metadata. Each datacenter has a target replication factor, which
> > can be set differently per DC (even 0, for regulatory reasons). The
> > NN metadata is replicated via a similar mechanism, but the data
> > replication is asynchronous to the metadata replication. When a
> > block is finalized, the CNode quorum in that particular DC
> > schedules a remote copy to one of the other datacenters. That copy
> > job writes the block directly from a local datanode to a remote
> > datanode. The remote DC's CNode quorum then replicates the block up
> > to that DC's target replication factor. When the target is reached,
> > that DC creates another proposal recording that the data
> > replication is complete. So the state machine probably tracks where
> > each piece of data is replicated, though they also mentioned the
> > client possibly getting a DataNotReplicatedException or something
> > like it.
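> >
> > A sketch of that cross-DC flow as I reconstructed it; again, all of
> > the names below are invented for illustration, this is not their
> > code:
> >
> >   import java.util.List;
> >
> >   // Hypothetical driver for the async cross-DC data replication.
> >   abstract class CrossDcReplicator {
> >     abstract List<String> remoteDcs();
> >     abstract int targetReplication(String dc); // may be 0 (regulatory)
> >     abstract void copyBlock(String blockId, String fromDc, String toDc);
> >     abstract void replicateWithinDc(String blockId, String dc, int n);
> >     abstract void propose(String stateMachineEntry);
> >
> >     // 1. on finalize, the local CNode quorum schedules one direct
> >     //    datanode-to-datanode copy per remote DC
> >     void onBlockFinalized(String blockId, String localDc) {
> >       for (String dc : remoteDcs()) {
> >         if (targetReplication(dc) > 0) {
> >           copyBlock(blockId, localDc, dc);
> >         }
> >       }
> >     }
> >
> >     // 2. the receiving DC's own CNode quorum fans the block out to
> >     //    that DC's target replication factor
> >     void onRemoteCopyArrived(String blockId, String dc) {
> >       replicateWithinDc(blockId, dc, targetReplication(dc));
> >     }
> >
> >     // 3. once the target is reached, that DC proposes a new entry so
> >     //    the state machine records where the block is replicated;
> >     //    until this commits, a client could presumably still see the
> >     //    DataNotReplicatedException they mentioned
> >     void onTargetReached(String blockId, String dc) {
> >       propose("replicated " + blockId + " in " + dc);
> >     }
> >   }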
> >
> > Their work on HBase is still WIP. I do not remember many details of
> > the protocol, except that it uses the same replication protocol
> > (their "patented" Paxos-based replication).
> >
> > Of course, the devil is in the details, and I did not get those
> > from the presentation.
> >
> > As a side note, Doug, when asked, said that they are cooking
> > something for backups, so maybe their "secret project" also
> > involves multi-DC consistent state?
> >
> > Enis
> >
> >
> > On Sat, Apr 5, 2014 at 1:55 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> >> Enis:
> >> There was a talk by Konstantin Boudnik
> >> <http://hadoopsummit.org/amsterdam/speakers/#konstantin-boudnik>.
> >>
> >> Any interesting material from his presentation?
> >>
> >> Cheers
> >>
> >
> >
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
