hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Setting up NxN replication
Date Fri, 08 Nov 2013 23:47:09 GMT
bq. how about your company have a new office in the 11th locations?

With the minimum spanning tree approach, the increase in load wouldn't be
exponential.
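
To put rough numbers on that, here is a minimal sketch. It assumes that load scales with the number of directed peer links, which is a simplification for illustration and not something HBase reports. The function names are hypothetical.

```python
def mesh_links(n: int) -> int:
    """Directed peer links when every cluster lists every other cluster as a peer."""
    return n * (n - 1)


def tree_links(n: int) -> int:
    """Directed peer links when clusters form a spanning tree of master-master pairs."""
    return 2 * (n - 1)


if __name__ == "__main__":
    for n in (10, 11):
        print(f"N={n}: full mesh={mesh_links(n)} links, spanning tree={tree_links(n)} links")
```

Under this assumption, going from 10 to 11 clusters adds 20 directed links to a full mesh and 2 to a spanning tree, so the growth is quadratic for the mesh and linear for the tree.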


On Fri, Nov 8, 2013 at 2:58 PM, Demai Ni <nidmgg@gmail.com> wrote:

> Ishan,
>
> I have to admit that I am a bit surprised by the need to have data centers
> in 10 different locations. Well, I guess I shouldn't be, as every company
> is global now (anyone from Mars yet?).
>
> In your case, since there is only one column family, the headache is not as
> bad. Let's call your clusters C1, C2, ... C10.
>
> The safest way for your most critical data is still to set up M-M
> replication from each cluster to the other N-1. That is, every cluster adds
> the rest of the clusters as its peers. For example, C1 will have C2, C3 ... C10
> as its peers; C2 will have C1, C3 ... C10. That will be a lot of data over the
> network, although it is the fastest way to get all the clusters in sync. I
> don't like the idea at all (too expensive, for one).
>
> Now, let's improve it a bit. C1 will set up M-M to 2 of the other 9, with the
> distribution carefully planned so that all the clusters get equal
> load. A system administrator has to do that manually.
>
> Now, thinking about the headaches:
> 1) What if your company (that is, your manager, who has no idea how difficult
> it is) decides to have one more column family replicated? How about
> two more? The load will grow exponentially.
> 2) What if your company opens a new office in an 11th location? Again, it
> grows exponentially.
> 3) Let's say you are the best administrator and keep a nice record of
> everything (unfortunately, HBase alone doesn't have a good way to maintain
> a record of who is being replicated). What if the admin then leaves the
> company? Or what if this is a global company with 10 admins at different
> locations? How do they communicate about the replication setup?
>
> :-) Well, 3) is not too bad. I just wanted to point it out, as it can be
> quite true for a company large enough to have 10 locations.
>
> Demai
>
>
>
>
> On Fri, Nov 8, 2013 at 2:42 PM, Ishan Chhabra <ichhabra@rocketfuel.com> wrote:
>
> > Ted:
> > Yes. It is the same table that is being written to from all locations. A
> > single row could be updated from multiple locations, but our schema is
> > designed in a manner that writes will be independent and not clobber each
> > other.
> >
> >
> > On Fri, Nov 8, 2013 at 2:33 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > Ishan:
> > > > In your use case, the same table is written to in 10 clusters at roughly
> > > > the same time?
> > >
> > > Please clarify.
> > >
> > >
> > > > On Fri, Nov 8, 2013 at 2:29 PM, Ishan Chhabra <ichhabra@rocketfuel.com> wrote:
> > >
> > > > @Demai,
> > > > We actually have 10 clusters in different locations.
> > > > > The replication scope is not an issue for me since I have only one column
> > > > > family and we want it replicated to each location.
> > > > > Can you elaborate more on why a replication setup of more than 3-4 clusters
> > > > > would be a headache in your opinion?
> > > >
> > > >
> > > > > On Fri, Nov 8, 2013 at 2:16 PM, Ishan Chhabra <ichhabra@rocketfuel.com> wrote:
> > > >
> > > > > @Demai,
> > > > > > Writes from B should also go to A and C. So, if I were to continue on your
> > > > > > suggestion, I would set up A-B master-master and B-C master-master, which is
> > > > > > what I was proposing in the 2nd approach (MST based).
> > > > >
> > > > > @Vladimir
> > > > > That is classified. :P
> > > > >
> > > > >
> > > > > > On Fri, Nov 8, 2013 at 1:20 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
> > > > >
> > > > > >> *I want to set up NxN replication, i.e. N clusters each replicating to each
> > > > > >> other. N is expected to be around 10.*
> > > > >>
> > > > > >> Preparing for thermonuclear war?
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > > >> On Fri, Nov 8, 2013 at 1:14 PM, Ishan Chhabra <ichhabra@rocketfuel.com> wrote:
> > > > >>
> > > > > >> > I want to set up NxN replication, i.e. N clusters each replicating to each
> > > > > >> > other. N is expected to be around 10.
> > > > >> >
> > > > > >> > After doing some research, I realize it is possible after the HBASE-7709 fix,
> > > > > >> > but it would lead to much more data flowing in the system, e.g.:
> > > > >> >
> > > > > >> > Let's say we have 3 clusters: A, B and C.
> > > > > >> > A new write to A will go to B and then C, and also go to C via the
> > > > > >> > direct path. This leads to unnecessary network usage and writes to the WAL of
> > > > > >> > B, which should be avoided. Now imagine this with 10 clusters; it won’t
> > > > > >> > scale.
> > > > >> >
> > > > > >> > One option is to create a minimum spanning tree joining all the clusters
> > > > > >> > and make nodes replicate to their immediate peers in a master-master
> > > > > >> > fashion. This is much better than an NxN mesh, but still has extra network and
> > > > > >> > WAL usage. It also suffers from a failure scenario where a single
> > > > > >> > cluster going down will pause replication to clusters downstream.
> > > > >> >
> > > > > >> > What I really want is that the ReplicationSource should only forward
> > > > > >> > WALEdits whose cluster-id is the same as the local cluster-id. This seems like a
> > > > > >> > straightforward patch to put in.
> > > > >> >
> > > > >> > Any thoughts on the suggested approach or alternatives?
> > > > >> >
> > > > >> > --
> > > > >> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > >
> > >
> >
> >
> >
> > --
> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> >
>
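
For reference, the origin-only forwarding idea from the original post quoted above can be illustrated with a toy model. This is only a sketch under stated assumptions: it models a full mesh in which a cluster ships an edit to every peer not yet in the edit's visited set (a rough stand-in for the loop prevention the post attributes to HBASE-7709), and `replicate` and its parameters are hypothetical names, not HBase APIs.

```python
from collections import Counter


def replicate(clusters, origin, origin_only):
    """Count how many times each cluster receives one write made at `origin`.

    Toy model (assumption): full mesh; an edit is shipped to every peer
    that is not already in the edit's visited set. With origin_only=True,
    a cluster ships only the edits that originated locally.
    """
    received = Counter()
    # Each entry: (cluster currently holding the edit, clusters visited so far).
    pending = [(origin, frozenset([origin]))]
    while pending:
        holder, visited = pending.pop()
        if origin_only and holder != origin:
            continue  # do not re-forward edits that came from another cluster
        for peer in clusters:
            if peer not in visited:
                received[peer] += 1
                pending.append((peer, visited | {peer}))
    return received


if __name__ == "__main__":
    clusters = ["A", "B", "C"]
    print(dict(replicate(clusters, "A", origin_only=False)))
    print(dict(replicate(clusters, "A", origin_only=True)))
```

In this model, with three clusters and a write at A, B and C each receive the edit twice without the filter and once with it, which matches the A, B, C example in the original post.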
