hbase-user mailing list archives

From: Demai Ni <nid...@gmail.com>
Subject: Re: Setting up NxN replication
Date: Sat, 09 Nov 2013 02:55:08 GMT
Ishan,

"Coming to Demai’s suggestion of M-M to 2 instead of 9, i still want to have
the data available from 1 to all clusters. How would I do it with your
setup?".

If I understand the requirement correctly, your setup is almost there:
C1 <-> C2 <-> C3 <-> C4  and *C4 <-> C1*
Basically, a doubly-linked list forming a cycle. In this way, there is no
single point of failure, and writes on any of the clusters will eventually
be replicated to all the clusters. The good part is that although the
total # of writes is the same as NxN, each cluster only needs to handle at
most 2 for each write. With this said, I have never set up more than 3
clusters, and have to assume no other bugs similar to HBASE-7709 (loop in
Master/Master replication) come out of this.
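
To make the cycle concrete, the peer setup would look roughly like this in
the HBase shell (a sketch only; the peer ids and ZooKeeper quorum strings
below are made up for illustration):

  # on C1: add both neighbors in the cycle as peers
  add_peer '2', 'c2-zk1,c2-zk2,c2-zk3:2181:/hbase'
  add_peer '4', 'c4-zk1,c4-zk2,c4-zk3:2181:/hbase'

  # on C2: point at C1 and C3
  add_peer '1', 'c1-zk1,c1-zk2,c1-zk3:2181:/hbase'
  add_peer '3', 'c3-zk1,c3-zk2,c3-zk3:2181:/hbase'

  # ... and likewise on C3 and C4, so every cluster ends up with exactly
  # 2 peers
  list_peers   # verify on each cluster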

Still, I don't have a good solution for "...a row should be present in
only 4/10 clusters...". One approach would use more than one column
family, plus either HBASE-5002 (control replication peer per column
family) or HBASE-8751. Unfortunately, neither jira has been resolved yet.
My 2 cents.
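
For reference, the per-column-family knob that exists today is
REPLICATION_SCOPE, which is all-peers-or-nothing, e.g. in the shell (the
table and family names are hypothetical):

  disable 'usertable'
  alter 'usertable', {NAME => 'replicated_cf', REPLICATION_SCOPE => 1}  # ships to all peers
  alter 'usertable', {NAME => 'local_only_cf', REPLICATION_SCOPE => 0}  # stays local
  enable 'usertable'

A family with scope 1 ships to every configured peer; there is no way yet
to say "this family goes only to peers 2 and 4", which is roughly what
HBASE-5002/HBASE-8751 would allow.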

Demai


On Fri, Nov 8, 2013 at 4:38 PM, Ishan Chhabra <ichhabra@rocketfuel.com> wrote:

> Demai, Ted:
>
> Thanks for the detailed answer.
>
> I should add some more context here. The underlying network is an NxN
> mesh. The "cost" for each link is the same.
>
> Coming to Demai’s suggestion of M-M to 2 instead of 9, I still want to have
> the data available from 1 to all clusters. How would I do it with your
> setup?
>
> For the difference between MST and NxN:
> Consider the following example, with 4 clusters: C1, C2, C3, C4, and write
> going to C1.
>
> In NxN mesh, the write will be propagated as:
> C1 -> C2
> C1 -> C3
> C1 -> C4
>
> network cost: 3, writes to WAL: 3
>
> MST with tree as C1 <-> C2 <-> C3 <-> C4, the write will be propagated as:
> C1 -> C2
> C2 -> C3
> C3 -> C4
>
> network cost: 3, writes to WAL: 3
>
> Both approaches have the same network and WAL cost. The only difference
> is that in MST, if C2 fails, writes from C1 will not go to C3 and C4,
> whereas in the NxN case, the writes will still happen.
>
> Also, (1) and (3) are not an issue for us.
>
> Having said that, I do realize that adding more clusters increases the
> load quadratically, and that does worry me. Our actual use case is that a
> row should be present in only 4/10 clusters, but it varies based on the
> row and not on the cluster. So I cannot come up with a static replication
> configuration that will handle that. I am looking into per-row
> replication, but will start that as a separate discussion and share my
> ideas there.
>
> I hope this makes more sense now.
>
>
> On Fri, Nov 8, 2013 at 3:47 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > bq. what if your company gets a new office in an 11th location?
> >
> > With the minimum spanning tree approach, the increase in load wouldn't
> > be exponential.
> >
> >
> > On Fri, Nov 8, 2013 at 2:58 PM, Demai Ni <nidmgg@gmail.com> wrote:
> >
> > > Ishan,
> > >
> > > I have to admit that I am a bit surprised about the need to have data
> > > centers in 10 different locations. Well, I guess I shouldn't be, as
> > > every company is global now (anyone from Mars yet?)
> > >
> > > In your case, since there is only one column family, the headache is
> > > not as bad. Let's call your clusters C1, C2, ... C10.
> > >
> > > The safest way for your most critical data is still to set up the
> > > M-M replication 1 to N-1. That is, every cluster adds the rest of the
> > > clusters as its peers. For example, C1 will have C2, C3 ... C10 as
> > > its peers; C2 will have C1, C3 ... C10. Well, that will be a lot of
> > > data over the network, although it is the best/fastest way to get all
> > > the clusters synced up. I don't like the idea at all (too expensive,
> > > for one).
> > >
> > > Now, let's improve it a bit. C1 will set up M-M to 2 of the remaining
> > > 9, with the distribution carefully planned so that all the clusters
> > > get equal load. Well, a system administrator has to do it manually.
> > >
> > > Now, thinking about the headaches:
> > > 1) What if your company (that is, your manager who has no idea how
> > > difficult it is) decides to have one more column family replicated?
> > > How about two more? The load will grow exponentially.
> > > 2) What if your company gets a new office in an 11th location? Again,
> > > exponential growth.
> > > 3) Let's say you are the best administrator and keep nice records of
> > > everything (unfortunately, HBase alone doesn't have a good way to
> > > maintain a record of who is being replicated). And then the admin
> > > leaves the company? Or this is a global company with 10 admins at
> > > different locations. How do they communicate the replication setup?
> > >
> > > :-) Well, 3) is not too bad. I just like to point it out, as it can
> > > be quite true for a company large enough to have 10 locations.
> > >
> > > Demai
> > >
> > >
> > >
> > >
> > > On Fri, Nov 8, 2013 at 2:42 PM, Ishan Chhabra <ichhabra@rocketfuel.com> wrote:
> > >
> > > > Ted:
> > > > Yes. It is the same table that is being written to from all
> > > > locations. A single row could be updated from multiple locations,
> > > > but our schema is designed in a manner that writes will be
> > > > independent and not clobber each other.
> > > >
> > > >
> > > > On Fri, Nov 8, 2013 at 2:33 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >
> > > > > Ishan:
> > > > > In your use case, the same table is written to in 10 clusters at
> > > > > roughly the same time?
> > > > >
> > > > > Please clarify.
> > > > >
> > > > >
> > > > > On Fri, Nov 8, 2013 at 2:29 PM, Ishan Chhabra <ichhabra@rocketfuel.com> wrote:
> > > > >
> > > > > > @Demai,
> > > > > > We actually have 10 clusters in different locations.
> > > > > > The replication scope is not an issue for me since I have only
> > > > > > one column family and we want it replicated to each location.
> > > > > > Can you elaborate more on why a replication setup of more than
> > > > > > 3-4 clusters would be a headache in your opinion?
> > > > > >
> > > > > >
> > > > > > On Fri, Nov 8, 2013 at 2:16 PM, Ishan Chhabra <ichhabra@rocketfuel.com> wrote:
> > > > > >
> > > > > > > @Demai,
> > > > > > > Writes from B should also go to A and C. So, if I were to
> > > > > > > continue on your suggestion, I would set up A-B master-master
> > > > > > > and B-C master-master, which is what I was proposing in the
> > > > > > > 2nd approach (MST based).
> > > > > > >
> > > > > > > @Vladimir
> > > > > > > That is classified. :P
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Nov 8, 2013 at 1:20 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
> > > > > > >
> > > > > > >> *I want to set up NxN replication, i.e. N clusters each
> > > > > > >> replicating to each other. N is expected to be around 10.*
> > > > > > >>
> > > > > > >> Preparing for thermonuclear war?
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >> On Fri, Nov 8, 2013 at 1:14 PM, Ishan Chhabra <ichhabra@rocketfuel.com> wrote:
> > > > > > >>
> > > > > > >> > I want to set up NxN replication, i.e. N clusters each
> > > > > > >> > replicating to each other. N is expected to be around 10.
> > > > > > >> >
> > > > > > >> > On doing some research, I realize it is possible after the
> > > > > > >> > HBASE-7709 fix, but it would lead to much more data
> > > > > > >> > flowing in the system, e.g.:
> > > > > > >> >
> > > > > > >> > Let's say we have 3 clusters: A, B and C.
> > > > > > >> > A new write to A will go to B and then to C, and also go
> > > > > > >> > to C directly via the direct path. This leads to
> > > > > > >> > unnecessary network usage and writes to the WAL of B that
> > > > > > >> > should be avoided. Now imagine this with 10 clusters; it
> > > > > > >> > won't scale.
> > > > > > >> >
> > > > > > >> > One option is to create a minimum spanning tree joining
> > > > > > >> > all the clusters and make nodes replicate to their
> > > > > > >> > immediate peers in a master-master fashion. This is much
> > > > > > >> > better than an NxN mesh, but still has extra network and
> > > > > > >> > WAL usage. It also suffers from a failure scenario where
> > > > > > >> > a single cluster going down will pause replication to
> > > > > > >> > clusters downstream.
> > > > > > >> >
> > > > > > >> > What I really want is for the ReplicationSource to only
> > > > > > >> > forward WALEdits whose cluster-id is the same as the
> > > > > > >> > local cluster-id. This seems like a straightforward patch
> > > > > > >> > to put in.
> > > > > > >> >
> > > > > > >> > Any thoughts on the suggested approach or alternatives?
> > > > > > >> >
> > > > > > >> > --
> > > > > > >> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > > > > >> >
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > >
> > >
> >
>
>
>
> --
> *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
>
