falcon-dev mailing list archives

From John Yu <johnyu0...@gmail.com>
Subject Re: On configuring two source clusters due to colo requirement
Date Tue, 15 Jul 2014 19:30:51 GMT
Hey Satish,

Thanks for your reply!

I can see how setting up that way would definitely work.
Also, it is probably technically more correct as well, as data generated by
different processes should be considered different.
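
As a rough sketch of that two-feed setup, the cluster sections could look something like the fragments below, kept as two separate feed entities. The feed and cluster names come from this thread; the validity dates and retention limits are placeholder assumptions, and the other elements a feed definition needs (frequency, locations, ACL, schema) are left out.

```xml
<!-- Sketch only: feed1 replicates within colo 1.
     Validity and retention values are hypothetical placeholders. -->
<feed name="feed1" xmlns="uri:falcon:feed:0.1">
  <clusters>
    <cluster name="colo1ETL" type="source">
      <validity start="2014-07-01T00:00Z" end="2099-12-31T00:00Z"/>
      <retention limit="days(7)" action="delete"/>
    </cluster>
    <cluster name="colo1A" type="target">
      <validity start="2014-07-01T00:00Z" end="2099-12-31T00:00Z"/>
      <retention limit="days(7)" action="delete"/>
    </cluster>
  </clusters>
</feed>

<!-- Sketch only: feed2 is a separate entity that replicates within colo 2. -->
<feed name="feed2" xmlns="uri:falcon:feed:0.1">
  <clusters>
    <cluster name="colo2ETL" type="source">
      <validity start="2014-07-01T00:00Z" end="2099-12-31T00:00Z"/>
      <retention limit="days(7)" action="delete"/>
    </cluster>
    <cluster name="colo2A" type="target">
      <validity start="2014-07-01T00:00Z" end="2099-12-31T00:00Z"/>
      <retention limit="days(7)" action="delete"/>
    </cluster>
  </clusters>
</feed>
```

Each feed then has exactly one source, so no partition expression should be needed and no data crosses colos.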

However, we are thinking along the lines of data discovery, in which a
critical dataset might be computed on different colos simultaneously for
both DR and load balancing purposes.  In this scenario, we would somehow
like the end users to know that feed1 and feed2 are logically the same
data, and they are free to pick one to use.

Just wondering whether it makes sense to support multiple sources and
multiple targets without specifying a partition (with the target cluster
perhaps having to specify the order of sources from which to copy).  Also, I am
guessing that this "multiple sources and multiple targets without
specifying partition" requirement must have come up before. What was
the thought process behind not supporting it in the end?

Thanks a lot!

2014-07-14 21:34 GMT-07:00 Satish Mittal <satish.mittal@inmobi.com>:

> Hi,
> Given that both ETL clusters are producing the same data-set independent of
> each other and the aim is to replicate the data-set within colo (to avoid
> any cross-colo data movement), you could simply have 2 instances of the
> same feed, one per colo:
> feed1:
> <cluster name="colo1ETL" type="source">
> <cluster name="colo1A" type="target">
> feed2:
> <cluster name="colo2ETL" type="source">
> <cluster name="colo2A" type="target">
> The 1st error was coming since multiple source replication was configured
> (which needs partition expressions to be specified). Also that
> configuration would have ended up moving data across colos, which is
> against your desired goal.
> Thanks,
> Satish
> On Mon, Jul 14, 2014 at 11:52 PM, John Yu <johnyu0520@gmail.com> wrote:
> > Hey all,
> >
> > We currently have the following use case:
> > Colo1 has 1 ETL cluster (Colo1-ETL) and 1 adhoc cluster (Colo1-A)
> > Colo2 has 1 ETL cluster (Colo2-ETL) and 1 adhoc cluster (Colo2-A)
> >
> > Due to the bandwidth constraint between the two colos, we are thinking of
> > having the 2 ETL clusters perform the same computation to generate the same
> > dataset, and have the 2 adhoc clusters pull from their respective
> > colo-local ETL cluster.
> >
> > What would be a good way to configure this feed?
> >
> > I've tried the following:
> > <cluster name="colo1ETL" type="source">
> > <cluster name="colo2ETL" type="source">
> > <cluster name="colo1A" type="target">
> > <cluster name="colo2A" type="target">
> > Error: Partition expression has to be specified for cluster colo1ETL as
> > there are more than one source clusters
> >
> > <cluster name="colo1ETL">
> > <cluster name="colo2ETL">
> > <cluster name="colo1A" type="target">
> > <cluster name="colo2A" type="target">
> > Error: Feed: pve-intermediate should have atleast one source cluster
> > defined
> >
> >
> > Thanks!
> >
> > John
> >

余守中  John Yu (Yu, Shoou-Jong)
Mobile: 650-691-3314
