From: John Yu
Date: Tue, 15 Jul 2014 12:30:51 -0700
Subject: Re: On configuring two source clusters due to colo requirement
To: dev@falcon.incubator.apache.org
Cc: Venkat R, Seetharam Venkatesh

Hey Satish,

Thanks for your reply!

I can see how setting it up that way would definitely work. It is probably also more correct technically, since data generated by different processes should be considered different.

However, we are thinking along the lines of data discovery, where a critical dataset might be computed in different colos simultaneously for both DR and load-balancing purposes. In that scenario, we would like end users to know that feed1 and feed2 are logically the same data, so that they are free to pick either one.

I am just wondering whether it makes sense to support multiple sources and multiple targets without specifying a partition (perhaps each target cluster would have to specify the order of sources from which to copy). I am also guessing that this "multiple sources and multiple targets without specifying partition" requirement must have come up before. What was the thought process behind not supporting it in the end?

Thanks a lot!

John

2014-07-14 21:34 GMT-07:00 Satish Mittal:

> Hi,
>
> Given that both ETL clusters are producing the same data-set independent
> of each other, and the aim is to replicate the data-set within a colo (to
> avoid any cross-colo data movement), you could simply have 2 instances of
> the same feed, one per colo:
>
> feed1:
> [feed definition missing from the archived message]
>
> feed2:
> [feed definition missing from the archived message]
>
> The first error occurred because multiple-source replication was
> configured (which needs partition expressions to be specified). That
> configuration would also have ended up moving data across colos, which is
> against your desired goal.
>
> Thanks,
> Satish
>
> On Mon, Jul 14, 2014 at 11:52 PM, John Yu wrote:
>
> > Hey all,
> >
> > We currently have the following use case:
> > Colo1 has 1 ETL cluster (Colo1-ETL) and 1 adhoc cluster (Colo1-A)
> > Colo2 has 1 ETL cluster (Colo2-ETL) and 1 adhoc cluster (Colo2-A)
> >
> > Due to the bandwidth constraint between the two colos, we are thinking
> > of having the 2 ETL clusters perform the same computation to generate
> > the same dataset, and having the 2 adhoc clusters pull from their
> > respective colo-local ETL cluster.
> >
> > What would be a good way to configure this feed?
> >
> > I've tried the following:
> >
> > [feed definition missing from the archived message]
> >
> > Error: Partition expression has to be specified for cluster colo1ETL as
> > there are more than one source clusters
> >
> > [feed definition missing from the archived message]
> >
> > Error: Feed: pve-intermediate should have atleast one source cluster
> > defined
> >
> > Thanks!
> >
> > John

-- 
余守中
John Yu (Yu, Shoou-Jong)
Mobile: 650-691-3314