From: Idris Ali
Date: Mon, 22 Dec 2014 01:52:04 +0530
Subject: Re: [DISCUSS] Orchestration in Falcon
To: dev@falcon.incubator.apache.org

Hi,

I agree that Oozie has certain limitations with respect to orchestration;
recently Oozie users have raised similar concerns regarding Point 2 (which
Falcon takes care of by extending Oozie EL, not for scheduling but at least
for consuming the input set).

Aren't Points 1, 2, 6 and 7 already solved by the optional input mechanism
in Falcon? I understand that users still need to specify a frequency for
the process.

A few use cases/examples would really help.

Thanks,
-Idris

On Sun, Dec 21, 2014 at 7:43 PM, Srikanth Sundarrajan wrote:

> Hello Team,
>
> Since its inception Falcon has used Oozie for process orchestration as
> well as feed life cycle phase executions. While this has worked reasonably
> and allowed us to make higher-level capabilities available through Falcon,
> we are increasingly seeing scenarios where this is proving to be a limiting
> factor. In its current form, Falcon relies on Oozie for both scheduling and
> workflow execution, due to which the scheduling is limited to time
> based/cron based scheduling with additional gating conditions on data
> availability. This also imposes the restriction that datasets be
> periodic/cyclic in nature.
>
> From an orchestration standpoint, it would help if we can support
> standard gating / scheduling primitives via Falcon:
>
> 1. Simple periodic scheduling with no gating conditions
> 2. Cron based scheduling (day of week, day of the month, specific hours
>    and non-periodic) with no gating conditions
> 3. Availability of new data (assuming monotonically increasing data
>    version, availability of new versions)
> 4. Changes to existing data (reinstatement - similar to late data handling)
> 5. External trigger/notifications
> 6. Availability of specific instances of data declared as a mandatory
>    dependency
> 7. Availability of a minimum subset of instances of data declared as a
>    mandatory dependency (at least 10 hourly instances of a day with 24
>    instances, for example)
> 8. Valid combinations of the above.
>
> In this context, I would like to propose that we move away from Oozie for
> the orchestration requirements and have them implemented natively within
> Falcon. It will no doubt make the Falcon server bulkier and heavier in both
> code and deployment, but it seems that without it, the orchestration within
> Falcon will be limited by the capabilities available within Oozie.
>
> Please do note that this suggestion is restricted to the scheduling and
> not to the workflow execution.
>
> I would like to hear from fellow developers and users on what your thoughts
> are. Please do chime in with your views.
>
> Regards
> Srikanth Sundarrajan
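
To make item 7 of the quoted list concrete (at least 10 of a day's 24 hourly instances), here is a minimal Python sketch of such a gating condition. The names `MinInstancesGate` and `hourly_instances` are hypothetical and are not part of Falcon or Oozie; the sketch only shows the shape of the check, under the assumption that data instances can be identified by their nominal timestamps.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class MinInstancesGate:
    """Hypothetical gate: opens once at least `minimum` expected instances exist."""

    expected: FrozenSet[datetime]
    minimum: int

    def is_open(self, available: Iterable[datetime]) -> bool:
        # Only instances that belong to the expected set count towards the minimum.
        present = self.expected & set(available)
        return len(present) >= self.minimum


def hourly_instances(day: datetime) -> FrozenSet[datetime]:
    """Return the 24 hourly instance timestamps of the given day."""
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    return frozenset(start + timedelta(hours=h) for h in range(24))


# Item 7 from the list: at least 10 hourly instances of a day with 24 instances.
gate = MinInstancesGate(expected=hourly_instances(datetime(2014, 12, 21)), minimum=10)
```

A scheduler could evaluate `gate.is_open(...)` each time a new instance arrives and trigger the process on the first `True`. Item 6 (specific mandatory instances) would be the special case where `minimum` equals the size of `expected`.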