From: Robert Metzger
Date: Thu, 18 Sep 2014 10:23:28 +0200
Subject: Re: Job scheduling
To: user@flink.incubator.apache.org

I don't think that we have a suggested way.
If I had that requirement, I would look into Oozie. I think it's quite easy
to add additional services (i.e., Flink) to Oozie. In addition, it seems to
have a REST interface and some other stuff.

If you want, you could also implement a scheduler yourself and contribute it
back to Flink.

On Thu, Sep 18, 2014 at 10:11 AM, Flavio Pompermaier <pompermaier@okkam.it> wrote:

> Yes, I was referring exactly to that. I was also involved in the Dopa
> project :)
> So, at the moment, what is the suggested way to schedule jobs with Flink?
>
> On Thu, Sep 18, 2014 at 9:48 AM, Robert Metzger <rmetzger@apache.org> wrote:
>
>> Are you referring to this project?
>> https://github.com/TU-Berlin/dopa-scheduler
>> It's not an official repository of the Flink (Stratosphere) project. I
>> think a PhD student at TU Berlin created the code there.
>>
>> On Thu, Sep 11, 2014 at 4:29 PM, Flavio Pompermaier <pompermaier@okkam.it> wrote:
>>
>>> Of course, with Flink I could in principle execute almost everything with
>>> a single job but, in general, I could write 2 different jobs and decide
>>> from time to time when the second should be run.
>>> That's also why Meteor scripts are very useful :)
>>> From what I know, there was a scheduler in Stratosphere that was using
>>> RabbitMQ, right?
>>>
>>> I would like to avoid running Linux commands and instead use some REST
>>> interface to trigger or schedule jobs.
>>>
>>> Best,
>>> Flavio
>>>
>>> On Thu, Sep 11, 2014 at 4:07 PM, Fabian Hueske <fhueske@apache.org> wrote:
>>>
>>>> Hi Flavio,
>>>>
>>>> what exactly do you mean by scheduling?
>>>> Do you want to run a job at regular intervals or execute a complex
>>>> workflow?
>>>>
>>>> Oozie is primarily used to orchestrate the execution of MapReduce
>>>> workflows. Since MR is a rather inflexible programming model, complex
>>>> tasks need to be split up into multiple dependent jobs that are executed
>>>> once their predecessors have finished. Oozie orchestrates this execution.
>>>> In Flink, you can build a complex analysis flow as a single program and
>>>> execute it. Hence, there is no need for a workflow scheduler such as Oozie.
>>>>
>>>> If you want to run a job at regular intervals, you can configure a cron
>>>> job that executes the CLI client, or implement a Java or Scala program
>>>> that submits jobs at certain points in time.
>>>>
>>>> Best, Fabian
>>>>
>>>> 2014-09-11 15:36 GMT+02:00 Flavio Pompermaier <pompermaier@okkam.it>:
>>>>
>>>>> Hi to all,
>>>>>
>>>>> I'd like to know if there's an example of how to schedule a job in
>>>>> Flink.
>>>>> Do we still need something like Oozie or Quartz, or can we avoid them?
>>>>>
>>>>> Best,
>>>>> Flavio
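
Fabian's second suggestion in the thread above, a small program that submits jobs at certain points in time, could be sketched roughly as follows. This is only an illustration under assumptions, not a recommended setup: it is written in Python rather than Java or Scala for brevity, it simply shells out to the CLI client's `run` action, and the paths in `FLINK_BIN` and `JOB_JAR` as well as the helper names `build_command` and `submit_periodically` are hypothetical.

```python
import subprocess
import time
from typing import Callable, List, Sequence

# Hypothetical locations (assumptions); adjust to your installation and job.
FLINK_BIN = "/opt/flink/bin/flink"
JOB_JAR = "/opt/jobs/my-job.jar"


def build_command(flink_bin: str, jar: str, job_args: Sequence[str] = ()) -> List[str]:
    """Build the argument list for submitting a jar via the CLI's `run` action."""
    return [flink_bin, "run", jar, *job_args]


def submit_periodically(
    command: Sequence[str],
    interval_seconds: float,
    max_runs: int,
    runner: Callable[[Sequence[str]], int] = subprocess.call,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Run `command` max_runs times, waiting interval_seconds between runs.

    Returns the exit code of every submission so failures can be inspected.
    `runner` and `sleep` are injectable so the loop can be tested without
    a Flink installation.
    """
    exit_codes = []
    for run in range(max_runs):
        exit_codes.append(runner(command))
        if run < max_runs - 1:  # no need to wait after the final run
            sleep(interval_seconds)
    return exit_codes


# Example (not executed here): submit the job once per hour, 24 times.
# submit_periodically(build_command(FLINK_BIN, JOB_JAR), 3600, 24)
```

The cron alternative from the same message would need no program at all: a crontab entry such as `0 * * * * /opt/flink/bin/flink run /opt/jobs/my-job.jar` (again with hypothetical paths) would invoke the CLI client once per hour.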