airflow-dev mailing list archives

From Ash Berlin-Taylor <...@apache.org>
Subject Re: Custom scheduler support in Airflow
Date Thu, 31 Jan 2019 21:44:18 GMT
>  wouldn't it be easy if we have some custom scheduler support in Airflow

Don't underestimate JUST how much work this would actually involve.

Right now, given the solutions presented and the ability to trigger DAGs in Airflow via the
existing API, I am not convinced that Airflow needs the added complexity this change would
involve.

-ash
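
For reference, triggering a DAG through the existing API might look roughly
like the sketch below. It assumes the experimental REST endpoint shipped with
Airflow 1.10 (POST /api/experimental/dags/<dag_id>/dag_runs). The base URL,
the DAG id, the event payload, and the helper names are hypothetical, and
authentication is left out. How `conf` must be encoded may differ between
releases, so check it against the deployed version.

```python
import json
import urllib.request


def build_trigger_request(base_url, dag_id, conf=None):
    """Build (but do not send) a POST request asking Airflow to start a DAG run."""
    # Endpoint path assumes Airflow 1.10's experimental REST API.
    url = "{}/api/experimental/dags/{}/dag_runs".format(base_url.rstrip("/"), dag_id)
    payload = {}
    if conf is not None:
        # Optional run configuration, e.g. details of the triggering event.
        payload["conf"] = conf
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_dag(base_url, dag_id, conf=None, opener=urllib.request.urlopen):
    """Send the trigger request; `opener` is injectable so it can be stubbed in tests."""
    request = build_trigger_request(base_url, dag_id, conf)
    with opener(request) as response:
        return response.status
```

An external event listener would call something like
trigger_dag("http://localhost:8080", "process_upload", {"path": "/data/in.csv"})
each time it sees an event, with the target DAG declared with a None schedule.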


> On 31 Jan 2019, at 21:04, abhishek sharma <abhioncbr.apache@gmail.com> wrote:
> 
> Thanks, Brian & Ben.
> 
> So, you guys also have such workflows, and through Sensors or running DAGs
> frequently things are working out for you guys. In my case, I am running
> an application which works as a 'custom scheduler' and triggers DAGs based
> on event occurrence.
> 
> Question to you guys, wouldn't it be easy if we have some custom scheduler
> support in Airflow? Also, I think that would open more possibilities for
> scheduling  DAGs.
> 
> Thanks, {{Abhishek}}
> 
> On Thu, Jan 31, 2019 at 3:25 PM Ben Tallman <btallman@gmail.com> wrote:
> 
>> To solve that exact problem, we ran a DAG on a frequent schedule that
>> basically acted as a scheduler. It used a shell script to kick off other
>> DAGs. Possibly a custom scheduler would be a more elegant solution.
>> 
>> Thanks,
>> Ben
>> 
>> --
>> Ben Tallman - 503.680.5709
>> 
>> 
>> On Thu, Jan 31, 2019 at 11:03 AM abhishek sharma <
>> abhioncbr.apache@gmail.com>
>> wrote:
>> 
>>> Hi Ben,
>>> 
>>> Just copying my comment from the ticket.
>>>
>>> I think the current Airflow scheduler schedules DAGs only on a time basis
>>> (based on a cron schedule string). *Is that a correct understanding?*
>>>
>>> How should one approach a scenario where I want to trigger a DAG based on
>>> some event which is not so predictable/regular on a time basis?
>>> 
>>>   - One is to use a sensor: the DAG flow starts by running a sensor
>>>   task that checks for an event, and the actual processing starts
>>>   once the event has happened.
>>>   - The second is to have a DAG with a None schedule that gets triggered
>>>   by some other application or utility which checks for an event
>>>   occurrence.
>>>
>>> If most of your flows (DAGs) are supposed to run on this logic, then a
>>> sensor doesn't make sense, which leaves only the second approach,
>>> which is nothing but custom scheduling of DAGs. *Is this an agreeable
>>> use case for the custom scheduler?*
>>> 
>>> Thanks, {{Abhishek}}
>>> 
>>> 
>>> 
>>> On Thu, Jan 31, 2019 at 11:53 AM Ben Tallman <btallman@gmail.com> wrote:
>>> 
>>>> Can you explain a bit more what you are thinking for a custom scheduler?
>>>> It's been awhile, but we added support for cron schedules without
>>>> backfill awhile back, so I'm wondering what you are thinking of adding
>>>> with this?
>>>> 
>>>> Thanks,
>>>> Ben
>>>> 
>>>> --
>>>> Ben Tallman - 503.680.5709
>>>> 
>>>> 
>>>> On Thu, Jan 31, 2019 at 8:29 AM abhishek sharma <
>>>> abhioncbr.apache@gmail.com>
>>>> wrote:
>>>> 
>>>>> Hi All,
>>>>> 
>>>>> I created a ticket (https://issues.apache.org/jira/browse/AIRFLOW-3775)
>>>>> for supporting a custom scheduler in Airflow.
>>>>>
>>>>> The idea is to have a scheduler base class which can be extended for
>>>>> writing a custom scheduler. The logic of custom scheduling is user
>>>>> specific, and at the DAG's task level we can mention the scheduler
>>>>> type, and that scheduler will be used for starting a task. [Naive Idea]
>>>>>
>>>>> Can we please discuss whether we need such functionality in Airflow
>>>>> or not? If yes, then we will proceed with the design and implementation.
>>>>> 
>>>>> Thanks
>>>>> Abhishek Sharma
>>>>> 
>>>> 
>>> 
>> 
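
To make the original proposal in this thread concrete, a scheduler base class
that users extend could be sketched as below. This is purely illustrative:
none of these classes or functions exist in Airflow, and the event shape (a
dict with "type" and "path" keys) and the DAG ids are assumptions made for
the example.

```python
from abc import ABC, abstractmethod


class BaseEventScheduler(ABC):
    """Hypothetical base class: decides whether an event should start a DAG run."""

    @abstractmethod
    def should_trigger(self, event):
        """Return True when `event` warrants a new DAG run."""


class FileArrivalScheduler(BaseEventScheduler):
    """Example subclass: fires when a file lands under a watched path prefix."""

    def __init__(self, prefix):
        self.prefix = prefix

    def should_trigger(self, event):
        return (
            event.get("type") == "file_arrived"
            and event.get("path", "").startswith(self.prefix)
        )


def dags_to_trigger(dag_schedulers, event):
    """Return the ids of DAGs whose declared scheduler accepts `event`."""
    return [
        dag_id
        for dag_id, scheduler in dag_schedulers.items()
        if scheduler.should_trigger(event)
    ]
```

Here each DAG names the scheduler it wants via the `dag_schedulers` mapping,
and the ids returned by dags_to_trigger() would then be started by whatever
trigger mechanism is available.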

