airflow-dev mailing list archives

From "Maycock, Luke" <luke.mayc...@affiliate.oliverwyman.com>
Subject Re: Skip task
Date Thu, 10 Nov 2016 13:20:28 GMT
Hi Gerard,


I see the new status as having a number of uses:

 1.  A user can manually set a task to skip in a DAG run via the UI.
 2.  We can then make use of this new status to add the following functionality to Airflow:
    *   Run a DAG run up to a certain point and have the rest of the tasks have the new status.
    *   Run a DAG run from a certain task to the end, setting all prerequisite tasks to
        this new status.

I am happy to be challenged on the above use cases if there are better ways to achieve the
same things.
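
A minimal sketch of how the two "run up to" and "run from" cases could pick the tasks that receive the new status. It is plain Python and not Airflow code: the toy DAG, the task names and both function names are hypothetical, and it assumes "the rest of the tasks" means everything downstream of the stopping point.

```python
from collections import deque

# Hypothetical toy DAG: task id -> list of directly downstream task ids.
DAG = {
    "extract": ["clean"],
    "clean": ["load"],
    "load": ["report"],
    "report": [],
}


def reachable(graph, start):
    """Return every task reachable from `start`, excluding `start` itself."""
    seen, queue = set(), deque(graph[start])
    while queue:
        task = queue.popleft()
        if task not in seen:
            seen.add(task)
            queue.extend(graph[task])
    return seen


def invert(graph):
    """Flip the edges so each task maps to its directly upstream tasks."""
    upstream = {task: [] for task in graph}
    for task, children in graph.items():
        for child in children:
            upstream[child].append(task)
    return upstream


def disabled_when_running_up_to(graph, last_task):
    """Tasks that would get the new status if the run stops after `last_task`."""
    return reachable(graph, last_task)


def disabled_when_running_from(graph, first_task):
    """Tasks that would get the new status if the run starts at `first_task`."""
    return reachable(invert(graph), first_task)
```

With this toy DAG, running up to "clean" would mark "load" and "report", and running from "load" would mark "extract" and "clean".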

Cheers,
Luke Maycock
OLIVER WYMAN
luke.maycock@affiliate.oliverwyman.com<mailto:luke.maycock@affiliate.oliverwyman.com>
www.oliverwyman.com<http://www.oliverwyman.com/>



________________________________
From: Gerard Toonstra <gtoonstra@gmail.com>
Sent: 09 November 2016 18:08
To: dev@airflow.incubator.apache.org
Subject: Re: Skip task

Hey Luke,

Who or what makes the decision to skip processing that task?

Rgds,

Gerard

On Wed, Nov 9, 2016 at 2:39 PM, Maycock, Luke <
luke.maycock@affiliate.oliverwyman.com> wrote:

> Hi Gerard,
>
>
> Thank you for your quick response.
>
>
> I am not trying to implement this for a specific operator but rather
> trying to add it as a feature for any task in any DAG.
>
>
> Given that the skipped states propagate where all directly upstream tasks
> are skipped, I don't think this is the state we want to use. For the
> functionality I'm looking for, I think I'll need to introduce a new status,
> maybe 'disabled'.
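
The propagation rule described above can be modelled in a few lines. This is only a sketch of the rule as stated in this message (a task becomes skipped once all of its directly upstream tasks are skipped); it is plain Python, not Airflow's scheduler code, and the function name and task names are hypothetical.

```python
def propagate_skips(upstream, skipped):
    """Return the final skipped set after propagation.

    upstream: dict mapping each task id to its directly upstream task ids.
    skipped:  iterable of task ids that start out skipped.
    """
    skipped = set(skipped)
    changed = True
    while changed:  # repeat until no further task becomes skipped
        changed = False
        for task, parents in upstream.items():
            # Root tasks (no parents) are never skipped by propagation.
            if task not in skipped and parents and all(p in skipped for p in parents):
                skipped.add(task)
                changed = True
    return skipped


# Hypothetical DAG: a -> b -> c -> join, and x -> join.
UPSTREAM = {"a": [], "b": ["a"], "c": ["b"], "x": [], "join": ["c", "x"]}
```

Skipping only "b" also skips "c", but "join" still runs because "x" was not skipped. That cascade is why a manually skipped task would take its whole downstream chain with it, and why a separate status is being considered.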
>
>
> Again, thanks for your response.
>
>
> Cheers,
> Luke Maycock
> OLIVER WYMAN
> luke.maycock@affiliate.oliverwyman.com<mailto:luke.maycock@affiliate.oliverwyman.com>
> www.oliverwyman.com<http://www.oliverwyman.com/>
>
>
>
> ________________________________
> From: Gerard Toonstra <gtoonstra@gmail.com>
> Sent: 08 November 2016 18:19
> To: dev@airflow.incubator.apache.org
> Subject: Re: Skip task
>
> Also in 1.7.1.3, there's the ShortCircuitOperator, which can give you an
> example.
>
> https://github.com/apache/incubator-airflow/blob/1.7.1.3/airflow/operators/python_operator.py
>
> You'd have to modify this to your needs, but the way it works is that if
> the condition evaluates to False, none of the downstream tasks are actually
> executed; they are skipped instead. The reason for putting them into the
> SKIPPED state is that the final DAG result would still be SUCCESS rather
> than FAILED.
>
> You could copy the operator from there and, instead of running the full
> "for loop", pick only the tasks immediately downstream of this operator
> and skip those. Or, if you need to skip additional tasks further
> downstream, add a parameter "num_tasks" that sets a halting condition
> for the for loop.
>
> I believe that should work. I didn't try that here, but you can test that
> and see what it does for you.
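
A rough sketch of the halting idea, written as plain Python over a toy graph rather than as a real operator. In the 1.7.1.3 source the ShortCircuitOperator skips downstream tasks when its callable returns a falsy value, and this sketch follows that. Everything else is an assumption: the graph, the function names, and reading "num_tasks" as the number of downstream levels to skip (None meaning all of them).

```python
def downstream_to_skip(graph, task_id, num_tasks=None):
    """Collect downstream tasks level by level, stopping after `num_tasks` levels."""
    to_skip, frontier, depth = [], [task_id], 0
    seen = {task_id}
    while frontier and (num_tasks is None or depth < num_tasks):
        next_frontier = []
        for current in frontier:
            for child in graph[current]:
                if child not in seen:  # avoid listing a shared child twice
                    seen.add(child)
                    to_skip.append(child)
                    next_frontier.append(child)
        frontier = next_frontier
        depth += 1
    return to_skip


def short_circuit(condition, graph, task_id, num_tasks=None):
    """Return the tasks to mark as skipped; nothing is skipped if `condition` is truthy."""
    if condition:
        return []
    return downstream_to_skip(graph, task_id, num_tasks)


# Hypothetical linear DAG: check -> a -> b -> c.
GRAPH = {"check": ["a"], "a": ["b"], "b": ["c"], "c": []}
```

With num_tasks=1 only "a" would be marked; with num_tasks=None the whole chain would be. In a real operator the returned task ids would still have to be written to the metadata database as SKIPPED task instances, which this sketch leaves out.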
>
>
> If you want this as a UI capability, for example to have a human operator
> decide whether or not to skip a task, then maybe the best way forward would
> be some kind of highly custom plugin with its own view. In the end, you'd
> basically perform the same action in the backend, whether the Python
> condition evaluates to False or the button is clicked.
>
> In the plugin case though, you'd have to keep the UI and the structure of
> the DAG in sync and aligned, otherwise it'd become a mess. Airflow wasn't
> really developed for workflows with human interaction, but for workflows
> where only automated processes are involved. That doesn't mean that you
> can't do anything like that, but it may be costly resource-wise to get
> this done. For example, on the basis of the BranchPythonOperator, you could
> call an external API to verify whether a decision was taken on a case, then
> follow branch A or B if the decision is there, or put the state back into
> UP_FOR_RETRY if it is not. At the moment though, there's no programmatic
> way to reschedule that task to some minutes or hours into the future
> before it's looked at again, unless you really dive into Airflow, its
> scheduling semantics (@once vs. other schedules) and how the scheduler
> works.
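
The branching idea could look roughly like the callable below. A BranchPythonOperator expects its python_callable to return the task_id of the branch to follow; raising an exception instead makes the task fail, and with retries configured on the task it would then go to UP_FOR_RETRY. The decision values, the task ids "branch_a" and "branch_b", the exception class and the fetch_decision helper are all hypothetical stand-ins for whatever the external API provides.

```python
class DecisionPending(Exception):
    """Raised when no decision exists yet, so the task fails and can be retried."""


def choose_branch(fetch_decision, case_id):
    """Return the task_id of the branch to follow for `case_id`.

    fetch_decision is a hypothetical callable wrapping the external API;
    it is assumed to return "approve", "reject", or None if undecided.
    """
    decision = fetch_decision(case_id)
    if decision is None:
        # No decision yet: fail this attempt instead of picking a branch.
        raise DecisionPending("no decision yet for case %s" % case_id)
    return "branch_a" if decision == "approve" else "branch_b"
```

Passing the API call in as an argument keeps the branching logic testable without a network connection; in a DAG it would be wrapped in a small function given to the operator as python_callable.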
>
> Rgds,
>
> Gerard
>
>
>
>
> On Tue, Nov 8, 2016 at 5:30 PM, Maycock, Luke <
> luke.maycock@affiliate.oliverwyman.com> wrote:
>
> > Hi All,
> >
> >
> > I am using Airflow 1.7.1.3 and have a particular requirement, which I
> > don't think is currently supported by Airflow but just wanted to check in
> > case I was missing something.
> >
> >
> > I occasionally wish to skip a particular task in a given DAG run such
> > that the task does not run for that DAG run. Is this functionality
> > available in Airflow?
> >
> >
> > I am aware of the BranchPythonOperator
> > (https://airflow.incubator.apache.org/concepts.html#branching) but I don't
> > believe this is exactly what I am looking for.
> >
> >
> > I am thinking that a button in the UI alongside the 'Mark Success' and
> > 'Run' buttons would be appropriate.
> >
> >
> > If the functionality does not exist, does anyone have any suggestions on
> > ways to implement this?
> >
> >
> > Cheers,
> > Luke Maycock
> > OLIVER WYMAN
> > luke.maycock@affiliate.oliverwyman.com<mailto:luke.maycock@affiliate.oliverwyman.com>
> > www.oliverwyman.com<http://www.oliverwyman.com/>
> >
> >
> > ________________________________
> > This e-mail and any attachments may be confidential or legally
> > privileged. If you received this message in error or are not the
> > intended recipient, you should destroy the e-mail message and any
> > attachments or copies, and you are prohibited from retaining,
> > distributing, disclosing or using any information contained herein.
> > Please inform us of the erroneous delivery by return e-mail. Thank you
> > for your cooperation.
> >
>
>
