airflow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Maycock, Luke" <luke.mayc...@affiliate.oliverwyman.com>
Subject Re: Skip task
Date Tue, 15 Nov 2016 10:50:08 GMT
Thank you for taking the time to respond. This is a great approach if you know at the time
of creating the DAG which tasks you expect to need to skip. However, I don't think this is
exactly the use case I have. For example, I may be expecting a file to arrive in an FTP folder
for loading into a database but one day it doesn't arrive so I just want to skip that task
on that day.


Our workflows commonly have around 20 of these types of tasks in. I could configure all of
these tasks in the way you suggested in case I ever need to skip one of them. However, I'd
prefer not to have to set the tasks up this way and instead have the ability just to skip
a task on an ad-hoc basis. I could then also use this functionality to add the ability to
run from a certain point in a DAG or to a certain point in the DAG.



Thanks,
Luke Maycock
OLIVER WYMAN
luke.maycock@affiliate.oliverwyman.com<mailto:luke.maycock@affiliate.oliverwyman.com>
www.oliverwyman.com<http://www.oliverwyman.com/>



________________________________
From: siddharth anand <sanand@apache.org>
Sent: 14 November 2016 19:48
To: dev@airflow.incubator.apache.org
Subject: Re: Skip task

For cases like this, we (Agari) use the following approach :

  1. Create a Variable in the UI of type boolean such as *enable_feature_x*
  2. Use a ShortCircuitOperator (or BranchPythonOperator) to Skip
  downstream processing based on the value of *enable_feature_x*
  3. Assuming that you don't want to skip ALL downstream tasks, you can
  use a trigger_rule of all_done to resume processing some portion of your
  downstream DAG after skipping an upstream portion

In other words, there is already a means to achieve what you are asking for
today. You can change the value of via *enable_feature_x  *the UI. If you'd
like to enhance the UI to better capture this pattern, pls submit a PR.
-s

On Thu, Nov 10, 2016 at 1:20 PM, Maycock, Luke <
luke.maycock@affiliate.oliverwyman.com> wrote:

> Hi Gerard,
>
>
> I see the new status as having a number of uses:
>
>  1.  A user can manually set a task to skip in a DAG run via the UI.
>  2.  We can then make use of this new status to add the following
> functionality to Airflow:
>     *   Run a DAG run up to a certain point and have the rest of the tasks
> have the new status.
>     *   Run a DAG run from a certain task to the end, setting all
> pre-requisite tasks to have this new status.
>
> I am happy to be challenged on the above use cases if there are better
> ways to achieve the same things.
>
> Cheers,
> Luke Maycock
> OLIVER WYMAN
> luke.maycock@affiliate.oliverwyman.com<mailto:luke.
> maycock@affiliate.oliverwyman.com>
> www.oliverwyman.com<http://www.oliverwyman.com/>
>
>
>
> ________________________________
> From: Gerard Toonstra <gtoonstra@gmail.com>
> Sent: 09 November 2016 18:08
> To: dev@airflow.incubator.apache.org
> Subject: Re: Skip task
>
> Hey Luke,
>
> Who or what makes the decision to skip processing that task?
>
> Rgds,
>
> Gerard
>
> On Wed, Nov 9, 2016 at 2:39 PM, Maycock, Luke <
> luke.maycock@affiliate.oliverwyman.com> wrote:
>
> > Hi Gerard,
> >
> >
> > Thank you for your quick response.
> >
> >
> > I am not trying to implement this for a specific operator but rather
> > trying to add it as a feature for any task in any DAG.
> >
> >
> > Given that the skipped states propagate where all directly upstream tasks
> > are skipped, I don't think this is the state we want to use. For the
> > functionality I'm looking for, I think I'll need to introduce a new
> status,
> > maybe 'disabled'.
> >
> >
> > Again, thanks for your response.
> >
> >
> > Cheers,
> > Luke Maycock
> > OLIVER WYMAN
> > luke.maycock@affiliate.oliverwyman.com<mailto:luke.
> > maycock@affiliate.oliverwyman.com>
> > www.oliverwyman.com<http://www.oliverwyman.com/>
> >
> >
> >
> > ________________________________
> > From: Gerard Toonstra <gtoonstra@gmail.com>
> > Sent: 08 November 2016 18:19
> > To: dev@airflow.incubator.apache.org
> > Subject: Re: Skip task
> >
> > Also in 1.7.1.3, there's the ShortCircuitOperator, which can give you an
> > example.
> >
> > https://github.com/apache/incubator-airflow/blob/1.7.1.
> > 3/airflow/operators/python_operator.py
> >
> > You'd have to modify this to your needs, but the way it works is that if
> > the condition evaluates to True, none of the
> > downstream tasks are actually executed, they'd be skipped. The reason for
> > putting them into SKIPPED state is that
> > the DAG final result would still be SUCCESS and not failed.
> >
> > You could copy the operator from there and don't do the full "for loop",
> > only pick the tasks immediately downstream
> > from this operator and skip that. Or... if you need to skip additional
> > tasks downstream, add a parameter "num_tasks"
> > that decide on a halting condition for the for loop.
> >
> > I believe that should work. I didn't try that here, but you can test that
> > and see what it does for you.
> >
> >
> > If you want this as a UI capability... for example have a human operator
> > decide on skipping this yes or not, then
> > maybe the best way forward would be some kind of highly custom plugin
> with
> > its own view. In the end, you'd basically
> > do the same action in the backend, whether the python cond evaluates to
> > True or the button is clicked.
> >
> > In the plugin case though, you'd have to keep the UI and the structure of
> > the DAG in sync and aligned, otherwise
> > it'd become a mess.... Airflow wasn't really developed for workflow/human
> > interaction, but in workflows where only
> > automated processes are involved. That doesn't mean that you can't do
> > anything like that, but it may be costly resource
> > wise to get this done. For example, on the basis of the BranchOperator,
> you
> > could call an external API to verify if a decision
> > was taken on a case, then follow branch A or B if the decision is there
> or
> > put the state back into UP_FOR_RETRY.
> > At the moment though, there's no programmatic way to reschedule that task
> > to some minutes or hours into the future before
> > it's looked at again, unless you really dive into airflow, scheduling
> > semantics (@once vs. other schedules) and how
> > the scheduler works.
> >
> > Rgds,
> >
> > Gerard
> >
> >
> >
> >
> > On Tue, Nov 8, 2016 at 5:30 PM, Maycock, Luke <
> > luke.maycock@affiliate.oliverwyman.com> wrote:
> >
> > > Hi All,
> > >
> > >
> > > I am using Airflow 1.7.1.3 and have a particular requirement, which I
> > > don't think is currently supported by Airflow but just wanted to check
> in
> > > case I was missing something.
> > >
> > >
> > > I occasionally wish to skip a particular task in a given DAG run such
> > that
> > > the task does not run for that DAG run. Is this functionality available
> > in
> > > Airflow?
> > >
> > >
> > > I am aware of the BranchPythonOperator (https://airflow.incubator.
> > > apache.org/concepts.html#branching) but I don't think believe this is
> > > exactly what I am looking for.
> > >
> > >
> > > I am thinking that a button in the UI alongside the 'Mark Success' and
> > > 'Run' buttons would be appropriate.
> > >
> > >
> > > If the functionality does not exist, does anyone have any suggestions
> on
> > > ways to implement this?
> > >
> > >
> > > Cheers,
> > > Luke Maycock
> > > OLIVER WYMAN
> > > luke.maycock@affiliate.oliverwyman.com<mailto:luke.
> > > maycock@affiliate.oliverwyman.com>
> > > www.oliverwyman.com<http://www.oliverwyman.com/>
> > >
> > >
> > > ________________________________
> > > This e-mail and any attachments may be confidential or legally
> > privileged.
> > > If you received this message in error or are not the intended
> recipient,
> > > you should destroy the e-mail message and any attachments or copies,
> and
> > > you are prohibited from retaining, distributing, disclosing or using
> any
> > > information contained herein. Please inform us of the erroneous
> delivery
> > by
> > > return e-mail. Thank you for your cooperation.
> > >
> >
> > ________________________________
> > This e-mail and any attachments may be confidential or legally
> privileged.
> > If you received this message in error or are not the intended recipient,
> > you should destroy the e-mail message and any attachments or copies, and
> > you are prohibited from retaining, distributing, disclosing or using any
> > information contained herein. Please inform us of the erroneous delivery
> by
> > return e-mail. Thank you for your cooperation.
> >
>
> ________________________________
> This e-mail and any attachments may be confidential or legally privileged.
> If you received this message in error or are not the intended recipient,
> you should destroy the e-mail message and any attachments or copies, and
> you are prohibited from retaining, distributing, disclosing or using any
> information contained herein. Please inform us of the erroneous delivery by
> return e-mail. Thank you for your cooperation.
>

________________________________
This e-mail and any attachments may be confidential or legally privileged. If you received
this message in error or are not the intended recipient, you should destroy the e-mail message
and any attachments or copies, and you are prohibited from retaining, distributing, disclosing
or using any information contained herein. Please inform us of the erroneous delivery by return
e-mail. Thank you for your cooperation.
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message