airflow-dev mailing list archives

From Gerard Toonstra <gtoons...@gmail.com>
Subject Re: Airflow for business process workflows
Date Fri, 09 Sep 2016 05:45:23 GMT
Dinesh,

Interesting use case. I'm not sure how this will work out for you in the
long run compared to a specialized workflow tool, but here are some
considerations to weigh when evaluating your chances of success:

A complex business workflow will at some point require more complex input
from a user than a simple decision. Airflow has no UI for that, so
something else has to handle these 'cases', gather the input and merge it
into the main workflow.

If you allow users to run individual tasks as you describe and to
delete/remove DAGs, it may become a headache pretty soon, not least
because these DAGs are so short-lived.


Consider working with 'processing lists' for such simple tasks. For
example, set up a Google Sheet with pages where people can enter device
IDs, then set up a single task that reads from that sheet and handles all
TBD devices in one go. What you gain from this approach is that if people
need to go back to a specific device, they don't have to wade through a
complex interface: they just find that line and date in the sheet and
delete or modify the line. With per-device DAGs, there will undoubtedly be
cases where a DAG doesn't get deleted and admins need to jump in to solve
issues. With a Google Sheet you also get rudimentary access control, where
people can view things but not edit them, and dropdown lists (validation)
for particular fields of interest.
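
To make the 'processing list' idea concrete, here is a minimal sketch of
the single task's logic. It assumes the sheet has been exported as CSV and
uses a made-up layout (device_id, activate_on, status); the column names,
the sample rows and the activate callback are all hypothetical. In Airflow,
something like process_all could be the callable behind one PythonOperator
in a regularly scheduled DAG.

```python
import csv
import io
from datetime import date

# Hypothetical sheet layout, exported as CSV: one row per device.
SAMPLE_SHEET = """device_id,activate_on,status
dev-001,2016-09-09,TBD
dev-002,2016-09-12,TBD
dev-003,2016-09-08,DONE
"""


def pending_devices(csv_text, today):
    """Return ids of devices still marked TBD and due on or before `today`."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        row["device_id"]
        for row in rows
        if row["status"].strip().upper() == "TBD"
        and date.fromisoformat(row["activate_on"]) <= today
    ]


def process_all(csv_text, today, activate):
    """Call `activate` for every due device in one pass; return handled ids."""
    handled = []
    for device_id in pending_devices(csv_text, today):
        activate(device_id)  # hypothetical callback doing the real work
        handled.append(device_id)
    return handled
```

Rescheduling or cancelling a device then means editing or deleting its row
before the task next runs; no DAG file has to be created or removed.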

For more complex workflows, you could look at Google Forms (the survey
tool). It lets you pass parameters in the URL that pre-populate particular
fields in the "survey" (your device ID). The questions are then filled in,
and the "response" to that survey goes to another Google Sheet. From
there, you should be able to point Airflow at that response sheet, pick up
surveys that have not yet been processed and take more complex actions.
The benefit of this approach is that you keep all history in one place in
a format that's easy to read. Through Airflow, you can then generate URLs
for the devices to be handled and email them to particular people.
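
As a rough sketch of the two Airflow-side steps (building a pre-filled
link per device, and picking out responses not yet handled), something
like the following could work. The form URL, the "entry.<id>" field name
and the response_id key are placeholders I made up; the real values come
from your own form and response sheet.

```python
from urllib.parse import urlencode

# Hypothetical placeholders: substitute the real form URL and field id.
FORM_URL = "https://docs.google.com/forms/d/e/FORM_ID/viewform"
DEVICE_FIELD = "entry.1234567890"


def prefilled_url(device_id, form_url=FORM_URL, field=DEVICE_FIELD):
    """Build a form link with the device id passed as a URL parameter."""
    return form_url + "?" + urlencode({field: device_id})


def unprocessed(responses, processed_ids):
    """Keep only responses whose (assumed) response_id was not handled yet."""
    return [r for r in responses if r["response_id"] not in processed_ids]
```

A task could call prefilled_url for each device that needs attention and
put the links in an email, while a second task runs unprocessed over the
rows read from the response sheet.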

Rgds,

Gerard



On Thu, Sep 8, 2016 at 6:12 PM, Dinesh Sharma <dsharma@bandwidthx.com>
wrote:

> Hi All,
> I'm with BandwidthX, a wireless tech company in San Diego.
> We're trying to have one workflow tool that can be used for both business
> process workflows as well as data pipelines. I think Airflow can do that. I
> also think that it will be a good case study for Airflow given that I see
> people using it primarily for data pipelines.
>
> We're starting with the business process workflows first wherein a user
> action can lead to the scheduling of one-time tasks e.g. activate a
> particular device on a particular day/time. This task may or may not have
> dependencies. A subsequent user action could potentially change the date
> time of the scheduled task or could potentially cancel the already
> scheduled task.
>
> I think Airflow can do that with *schedule_interval=once* and
> *start_date=scheduled_date_time*; ideally if they can be passed in as
> command line parameters. I made it work by writing a python script that
> takes these params and generates the script with supplied start_date for
> the DAG and puts that script in DAGs folder. I also added a dependent
> cleanup task to this script that actually deletes the .py and .pyc files of
> the dynamically generated DAG.
>
> Is there a better way to do it? Any resource that you can point me to?
>
> PS
> I'm already part of https://gitter.im/apache/incubator-airflow.
>
> Thanks
>
> --
> Dinesh Sharma
> BandwidthX
> dsharma@bandwidthx.com
> (760) 203-4955 Ext. 121
>
