airflow-dev mailing list archives

From Bolke de Bruin <bdbr...@gmail.com>
Subject Re: Sensors
Date Mon, 23 Oct 2017 17:10:42 GMT
I think you could use something like the Azure Functions Blob storage binding and let that kick off
a DAG by triggering it through the REST API:

https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-blob

I don’t use Azure so it might not fit your case.
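
A minimal sketch of the triggering side, in case it helps. It only builds and sends the HTTP request that whatever reacts to the blob event would make. The function names, the base URL and the `target_dag` id are hypothetical, and the default path assumes the experimental REST API of Airflow 1.x; newer releases expose a different path and require authentication, and the accepted shape of `conf` may differ between versions, so check the API docs for the version you run.

```python
import json
import urllib.request

# Assumption: experimental REST API path of Airflow 1.x. Pass another
# template if your installation exposes a different endpoint.
DEFAULT_PATH = "/api/experimental/dags/{dag_id}/dag_runs"


def build_trigger_request(base_url, dag_id, conf=None, path=DEFAULT_PATH):
    """Build (but do not send) a POST request asking Airflow to start a DAG run."""
    url = base_url.rstrip("/") + path.format(dag_id=dag_id)
    body = json.dumps({"conf": conf or {}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_dag(base_url, dag_id, conf=None, timeout=10):
    """Send the request and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    request = build_trigger_request(base_url, dag_id, conf)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

The event handler would then call something like `trigger_dag("http://my-airflow:8080", "target_dag", {"new_files": 3})`, so nothing in Airflow has to poll the Blobstore.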

Bolke

> On 23 Oct 2017, at 16:15, Grant Nicholas <grantnicholas2015@u.northwestern.edu> wrote:
> 
> It sounds like you want a background daemon that continuously monitors the
> status of some external system and triggers things on a condition. This
> does not sound like an ETL job, so Airflow is not a great fit for this
> type of problem. That said, there are workarounds like the one you mentioned.
> One easy workaround, if you can tolerate a delay between `condition happens ->
> dag triggers`, is to give your controller DAG a recurring schedule
> (i.e. not None). Each time the controller DAG runs, you perform your
> sensor check once and then trigger or don't trigger another DAG
> depending on the condition. What I'd worry about with your
> `trigger dagrun` approach is that if the trigger dagrun operator fails for any
> reason, you'll stop monitoring the external system. With the scheduled
> approach, the next scheduled run picks the monitoring up again, so you don't
> have to worry about the failure modes of retrying failed DAGs, etc.
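
The "check once per scheduled run" idea above can be sketched in plain Python. This is only an illustration under stated assumptions: the function names and the JSON state file are hypothetical stand-ins (in Airflow the last count could live in a Variable or similar), and `current_count` is whatever your Blobstore listing returns.

```python
import json
import os


def load_last_count(state_path):
    """Return the file count recorded by the previous run, or None on the first run."""
    if not os.path.exists(state_path):
        return None
    with open(state_path) as handle:
        return json.load(handle)["count"]


def save_count(state_path, count):
    """Record the count so the next scheduled run can compare against it."""
    with open(state_path, "w") as handle:
        json.dump({"count": count}, handle)


def check_once(current_count, state_path):
    """Perform one check: True only if the count grew since the last run.

    The new count is always saved, so a failed or skipped run does not
    stop monitoring; the next scheduled run compares against the
    latest recorded value.
    """
    previous = load_last_count(state_path)
    save_count(state_path, current_count)
    return previous is not None and current_count > previous
```

In a scheduled controller DAG, the boolean could gate the trigger step, for example a ShortCircuitOperator in front of a TriggerDagRunOperator, so the target DAG only runs when new files have appeared.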
> 
> On Mon, Oct 23, 2017 at 2:30 AM, Niels Zeilemaker <niels@zeilemaker.nl>
> wrote:
> 
>> Hi Guys,
>> 
>> I've created a Sensor which is monitoring the number of files in an
>> Azure Blobstore. If the number of files increases, then I would like
>> to trigger another dag. This is more or less similar to the
>> example_trigger_controller_dag.py and example_trigger_target_dag.py
>> setup.
>> 
>> However, after triggering the target DAG I would want my controller
>> DAG to start monitoring the Blobstore again. But since the schedule of
>> the controller DAG is set to None, it doesn't continue monitoring. I
>> "fixed" this by adding a TriggerDAG which schedules a new run of the
>> Controller DAG. But this feels a bit like a hack.
>> 
>> Does anyone have experience with such a continuous monitoring
>> sensor, or know of a better way to achieve this?
>> 
>> Thanks,
>> Niels
>> 

