airflow-dev mailing list archives

From Brian Greene <br...@heisenbergwoodworking.com>
Subject Re: Slot pools correct usage
Date Sat, 07 Apr 2018 11:49:44 GMT
So what’s it doing with your config?  Does it work if you don’t use pools?  What about if
the pool is of size 2?  What if just one dag runs?  Have you ever seen this query work, or
did it only stop working once you started messing with pools?

I use 1 pool, no priority (I don’t care about sequence), and it “throttles” fine...

Which executor are you using?  I’m not familiar enough with the intricacies to know if the
pool settings are honored with different executors, but I’m using CeleryExecutor with success.
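
As a rough illustration of "throttling without ordering", here is a sketch in
plain Python. It is not Airflow code and not how the scheduler is implemented:
a threading.Semaphore stands in for the pool, and run_with_pool and the task
names are hypothetical. (In Airflow itself, a task is assigned to a pool through
the operator's pool argument.) The sketch only shows that a 1-slot pool caps
concurrency at 1 while the order in which tasks get the slot can vary.

```python
import random
import threading
import time


def run_with_pool(task_ids, slots=1):
    """Run tasks in threads; at most `slots` of them hold the pool at once."""
    pool = threading.Semaphore(slots)  # stand-in for a slot pool (assumption)
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}
    started = []  # order in which tasks actually obtained a slot

    def task(task_id):
        # Tasks become ready at unpredictable times, like tasks in separate dags.
        time.sleep(random.uniform(0, 0.01))
        with pool:  # block until a slot is free
            with lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
                started.append(task_id)
            time.sleep(0.005)  # pretend to run the expensive query
            with lock:
                state["running"] -= 1

    threads = [threading.Thread(target=task, args=(t,)) for t in task_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"], started


peak, order = run_with_pool(["dag_a.query", "dag_b.query", "dag_c.query"], slots=1)
print("peak concurrency:", peak)  # never exceeds the slot count
print("start order:", order)      # may differ from run to run
```

With slots=1 the queries never overlap, but nothing in the sketch fixes which
one goes first; a strict sequence would need explicit dependencies instead.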

B

Sent from a device with less than stellar autocorrect

> On Apr 6, 2018, at 10:40 PM, Manish Trivedi <trivmanish@gmail.com> wrote:
> 
> Hi Brian,
> 
> Really appreciate your quick reply. Just to be clear, I did not intend to
> run them in any particular order. As a matter of fact, these are expensive db
> queries that I can't afford to run in parallel.
> I think I have set up the tasks correctly to use the pool, but I may not be
> setting priority_weight correctly. I would appreciate it if you could share
> your configs so I can check whether I am missing something simple.
> 
> thanks much,
> Manish
> 
> On Fri, Apr 6, 2018 at 6:18 PM, Brian Greene <brian@heisenbergwoodworking.com> wrote:
> 
>> To be clear, you’re hoping that setting the slots to 1 will cause the
>> tasks across distinct dags to run in order, based on the assumption that
>> they’ll queue up and then execute off the pool?
>> 
>> I don’t think it will quite work that way - there’s no guarantee the
>> scheduler will execute your tasks across dags in any particular sequence,
>> and if one dag is “faster” than the other, they certainly won’t “line up”.
>> Thus, there’s no way to ensure they’ll queue in the right order.
>> 
>> I successfully use pools across many dags to limit access to an expensive
>> resource and it works really well, but my design doesn’t require that they
>> execute in any particular order, and each task is idempotent.
>> 
>> I’m curious as to your design/constraints - could you elaborate?
>> 
>> Brian
>> 
>> Sent from a device with less than stellar autocorrect
>> 
>>> On Apr 6, 2018, at 3:46 PM, Manish Trivedi <trivmanish@gmail.com> wrote:
>>> 
>>> Hi Airflow devs,
>>> 
>>> I have a use case to limit the number of calls to a certain database. I am
>>> using the pool along with priority weight to schedule the tasks to the slot
>>> pool. I have around 5 operators that I need to execute in serial order
>>> across different dags.
>>> 
>>> The slot pool is created with "1" slot to ensure sequential execution. I am
>>> not able to achieve the desired behavior with the current setup.
>> 
