airflow-dev mailing list archives

From Lance Norskog <lance.nors...@gmail.com>
Subject Re: How do you use pools?
Date Fri, 20 May 2016 01:34:39 GMT
Ok, thanks.

Yes, there is a problem with over-subscribing pools. Even with the pool
size set to 4, you can end up with 15 active tasks and another 20 waiting.
This is still true in 1.7.0.
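
For comparison, here is a rough sketch of what a pool of 4 is expected to
do with those same 35 tasks. It is a toy model in plain Python, not Airflow
code and not how the scheduler is implemented; SlotPool, submit, and finish
are made-up names used only for illustration.

```python
from collections import deque


class SlotPool:
    """Toy model of a slot-limited pool (hypothetical, not Airflow's code)."""

    def __init__(self, slots):
        self.slots = slots
        self.running = set()
        self.waiting = deque()

    def submit(self, task_id):
        # Run the task if a slot is free; otherwise queue it.
        if len(self.running) < self.slots:
            self.running.add(task_id)
        else:
            self.waiting.append(task_id)

    def finish(self, task_id):
        # Free the slot, then promote the oldest waiting task, if any.
        self.running.remove(task_id)
        if self.waiting:
            self.running.add(self.waiting.popleft())


pool = SlotPool(slots=4)
for i in range(35):
    pool.submit("task_%d" % i)

print(len(pool.running), len(pool.waiting))  # 4 31
```

With correct enforcement the number of running tasks never exceeds the
slot count, so the 15 active tasks described above would be 4 active and
the rest queued.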

Lance

On Thu, May 19, 2016 at 5:21 PM, Chris Riccomini <criccomini@apache.org>
wrote:

> We do the same as well. BigQuery limits UDF usage to 6, so any DAG that
> uses a UDF goes in a pool (the 'udf' pool), which has a max of 6.
>
> On Thu, May 19, 2016 at 4:27 PM, siddharth anand <r39132@gmail.com> wrote:
>
> > Hi Lance!
> > Yes, we do the same. Specifically, we have multiple DAGs that share access
> > to a Spark cluster through the use of Pools. By setting the pool size to
> > say 4, we remove the possibility of some backfill swamping the Spark
> > cluster. BTW, there were some bugs with over-subscription of pools. It's
> > not a common occurrence, but it has been reported.
> >
> > -s
> >
> > On Thu, May 19, 2016 at 9:37 PM, Lance Norskog <lance.norskog@gmail.com>
> > wrote:
> >
> > > How should we use pools in our DAGs?
> > >
> > > We do a lot of analytics queries and copying between databases. I've set up
> > > pools for each database instance so that we avoid overloading instances
> > > with queries. Is this the right approach?
> > >
> > > Thanks,
> > >
> > > --
> > > Lance Norskog
> > > lance.norskog@gmail.com
> > > Redwood City, CA
> > >
> >
>



-- 
Lance Norskog
lance.norskog@gmail.com
Redwood City, CA
