airflow-dev mailing list archives

From Edgardo Vega <edgardo.v...@gmail.com>
Subject Re: Cleanup
Date Tue, 04 Apr 2017 17:25:40 GMT
Max,

Thanks for the reply, it is much appreciated.  I am currently running ~10k
tasks a day in our test environment.

It is good to know where the archive point is and that I shouldn't have any
issues for a long time.

I was just thinking ahead as we get Airflow into a production environment.
Maybe, in this case, way too far ahead.


Cheers,

Edgardo

On Tue, Apr 4, 2017 at 11:58 AM, Maxime Beauchemin <
maximebeauchemin@gmail.com> wrote:

> We run ~50k tasks a day at Airbnb. How many tasks/day are you planning on
> running?
>
> You can archive the `task_instance` and `job` tables down the line,
> but that shouldn't be a concern until you hit tens of millions of entries.
> Then you can set up a daily Airflow job that archives some of these entries.
> I believe we do it based on `start_date` and move rows to some other table
> in the same db.
>
> Max
>
> On Mon, Apr 3, 2017 at 5:30 PM, Edgardo Vega <edgardo.vega@gmail.com>
> wrote:
>
> > I have been playing with airflow for a few days and it's not obvious what
> > will happen down the road when we have lots of dags over a long period of
> > time. I set a fake dag to run once a minute for a few days and everything
> > seems okay except the graph view dropdown, which works but takes a few
> > seconds to show up.
> >
> > Is there a way to roll older data out of the system in order to clean things
> > up visually and keep the database at a smallish size?
> >
> > --
> > Cheers,
> >
> > Edgardo
> >
>
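
For reference, the archiving approach Max describes (move old rows, selected
by `start_date`, to another table in the same db) could be sketched roughly
as below. This is only an illustration, not Airbnb's actual job: it uses
Python's built-in sqlite3 module so it runs on its own, the `_archive` table
suffix and the `archive_old_rows` helper are made-up names, and the demo
table is a simplified stand-in for the real `task_instance` schema. The
thread does not say what cutoff is used, so the 90 days here is arbitrary.

```python
import sqlite3
from datetime import datetime, timedelta


def archive_old_rows(conn, table, cutoff, date_column="start_date"):
    """Move rows with date_column < cutoff from `table` to `<table>_archive`.

    Hypothetical helper: the archive table name and the cutoff policy are
    assumptions, not something the thread specifies. `table` and
    `date_column` are interpolated into SQL, so pass trusted names only.
    """
    archive = f"{table}_archive"
    # Create an empty archive table with the same columns if it is missing.
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {archive} AS SELECT * FROM {table} WHERE 0"
    )
    with conn:  # the copy and the delete are committed together, or not at all
        conn.execute(
            f"INSERT INTO {archive} SELECT * FROM {table} WHERE {date_column} < ?",
            (cutoff,),
        )
        moved = conn.execute(
            f"DELETE FROM {table} WHERE {date_column} < ?", (cutoff,)
        ).rowcount
    return moved


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for the real table; dates are stored as ISO strings.
    conn.execute("CREATE TABLE task_instance (task_id TEXT, start_date TEXT)")
    now = datetime(2017, 4, 4)
    rows = [
        ("old_task", (now - timedelta(days=120)).isoformat()),
        ("new_task", (now - timedelta(days=1)).isoformat()),
    ]
    conn.executemany("INSERT INTO task_instance VALUES (?, ?)", rows)
    conn.commit()

    cutoff = (now - timedelta(days=90)).isoformat()  # arbitrary retention window
    print("rows archived:", archive_old_rows(conn, "task_instance", cutoff))
```

In a real deployment the same two statements would run against the Airflow
metadata database from a daily DAG, once for `task_instance` and once for
`job`, as Max suggests.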



-- 
Cheers,

Edgardo
