From dev-return-6665-archive-asf-public=cust-asf.ponee.io@airflow.incubator.apache.org Wed Sep 26 05:00:49 2018 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx-eu-01.ponee.io (Postfix) with SMTP id 3B80518061A for ; Wed, 26 Sep 2018 05:00:49 +0200 (CEST) Received: (qmail 48259 invoked by uid 500); 26 Sep 2018 03:00:48 -0000 Mailing-List: contact dev-help@airflow.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@airflow.incubator.apache.org Delivered-To: mailing list dev@airflow.incubator.apache.org Received: (qmail 48247 invoked by uid 99); 26 Sep 2018 03:00:47 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 26 Sep 2018 03:00:47 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id D3214C00A6 for ; Wed, 26 Sep 2018 03:00:46 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.889 X-Spam-Level: * X-Spam-Status: No, score=1.889 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, T_DKIMWL_WL_MED=-0.01] autolearn=disabled Authentication-Results: spamd4-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id gQkRUPTiEtUt for ; Wed, 26 Sep 2018 03:00:45 +0000 (UTC) Received: from mail-qk1-f171.google.com (mail-qk1-f171.google.com [209.85.222.171]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id B8CBF5F478 for ; Wed, 26 Sep 2018 03:00:43 +0000 (UTC) Received: by mail-qk1-f171.google.com with SMTP id q20-v6so10501943qkc.12 for ; Tue, 25 Sep 2018 20:00:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=2xMxGULz6gYTJraoxrjfjVwnDj5aprURobk2rMgUSjg=; b=Hc258ItQM+sGfPefocaAdGk7w1NeYERm2Etnn6HZxEGLX9gOV1sAcXjrADKYYift5R ltFWJdO/nvuRPAakBb21zT6t9AZ8xnWGDPkc+gezFCU/uIR7NoF5t7o4uYHQ2rzXBjeJ 4s38xZMobeJaXudgdgaELTKkajb4HkRvKQ7Y8WwDlrHJBKVgZXL56zdETtRK9VB1nnuF wC0CXm6aGss+2gQWolvEMc9psDxeofuqXXRobQ6PfsSLY9Sj+7seoG+sX5LVwAn68y06 qOz+XOSmVrcByJCGQqh/Fg9L70gKS3dy1woZgfITc+HY4R4B+/MqjYRmmLpp26t1BA4I QZmg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=2xMxGULz6gYTJraoxrjfjVwnDj5aprURobk2rMgUSjg=; b=dp9cmQ6/TdIUMXCIg/vywL3q9JKG5lIM7hF9NgKFOlDoyHPDStEDtyFD11eKWMdWjf 0fvN69S9RLQUbxDQB0343q99SSWSdZmoxLt/IFg7EHe7SMi39Gs88lRLtG4DNCpgfdhN b8n3BSGCTgDVwKh/VFRXILvfKls8XjGEJtfGV/c3IUFCfbtxTUjpngkZgSbOCbPk7VNr yFfx+QrdQXYvMAjiruqUNBM9WTWSTqtK1ZEnrluFT0aPG5GmAgu+FOCF+mCHt3MW9U9z RyrUKhXwg8PpRURMAN8JvBW5VQUoRH37J6KL6IouCzq0L90EIJZy6alExa5jdTEodmOC bYwQ== X-Gm-Message-State: ABuFfojXhXZXvCXAAmznR1H9qnhSAh4ZQRXIJ+UCLUSpQfn9RCaXGQZJ W75yc5cnGnQy9RqamR5LBeEFltyXs9rXquD38Tvu+g== X-Google-Smtp-Source: ACcGV61Rh/6tVJ3Wvyy9i7f71hg7FWZdQFgdKnQUl5QbgqUh9UNCgrGdCNgFrhXuoEZfMFIITe7csmCjrIRnLqKzhaA= X-Received: by 2002:a37:2b0a:: with SMTP id r10-v6mr2843364qkh.292.1537930842134; Tue, 25 Sep 2018 20:00:42 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Matt Davis Date: Tue, 25 Sep 2018 20:00:30 -0700 Message-ID: Subject: Re: Airflow 1.10 Migration Duration To: dev@airflow.incubator.apache.org Content-Type: multipart/alternative; boundary="00000000000088dc650576bd6dda" --00000000000088dc650576bd6dda Content-Type: text/plain; charset="UTF-8" Good point about mentioning the database specifics, thanks. It's a Postgres 9.6.6 DB running in AWS RDS in an db.r3.large instance (2 vCPUs, 15 GB of RAM). Not sure what you mean by online/offline, but we timed the migrations in a test run against a database with nothing else going on at the time. - Matt On Tue, Sep 25, 2018 at 7:54 PM Ruiqin Yang wrote: > Thank you Taylor, the db-cleanup DAG is very nice! Got a question for you, > should we expect the DB migration to be backward compatible, i.e. would > 1.8.x cluster run fine with upgraded DB? > > Thank you! > Kevin Y > > On Tue, Sep 25, 2018 at 6:14 PM Taylor Edmiston > wrote: > > > I haven't done 1.8.x to 1.10.x in one go, but multiple hours seems long > for > > running a handful of Alembic migrations on 10M rows. It might be worth > > noting if you're using MySQL or Postgres and how your db is hosted... I > > wonder if there's a bottleneck at play here. > > > > Also, are you running the migrations in online or offline mode? > > > > You may see a performance improvement if you collapse all migrations into > > one then apply that (https://stackoverflow.com/a/34492022/149428). > > > > I prefer to keep all of my metadata in place personally, but the > db-cleanup > > DAG in https://github.com/teamclairvoyant/airflow-maintenance-dags has > > been > > brought up before. > > > > T > > > > *Taylor Edmiston* > > Blog | LinkedIn > > | Stack Overflow > > | Developer > Story > > > > > > > > On Tue, Sep 25, 2018 at 8:30 PM, Sid Anand wrote: > > > > > I checked with our Ops guy and he mentioned that when he upgraded from > > > 1.8.x to 1.9.x, it took a few seconds. We had 3M rows in the > > task_instance > > > table and run MySQL 5.7. > > > > > > -s > > > > > > On Tue, Sep 25, 2018 at 4:54 PM Matt Davis > wrote: > > > > > > > Hi folks, > > > > > > > > Here at Clover we're excitedly migrating to Airflow 1.10 (thanks for > > > > everyone's hard work on that!). We're finding that it's taking about > 2 > > > > hours to apply all the migrations to go from Airflow 1.8 to 1.10, > > largely > > > > driven by the 10 million rows in our task_instance table. That got us > > > > wondering what kind of maintenance people do on their Airflow > metadata > > > > databases. Do folks mostly put up with long migrations and generally > > > longer > > > > queries, or are y'all doing periodic cleanups of your metadata DB to > > keep > > > > it fairly light? > > > > > > > > Thanks, > > > > Matt Davis > > > > > > > > > > --00000000000088dc650576bd6dda--