From: Christopher Bockman
Date: Tue, 5 Jun 2018 14:11:36 -0700
Subject: Re: PSA: Make sure your Airflow instance isn't public and isn't Google indexed
To:
 dev@airflow.incubator.apache.org

+1 to being able to disable--we have authentication in place, but use a
separate solution that (probably?) Airflow won't realize is enabled, so
having a continuous giant warning banner would be rather unfortunate.

On Tue, Jun 5, 2018 at 2:05 PM, Alek Storm wrote:

> This is a great idea, but we'd appreciate a setting that disables the
> banner even if those conditions aren't met - our instance is deployed
> without authentication, but is only accessible via our intranet.
>
> Alek
>
> On Tue, Jun 5, 2018, 3:35 PM James Meickle wrote:
>
> > I think that a banner notification would be a fair penalty if you
> > access Airflow without authentication, or have API authentication
> > turned off, or are accessing via http:// with a non-localhost
> > `Host:`. (Are there any other circumstances to think of?)
> >
> > I would also suggest serving a default robots.txt to mitigate
> > accidental indexing of public instances (as most public instances
> > will be accidentally public, statistically speaking). If you truly
> > want your Airflow instance public and indexed, you should have to go
> > out of your way to permit that.
> >
> > On Tue, Jun 5, 2018 at 1:51 PM, Maxime Beauchemin
> > <maximebeauchemin@gmail.com> wrote:
> >
> > > What about a clear alert on the UI showing when auth is off?
> > > Perhaps a large red triangle-exclamation icon on the navbar with a
> > > tooltip "Authentication is off, this Airflow instance is not
> > > secure." and clicking takes you to the docs' security page.
> > >
> > > Well, and then of course people should make sure their infra isn't
> > > open to the Internet. We really shouldn't have to tell people to
> > > keep their infrastructure behind a firewall.
> > > In most environments you have to do quite a bit of work to open
> > > any resource up to the Internet (SSL certs, special security
> > > groups for load balancers/proxies, ...). Now I'm curious to
> > > understand how UMG managed to do this by mistake...
> > >
> > > Also a quick reminder to use the Connection abstraction to store
> > > secrets, ideally using the environment variable feature.
> > >
> > > Max
> > >
> > > On Tue, Jun 5, 2018 at 10:02 AM Taylor Edmiston wrote:
> > >
> > > > One of our engineers wrote a blog post about the UMG mistakes as
> > > > well.
> > > >
> > > > https://www.astronomer.io/blog/universal-music-group-airflow-leak/
> > > >
> > > > I know that best practices are well known here, but I second
> > > > James' suggestion that we add some docs, code, or config so that
> > > > the framework optimizes for being (nearly) production-ready by
> > > > default and not just easy to start with for local dev.
> > > > Admittedly this takes some work to not add friction to the local
> > > > onboarding experience.
> > > >
> > > > Do most people keep separate airflow.cfg files per environment
> > > > like what's considered the best practice in the Django world?
> > > > e.g. https://stackoverflow.com/q/10664244/149428
> > > >
> > > > Taylor
> > > >
> > > > On Tue, Jun 5, 2018 at 9:57 AM, James Meickle
> > > > <jmeickle@quantopian.com> wrote:
> > > >
> > > > > Bumping this one because now Airflow is in the news over it...
> > > > >
> > > > > https://www.bleepingcomputer.com/news/security/contractor-exposes-credentials-for-universal-music-groups-it-infrastructure/?utm_campaign=Security%2BNewsletter&utm_medium=email&utm_source=Security_Newsletter_co_79
> > > > >
> > > > > On Fri, Mar 23, 2018 at 9:33 AM, James Meickle
> > > > > <jmeickle@quantopian.com> wrote:
> > > > >
> > > > > > While Googling something Airflow-related a few weeks ago, I
> > > > > > noticed that someone's Airflow dashboard had been indexed by
> > > > > > Google and was accessible to the outside world without
> > > > > > authentication. A little more Googling revealed a handful of
> > > > > > other indexed instances in various states of security. I did
> > > > > > my best to contact the operators, and waited for responses
> > > > > > before posting this.
> > > > > >
> > > > > > Airflow is not a secure project by default
> > > > > > (https://issues.apache.org/jira/browse/AIRFLOW-2047), and you
> > > > > > can do all sorts of mean things to an instance that hasn't
> > > > > > been intentionally locked down. (And even then, you shouldn't
> > > > > > rely exclusively on your app's authentication for providing
> > > > > > security.)
> > > > > >
> > > > > > Having "internal" dashboards/data sources/executors exposed
> > > > > > to the web is dangerous, since old versions can stick around
> > > > > > for a very long time, help compromise unrelated deployments,
> > > > > > and generally just create very bad press for the overall
> > > > > > project if there's ever a mass compromise (see: Redis and
> > > > > > MongoDB).
> > > > > >
> > > > > > Shipping secure defaults is hard, but perhaps we could add
> > > > > > best practices like instructions for deploying a robots.txt
> > > > > > with Airflow?
> > > > > > Or an impact statement about what someone could do if they
> > > > > > access your Airflow instance? I think that many people
> > > > > > deploying Airflow for the first time might not realize that
> > > > > > it can get indexed, or how much damage someone can cause via
> > > > > > accessing it.