From: Bolke de Bruin
Subject: Re: SparkOperator - tips and feedback?
Date: Sat, 18 Mar 2017 19:24:31 -0700
To: dev@airflow.incubator.apache.org

A Spark operator exists as of 1.8.0 (which will be released tomorrow); you
might want to take a look at that. I know that an update is coming to that
operator that improves communication with YARN.

Bolke

> On 18 Mar 2017, at 18:43, Russell Jurney wrote:
>
> Ruslan, thanks for your feedback.
>
> You mean the spark-submit context? Or like the SparkContext and
> SparkSession? I don't think we could keep that alive, because it wouldn't
> work out with multiple calls to spark-submit. I do feel your pain, though.
> Maybe someone else can see how this might be done?
>
> If SparkContext was able to open the spark/pyspark console, then multiple
> job submissions would be possible. I didn't have this in mind but an
> InteractiveSparkContext or SparkConsoleContext might be able to do this?
>
> Russell Jurney @rjurney
> russell.jurney@gmail.com
> datasyndrome.com
>
> On Sat, Mar 18, 2017 at 3:02 PM, Ruslan Dautkhanov wrote:
>
>> +1 Great idea.
>>
>> my two cents - it would be nice (as an option) if SparkOperator would be
>> able to keep context open between different calls,
>> as it takes 30+ seconds to create a new context (on our cluster). Not sure
>> how well it fits Airflow architecture.
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Sat, Mar 18, 2017 at 3:45 PM, Russell Jurney wrote:
>>
>>> What do people think about creating a SparkOperator that uses spark-submit
>>> to submit jobs? Would work for Scala/Java Spark and PySpark. The patterns
>>> outlined in my presentation on Airflow and PySpark would fit well inside
>>> an Operator, I think.
>>> BashOperator works, but why not tailor something to spark-submit?
>>>
>>> I'm open to doing the work, but I wanted to see what people thought about it
>>> and get feedback about things they would like to see in SparkOperator and
>>> get any pointers people had for doing the implementation.
>>>
>>> Russell Jurney @rjurney
>>> russell.jurney@gmail.com
>>> datasyndrome.com