From: Pramod Immaneni
To: dev@apex.apache.org
Subject: Re: [Proposal] Extension of the Apex configuration to add dependent jar files in runtime.
Date: Sat, 3 Feb 2018 10:02:07 -0800
Message-Id: <054E26E2-0D2F-4770-8C1C-7348048FE7F8@datatorrent.com>
In-Reply-To: <9da2c77a-b9fd-3a9e-7cca-ea7794a7bf4d@apache.org>
References: <1d5904f7-3d2d-e148-b547-450333ff1392@apache.org> <9da2c77a-b9fd-3a9e-7cca-ea7794a7bf4d@apache.org>

Yes, generic in the Attribute class.

> On Feb 3, 2018, at 10:00 AM, Vlad Rozov wrote:
>
> +1 assuming that support for merge/override will be generic for all
> attributes that support a list/set of values and not limited to the
> LIBRARY_JARS attribute only.
>
> Thank you,
>
> Vlad
>
> On 2/3/18 09:13, Pramod Immaneni wrote:
>> I too agree that the discussion has veered off from the original
>> topic. Why can't LIBRARY_JARS be used for this, albeit with a minor
>> improvement? Currently, our attribute layering is an override, so if
>> you have an attribute that is specified as apex.application..attr.
>> it overrides apex.attr. for that application. What if we were to
>> expand the attribute definition to allow for the specification of how
>> the layering of attributes will be combined, override being one
>> option, merge being another, with these being implemented with a
>> combiner interface? This way a set of common jars could be specified
>> using dt.attr.LIBRARY_JARS and applications can still add extra jars
>> on top.
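The override/merge layering suggested above could look roughly like the following Java sketch. It is only an illustration under stated assumptions: AttributeCombiner, AttributeLayeringSketch, and the jar names are hypothetical and are not part of the Apex Attribute class; the thread only proposes that some combiner interface exist.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/**
 * Hypothetical combiner (not an Apex API): decides how a more specific
 * attribute value is layered on top of a more general one.
 */
interface AttributeCombiner<T> {
  T combine(T general, T specific);
}

final class AttributeLayeringSketch {

  /** Override layering, as described in the thread: the specific value wins when present. */
  static <T> AttributeCombiner<T> override() {
    return (general, specific) -> specific != null ? specific : general;
  }

  /** Merge layering for list-valued attributes: general entries first, then specific, no duplicates. */
  static AttributeCombiner<List<String>> merge() {
    return (general, specific) -> {
      Set<String> combined = new LinkedHashSet<>();
      if (general != null) {
        combined.addAll(general);
      }
      if (specific != null) {
        combined.addAll(specific);
      }
      return new ArrayList<>(combined);
    };
  }

  public static void main(String[] args) {
    // Hypothetical values: jars common to an installation and jars added by one application.
    List<String> common = Arrays.asList("common-a.jar", "common-b.jar");
    List<String> app = Arrays.asList("app-extra.jar", "common-b.jar");

    System.out.println(AttributeLayeringSketch.<List<String>>override().combine(common, app));
    System.out.println(merge().combine(common, app));
  }
}
```

With merge, the common jars stay in place and the application's extra jar is appended; with override, only the application's own list survives. Making the combiner generic over the value type is one way to keep it usable for any list- or set-valued attribute rather than for LIBRARY_JARS alone.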
>>
>> On Fri, Feb 2, 2018 at 6:32 PM, Vlad Rozov wrote:
>>
>>> IMO, support for Kubernetes, Docker images, Mesos and anything
>>> outside of Yarn deployments is a topic by itself, and the design for
>>> such support needs to be discussed. I do not want to propose any
>>> specific design, but assume that the logic to create a proper
>>> execution environment would be coded into the Apex client. Whether
>>> it (hardcoded logic to create an execution environment) can be
>>> expressed simply as a list of dependent classes or jars is at
>>> minimum questionable. Until a design is proposed and agreed upon,
>>> I'd prefer to use plugins for the subject.
>>>
>>> Thank you,
>>>
>>> Vlad
>>>
>>> On 2/2/18 13:17, Sanjay Pujare wrote:
>>>
>>>> In cases where we have an "über" docker image containing support
>>>> for multiple execution environments, it might be useful for the
>>>> Apex core to infer what kind of execution environment to use for a
>>>> particular invocation (say, based on configuration
>>>> values/environment variables), and in that case the core will load
>>>> the corresponding libraries. I think this kind of flexibility or
>>>> support would be difficult to achieve through plugins, hence I
>>>> think Sergey's proposal will be useful.
>>>>
>>>> Sanjay
>>>>
>>>> On Fri, Feb 2, 2018 at 11:18 AM, Sergey Golovko wrote:
>>>>
>>>>> Unfortunately, moving the .apa file into a docker image cannot
>>>>> resolve all problems with the dependencies. If we assume an Apex
>>>>> application should be run in different execution environments,
>>>>> the application docker image must contain all possible execution
>>>>> environment dependencies.
>>>>>
>>>>> I think the better way is to assume that the original application
>>>>> docker image, like the current .apa file, should contain the
>>>>> application-specific dependencies only.
>>>>> Some smart client tool should then create the executable
>>>>> application docker image from the original one and include the
>>>>> execution-environment-specific dependencies in the target
>>>>> application docker image.
>>>>> Either way, this means a smart Apex client tool should have an
>>>>> interface for defining different environment dependencies, or
>>>>> combinations of different dimensions of the environment
>>>>> dependencies.
>>>>>
>>>>> Thanks,
>>>>> Sergey
>>>>>
>>>>> On Fri, Feb 2, 2018 at 10:23 AM, Thomas Weise wrote:
>>>>>
>>>>>> The current dependencies are based on how the Apex YARN client
>>>>>> works. YARN depends on a DFS implementation for deployment (not
>>>>>> necessarily HDFS).
>>>>>>
>>>>>> I think a better way to look at this is to consider that instead
>>>>>> of an .apa file the application is a docker image, which would
>>>>>> contain Apex and all dependencies that the "StramClient" today
>>>>>> adds for YARN.
>>>>>>
>>>>>> In that world there would be no Apex CLI or Apex-specific client.
>>>>>>
>>>>>> Thomas
>>>>>>
>>>>>> On Thu, Feb 1, 2018 at 5:57 PM, Sergey Golovko wrote:
>>>>>>
>>>>>>> I agree. It can be implemented using plugins. But if I need to
>>>>>>> enable and configure the plugin, I need to put this information
>>>>>>> into dt-site.xml. That means the plugin and its parameters must
>>>>>>> be documented, and the list of the added specific jars will be
>>>>>>> visible and available for updates to the end-user. The
>>>>>>> implementation via plugins is a more dynamic solution that is
>>>>>>> more convenient for the application developers. But I'm talking
>>>>>>> about the static configuration of the Apex build or
>>>>>>> installation, which relates more to the platform development.
>>>>>>>
>>>>>>> The current Apex core implementation has used a static,
>>>>>>> unchanged list of jars for a long time, because the Apex
>>>>>>> implementation still contains several basic static assumptions
>>>>>>> (for instance, the usage of YARN, HDFS, etc.), and these
>>>>>>> assumptions are hardcoded in the implementation. But if we are
>>>>>>> going to improve Apex and use Java interfaces in a generic Apex
>>>>>>> implementation, the current static approach of hardcoding a
>>>>>>> list of dependent jars in the Apex code will not work anymore.
>>>>>>> It will require a new solution to add/change jars in specific
>>>>>>> Apex builds/configurations, and I don't think plugins will be
>>>>>>> good for that.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Sergey
>>>>>>>
>>>>>>> On Thu, Feb 1, 2018 at 1:47 PM, Vlad Rozov wrote:
>>>>>>>
>>>>>>>> There is a way to get the same end result by using plugins. It
>>>>>>>> would be good to understand why plugins can't be used and
>>>>>>>> whether they can be extended to provide the required
>>>>>>>> functionality.
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>>
>>>>>>>> Vlad
>>>>>>>>
>>>>>>>> On 1/29/18 15:14, Sergey Golovko wrote:
>>>>>>>>
>>>>>>>>> Hello All,
>>>>>>>>>
>>>>>>>>> In Apex there are two ways to deploy non-Hadoop jars to the
>>>>>>>>> deployed cluster.
>>>>>>>>>
>>>>>>>>> The first approach is static (hardcoded) and it is used by
>>>>>>>>> Apex platform developers only. There are several final static
>>>>>>>>> arrays of Java classes in StramClient.java that define which
>>>>>>>>> of the available jars should be included in the deployment
>>>>>>>>> for every Apex application.
>>>>>>>>>
>>>>>>>>> The second approach is to add the paths of all dependent
>>>>>>>>> jar-files to the value of the attribute LIB_JARS. The
>>>>>>>>> end-user can set/update the value of the attribute LIB_JARS
>>>>>>>>> via dt-site.xml files, command line parameters, application
>>>>>>>>> properties and plugins. Using the attribute LIB_JARS is the
>>>>>>>>> official documented way for all Apex users to manage the
>>>>>>>>> deployment jars.
>>>>>>>>>
>>>>>>>>> But some of the dependent jars (not from the Apex core) can
>>>>>>>>> be common to all of a customer's applications for a specific
>>>>>>>>> installation and/or execution environment. Unfortunately the
>>>>>>>>> Apex implementation does not contain a middle solution that
>>>>>>>>> would allow the Apex developers and customer support to
>>>>>>>>> define and add new dependent jar-files (jars that should not
>>>>>>>>> be configurable/managed by the end-user) without
>>>>>>>>> updating/recompiling the Apex Java code during the Apex
>>>>>>>>> building process and/or installation/configuration.
>>>>>>>>>
>>>>>>>>> Having this kind of flexibility would also allow the Apex
>>>>>>>>> core developers to use Java interfaces during development to
>>>>>>>>> define an abstraction layer in the Apex implementation, and
>>>>>>>>> to configure the Apex core to add some specific jars to all
>>>>>>>>> Apex applications without recompiling the Apex source code.
>>>>>>>>> For instance, the usage of HDFS is now hardcoded in the Apex
>>>>>>>>> platform code, but it could be replaced with any other
>>>>>>>>> distributed or cloud-based file system.
>>>>>>>>> The Apex core code can use an interface for all I/O
>>>>>>>>> operations, while the support for a real specific file system
>>>>>>>>> implementation can be added as an independent jar-file. Or
>>>>>>>>> the implementation of some Apex operators may depend on a
>>>>>>>>> specific service, making it necessary to add some of the
>>>>>>>>> service jars to every Apex application implicitly.
>>>>>>>>>
>>>>>>>>> The proposal:
>>>>>>>>>
>>>>>>>>> - add a predefined configuration text file (we can make any
>>>>>>>>>   choice for the file syntax: XML, JSON or Properties) to the
>>>>>>>>>   Apex engine resources with predefined values of some of the
>>>>>>>>>   Apex attributes (for now we can include the LIB_JARS
>>>>>>>>>   attribute only);
>>>>>>>>> - allow a configuration text file with the same functionality
>>>>>>>>>   in the Apex installation folder "conf";
>>>>>>>>> - have the stram client read the content of the predefined
>>>>>>>>>   configuration text files at runtime and add the jars to the
>>>>>>>>>   list of the dependent jars;
>>>>>>>>> - allow paths to jars and Java classes to be used to refer to
>>>>>>>>>   the dependent jars (the references can have the extensions
>>>>>>>>>   .class and .jar).
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Sergey
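
The last two steps of the proposal (reading a predefined configuration file and turning .jar/.class references into jar paths) could be sketched in Java as follows. Everything named here is an assumption made for illustration: the Properties syntax is just one of the three options listed, and the key apex.attr.LIB_JARS, the comma-separated value format, the class PredefinedJarsSketch, and the convention that a class reference is a fully qualified class name followed by ".class" are all hypothetical. This is not the StramClient implementation.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class PredefinedJarsSketch {

  // Hypothetical property key; the thread does not fix a key name or a file syntax.
  static final String LIB_JARS_KEY = "apex.attr.LIB_JARS";

  /** Reads the comma-separated references stored under LIB_JARS_KEY in a Properties-style file. */
  static List<String> readReferences(Reader config) {
    Properties props = new Properties();
    try {
      props.load(config);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    List<String> refs = new ArrayList<>();
    for (String ref : props.getProperty(LIB_JARS_KEY, "").split(",")) {
      String trimmed = ref.trim();
      if (!trimmed.isEmpty()) {
        refs.add(trimmed);
      }
    }
    return refs;
  }

  /**
   * Maps one reference to a location: a ".jar" path is returned as is, and a
   * "<fully.qualified.Name>.class" reference (assumed convention) is mapped to
   * the location the class was loaded from.
   */
  static String resolve(String ref) {
    if (ref.endsWith(".jar")) {
      return ref;
    }
    if (ref.endsWith(".class")) {
      String className = ref.substring(0, ref.length() - ".class".length());
      try {
        CodeSource source = Class.forName(className).getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
          throw new IllegalArgumentException("No code source for class: " + className);
        }
        return source.getLocation().getPath();
      } catch (ClassNotFoundException e) {
        throw new IllegalArgumentException("Class not on classpath: " + className, e);
      }
    }
    throw new IllegalArgumentException("Unsupported reference: " + ref);
  }

  public static void main(String[] args) {
    // Hypothetical file content: one jar path and one class reference.
    String config = "apex.attr.LIB_JARS=/opt/apex/lib/fs-impl.jar, PredefinedJarsSketch.class\n";
    for (String ref : readReferences(new StringReader(config))) {
      try {
        System.out.println(ref + " -> " + resolve(ref));
      } catch (IllegalArgumentException e) {
        System.out.println(ref + " could not be resolved: " + e.getMessage());
      }
    }
  }
}
```

A client could read one such file from the engine resources and another from the "conf" folder, then add the resolved locations to its list of dependent jars. How those two files should be layered is exactly the override/merge question raised later in the thread.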