From: David Yan
Date: Thu, 15 Sep 2016 23:25:48 -0700
Subject: Re: Python support
To: dev@apex.apache.org

At a very high level, we can build a Python framework in Apex by having a
Python binding on our high-level API that generates Jython operators with
the business logic written by users in Python, along with existing
connectors.

David

On Sep 15, 2016 11:00 PM, "Chinmay Kolhatkar" wrote:

> Strongly +1 on this. One thing that suggests this is useful for Apex is
> Hadoop streaming, where Python is used to write map-reduce jobs. This will
> not only increase our reach in the development world but would also appeal
> to administrators who want to write an app, as they usually know Python.
>
> A few suggestions (in no specific order):
>
> 1. As part of supporting Python execution in operator code, we should
> allow the complete lifecycle of an operator to be specified from Python.
>
> 2. I would personally not worry about providing a Python binding for
> low-level Apex client APIs like addOperator, addStream, etc. If one has to
> use them, I think it's best to use the Java API, as the full power of those
> low-level APIs can be leveraged there.
>
> 3. For client APIs, I would rather suggest we focus on high-level APIs like
> the Apex stream API (malhar-stream). We should provide a complete Python
> binding for them. Python is very useful when it comes to functional
> programming, and the stream API provides exactly that.
>
> 4. Thinking at a very high level, I don't think we need any change in
> apex-core for this. This could be another project in Malhar itself. There
> are Python libraries like py4j, pyjnius, or JPype that allow access to Java
> objects from Python.
> Basically, we just need to establish the right bridge between the Java VM
> and the Python VM.
> We need to be thoughtful about performance, as these bridges
> across programming languages are costly.
>
> 5. We need to decide what code execution will look like here. For
> example, should a py file be an alternative to Application.java in the
> package? That means the starting point is the Apex CLI, i.e. Java. Hence,
> instead of finding classes implementing StreamingApplication, apexcli needs
> to find the py file that defines the DAG.
> Or should the flow start with "__main__" of the Python file and end up in
> Java?
>
> 6. This might be too early, but it is important to emphasize that we need
> to plan for writing examples and documentation for the Python binding.
>
> -Chinmay.
>
> On Fri, Sep 16, 2016 at 2:36 AM, Thomas Weise wrote:
>
> > Hi,
> >
> > Python (not Jython) seems to be a popular language and is frequently
> > used for data analysis, especially where flexibility matters. It has a
> > comprehensive library and is generally considered to have a low barrier
> > to entry. I have also seen Python used in critical back-end components,
> > although that's probably not very common?
> >
> > I think Python support could potentially expand the user base for Apex.
> > There are two main areas that can be considered:
> >
> > 1) Support to execute Python code through an operator
> > 2) A client API that lets users construct pipelines in Python
> >
> > The former can exist without the latter, and it would enable users to
> > leverage existing code that otherwise would have to be rewritten in a
> > JVM language. The engine could ship scripts/packages so they are
> > automatically distributed on the cluster.
> >
> > A useful client API probably requires back-end support for lambda
> > functions and more complex UDFs.
> >
> > It would be great to get some feedback, especially from those who have
> > experience with Python, on how an integration could potentially open up
> > new use cases for Apex.
> >
> > Thanks,
> > Thomas
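
To make the client-API idea in this thread a bit more concrete (a Python
binding over a high-level, functional stream API), here is a minimal sketch
in plain Python. It is only an illustration of the shape such a binding
might take: nothing in it is an Apex or Malhar API, and the names Stream,
map, filter, describe, and run_local are all hypothetical. It records the
pipeline as a list of logical stages, which is roughly the information a
real binding would need to hand across the Java bridge, and it evaluates
the stages locally just so the example can run on its own.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Stream:
    """Hypothetical stand-in for a Python binding over a stream API.

    Each call returns a new Stream with one more logical stage, so the
    object ends up holding a linear description of the pipeline.
    """

    stages: Tuple[Tuple[str, Callable[[Any], Any]], ...] = field(default=())

    def map(self, fn: Callable[[Any], Any]) -> "Stream":
        # Record a transformation stage; nothing is executed yet.
        return Stream(self.stages + (("map", fn),))

    def filter(self, fn: Callable[[Any], bool]) -> "Stream":
        # Record a predicate stage; nothing is executed yet.
        return Stream(self.stages + (("filter", fn),))

    def describe(self) -> List[str]:
        # The logical plan: the kinds of stages, in order.
        return [kind for kind, _ in self.stages]

    def run_local(self, items: Iterable[Any]) -> List[Any]:
        # Local evaluation for illustration only (assumption: a real
        # binding would ship the stages to the engine instead).
        out = []
        for item in items:
            keep = True
            for kind, fn in self.stages:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                out.append(item)
        return out


# User-facing code: business logic is ordinary Python callables.
pipeline = Stream().map(str.strip).filter(bool).map(str.upper)
result = pipeline.run_local([" apex ", "", " python"])
```

The open question from suggestion 5 shows up here too: whether a file like
this is discovered by the Apex CLI, or whether its own "__main__" drives the
launch, is a design decision the sketch does not take a position on.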