From: "Frederick R Reiss"
To: dev@systemml.incubator.apache.org
Date: Wed, 28 Sep 2016 12:00:52 -0700
Subject: Re: Proof of Concept: Embedded Scala DSL
References: <1ffe92ae6a28438a3c376e207e03a69b@posteo.de>

Maybe I'm missing a subtle point here, but why not refactor the existing
class org.apache.sysml.parser.DMLProgram into our common internal
representation across DSLs? This class is already sufficiently
expressive to represent any DML or PyDML program.

Fred


From: Niketan Pansare/Almaden/IBM@IBMUS
To: dev@systemml.incubator.apache.org
Date: 09/28/2016 11:20 AM
Subject: Re: Proof of Concept: Embedded Scala DSL

Thanks Felix for the response.

+1
>> For the future design I will probably make the Matrix and Vector
>> classes abstract, which allows for different concrete
>> implementations. We could then have one that is backed directly by
>> SystemML and works similar to the Python DSL in that it just uses
>> mock operators and builds the DML string that is then executed using
>> SystemML. That way the deep embedding would reuse the shallow
>> embedding and we could offer the user the choice of either using the
>> lazy MatrixType in the REPL or writing code inside the macro.

Also, I agree that we can postpone the IR and integration of different
DSLs until the work on parallelize is completed.

Thanks,

Niketan Pansare
IBM Almaden Research Center
E-mail: npansar At us.ibm.com
http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar


From: fschueler@posteo.de
To: dev@systemml.incubator.apache.org
Date: 09/28/2016 10:54 AM
Subject: Re: Proof of Concept: Embedded Scala DSL

Hi Niketan,

thanks for your suggestions! I thought about it a bit and here are my
ideas on it:

The IR you are describing is basically already my user-facing API. I am
not sure how much sense it makes to have an IR that looks exactly like
the API but with control structures renamed. A common IR for all DSLs
definitely makes sense in general, but I am not sure if it should be
part of one particular DSL. For maintainability it might be better to
have that IR somewhere on the SystemML side.

Apart from that, and in response to what Matthias suggested, I thought
about how to make the DSL more suitable for use in the REPL, and I think
we can find a good compromise. Currently my API is backed by breeze for
rapid prototyping, where breeze simply forces evaluation of every
statement. For the future design I will probably make the Matrix and
Vector classes abstract, which allows for different concrete
implementations. We could then have one that is backed directly by
SystemML and works similar to the Python DSL in that it just uses mock
operators and builds the DML string that is then executed using
SystemML. That way the deep embedding would reuse the shallow embedding
and we could offer the user the choice of either using the lazy
MatrixType in the REPL or writing code inside the macro.

I haven't started playing around with this idea, but let me know what
you think of it. The lazy, shallow DSL would basically do what you would
want from a separate IR, but I don't know if you want to call that from
the Python DSL.

Felix

On 24.09.2016 at 19:39, Niketan Pansare wrote:
> Hi Felix,
>
> Thanks for the summary. The document is extremely useful. I
> particularly like the idea of parallelizing the code with the 'breeze'
> library. I would like to pitch in a few ideas which would enable your
> code to be reused by other DSLs:
> 1. The Scala DSL/parallelize macro remains the same as described in
> your documentation, but instead of generating DML directly,
> parallelize calls an intermediate representation (IR), and the IR
> generates the DML. This IR will then be reused by the Python DSL and
> R DSL.
> 2. As an example, the IR could be a lazy Matrix class (which would be
> part of SystemML). It could have awkward syntax/mechanisms for pushing
> down control structures, for example beginWhile and endWhile. Since
> the IR will not be exposed to the end user, that should be fine.
>
> Example:
> https://github.com/apache/incubator-systemml/blob/master/src/main/python/systemml/defmatrix.py#L537
> [1] will call the IR's add() method. At the end of parallelize or when
> the user wants the result (i.e. eval()), the IR could generate DML
> code and execute it.
>
> Again, this is just a proposal and I am fine dropping the idea of
> integrating different DSLs if it makes the implementation of the Scala
> DSL complicated. Also, please feel free to correct me if I am missing
> anything.
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> [2]
>
> From: Matthias Boehm/Almaden/IBM@IBMUS
> To: dev@systemml.incubator.apache.org
> Date: 09/24/2016 01:11 AM
> Subject: Re: Proof of Concept: Embedded Scala DSL
>
> -------------------------
>
> thanks for sharing the summary - this is very nice. While looking over
> the example, I had the following questions:
>
> 1) Output handling: It would be great to see an example of how the
> results of Algorithm.execute() are consumed. Do you intend to hand out
> our binary matrix representation or MLContext's Matrix, from which the
> user then requests specific output formats? Also, if there are
> multiple Algorithm instances, how is the MLContext (with its internal
> state of lazily evaluated intermediates) reused?
>
> 2) Scala-breeze prototyping: How do you intend to support operations
> that are not supported in breeze? Examples are removeEmpty, table,
> aggregate, rowIndexMax, quantile/centralmoment, cummin/cummax, and DNN
> operations.
>
> 3) Frame data type and operations: Do you also intend to add a frame
> type and its operations? I think for this initial prototype it is not
> necessarily required, but please make the scope explicit.
>
> Regards,
> Matthias
>
> From: fschueler@posteo.de
> To: dev@systemml.incubator.apache.org
> Date: 09/23/2016 04:36 PM
> Subject: Proof of Concept: Embedded Scala DSL
>
> -------------------------
>
> As discussed in the related Jira (SYSTEMML-451) I have started to
> implement a prototype/proof of concept for an embedded DSL in Scala.
>
> I have summarized the current approach in a short document that you
> can find on github together with the code:
> https://github.com/fschueler/emma/blob/sysml-dsl/emma-sysml-dsl/README.md
> [3]
> Please note that current development happens in the Emma project but
> will move to an independent module in the SystemML project once the
> necessary additions to Emma are merged. By having the DSL in a
> separate module, we can include Scala and Emma dependencies only for
> the users that actually want to use the Scala DSL.
>
> The current code serves as a proof of concept to discuss further
> development with the SystemML community. I especially welcome input
> from SystemML Scala users on the usability of the API design.
> Next steps will include the translation from Scala code to DML with
> support for all features currently supported in DML, including control
> flow structures.
> Also, a coherent way of executing the generated scripts from Scala and
> the interaction with outside data formats (such as Spark DataFrames)
> will be integrated.
>
> I am happy to answer your questions and discuss the described approach
> here!
>
> Felix
>
>
> Links:
> ------
> [1]
> https://github.com/apache/incubator-systemml/blob/master/src/main/python/systemml/defmatrix.py#L537
> [2]
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> [3]
> https://github.com/fschueler/emma/blob/sysml-dsl/emma-sysml-dsl/README.md
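
The lazy, SystemML-backed matrix discussed in this thread (mock operators
that record an expression and only build a DML string) could look roughly
like the sketch below. It is a minimal illustration under stated
assumptions, written in Python to mirror the mock-operator style of the
Python DSL mentioned above. The names Matrix, LazyMatrix, variable and
to_dml are hypothetical; they are not part of SystemML, the Python DSL, or
the Scala prototype, and nothing here executes the generated script.

```python
from abc import ABC, abstractmethod


class Matrix(ABC):
    """Hypothetical user-facing API; each backend decides when work happens."""

    @abstractmethod
    def __add__(self, other: "Matrix") -> "Matrix": ...

    @abstractmethod
    def __matmul__(self, other: "Matrix") -> "Matrix": ...


class LazyMatrix(Matrix):
    """Mock-operator backend: records a DML expression instead of computing."""

    def __init__(self, expr: str, inputs: tuple = ()):
        self.expr = expr      # DML expression text for this node
        self.inputs = inputs  # names of the input variables the node reads

    @classmethod
    def variable(cls, name: str) -> "LazyMatrix":
        # A leaf node that simply refers to an input matrix by name.
        return cls(name, (name,))

    def _binary(self, op: str, other: "LazyMatrix") -> "LazyMatrix":
        # Merge input names, keeping first-seen order and dropping duplicates.
        merged = tuple(dict.fromkeys(self.inputs + other.inputs))
        return LazyMatrix(f"({self.expr} {op} {other.expr})", merged)

    def __add__(self, other):
        return self._binary("+", other)

    def __matmul__(self, other):
        # %*% is the matrix multiplication operator in DML.
        return self._binary("%*%", other)

    def to_dml(self, result: str = "out") -> str:
        # Emit one DML assignment; running it is left to the caller.
        return f"{result} = {self.expr}\n"


X = LazyMatrix.variable("X")
w = LazyMatrix.variable("w")
b = LazyMatrix.variable("b")
print((X @ w + b).to_dml("y"))  # y = ((X %*% w) + b)
```

An eager backend (for example one backed by breeze in the Scala prototype)
would implement the same abstract operators but compute results
immediately, which is what lets the two embeddings share one API.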
