From: "Niketan Pansare"
To: dev
Reply-To: dev@systemml.incubator.apache.org
Date: Fri, 8 Apr 2016 13:11:41 -0700
Subject: Fw: Updating documentation for notebook
Message-Id: <201604082011.u38KBkXj030818@d03av04.boulder.ibm.com>

Hi all,

Here are a few suggestions to get things started:

1. Have a "Quick Start" (or "Get Started") button beside "Get SystemML" on http://systemml.apache.org/.

2. The user can then go through the following questionnaire/bulleted list, which points people to the appropriate link:

- How do you want to try SystemML?
  + Notebook on cloud
    * Bluemix
      + Zeppelin
        - Using Python Kernel
          + Learn how to write a DML program (something along the lines of http://apache.github.io/incubator-systemml/beginners-guide-to-dml-and-pydml.html)
          + Try out pre-packaged algorithms on a real-world dataset
            * Linear Regression
            * GLM
            * ALS
            * ...
          + Learn how to pass RDD/DataFrame to SystemML (for example: http://apache.github.io/incubator-systemml/spark-mlcontext-programming-guide.html)
          + Learn how to use SystemML as MLPipeline estimator/transformer
          + Learn how to use SystemML with existing Python packages
        - Using Scala Kernel
          + ... similar to Python kernel
        - Using DML Kernel
          + Learn how to write a DML program
      + Jupyter
        - Using Python Kernel
        - Using Scala Kernel
        - Using DML Kernel
    * Data Scientist Workbench
    * Databricks cloud
    * ...
  + Notebook on laptop/cluster
    * Zeppelin using docker images (for example: http://apache.github.io/incubator-systemml/spark-mlcontext-programming-guide.html#zeppelin-notebook-example---linear-regression-algorithm)
    * Jupyter (for example: http://apache.github.io/incubator-systemml/spark-mlcontext-programming-guide.html#jupyter-pyspark-notebook-example---poisson-nonnegative-matrix-factorization)
  + Laptop
    * Run SystemML as Standalone jar: http://apache.github.io/incubator-systemml/quick-start-guide.html
    * Embed SystemML into other Java program: http://apache.github.io/incubator-systemml/jmlc.html
    * Debug a DML script: http://apache.github.io/incubator-systemml/debugger-guide.html
    * Spark local mode
  + Spark Cluster
    * Batch invocation
    * Using Spark REPL
      + Learn how to pass RDD/DataFrame to SystemML
      + Learn how to use SystemML as MLPipeline estimator/transformer
    * Using PySpark REPL
      + Learn how to pass RDD/DataFrame to SystemML
      + Learn how to use SystemML as MLPipeline estimator/transformer
  + Hadoop Cluster
  + Spark Cluster on EC2

3. Add links to SystemML presentations:
https://www.youtube.com/watch?v=n3JJP6UbH6Q
https://www.youtube.com/watch?v=6VpiJK8Jydw
https://www.youtube.com/watch?v=PV-5pZboo4A
https://www.youtube.com/watch?v=7Zrc5EzOTjg
https://www.youtube.com/watch?v=3T32lweGxOA

Thanks,

Niketan Pansare
IBM Almaden Research Center
E-mail: npansar At us.ibm.com
http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar

----- Forwarded by Niketan Pansare/Almaden/IBM on 04/08/2016 01:03 PM -----

Re: Updating documentation for notebook
Niketan Pansare to: dev    04/08/2016 10:47 AM
Please respond to dev

Thanks Abhishek. I am glad it was helpful :)

Luciano: I agree with you about having a central place for documentation.
Before cleaning up the tutorial and putting it into our documentation, I wanted to:

1. Have a discussion about which setup we should use to introduce SystemML: command-line standalone, command-line spark/pyspark REPL (yarn/standalone), command-line hadoop, or scala/python notebook (an online notebook, or one that requires the user to set up jupyter/zeppelin).

2. Encourage other contributors to come up with intellectually stimulating tutorials using real-world datasets and our existing DML algorithms. This means creating JIRAs that people can work on. My repository is only a POC to facilitate discussion and will be deleted after that.

3. If we do decide to go with an online notebook-based tutorial, have a discussion on how to structure the tutorial:
- How to support a variety of hosting sites (bluemix / datascientist workbench / databricks cloud / azureml / aws / ...).
- Python or Scala as primary language.
- Jupyter or Zeppelin as primary notebook.
- DML kernel or MLContext-based or JMLC-based example.
- Any standard tutorial (or textbook) we should use as an example for choosing the dataset.
- Whether the emphasis should be on learning DML or on building a larger data pipeline (for example: our MLPipeline-wrapper).

Thanks,

Niketan Pansare
IBM Almaden Research Center
E-mail: npansar At us.ibm.com
http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar

Abhishek Srivastava ---04/08/2016 08:55:58 AM---Great job Niketan, I had been searching for such a document of late. Regards,

From: Abhishek Srivastava
To: dev@systemml.incubator.apache.org
Date: 04/08/2016 08:55 AM
Subject: Re: Updating documentation for notebook

Great job Niketan, I had been searching for such a document of late.

Regards,
Abhishek Srivastava
Fellowship Scholar, IIM Ranchi
Skype: abhi.sri3

On Fri, Apr 8, 2016 at 6:34 AM, Niketan Pansare wrote:

> Hi all,
>
> Here is a suggestion for reducing the barrier to entry for SystemML: "Have
> a detailed quickstart guide/video using Notebook on free (or trial-based)
> hosting solution like IBM Bluemix or Data Scientist Workbench".
>
> I have created a sample tutorial:
> https://github.com/niketanpansare/systemml_tutorial
>
> Missing items in above tutorial:
> 1. Create a separate section for Notebook rather than have it hidden under
> MLContext Programming guide (
> http://apache.github.io/incubator-systemml/spark-mlcontext-programming-guide.html
> ).
> 2. Add Python Notebooks (This requires attaching both jars and python
> MLContext to Zeppelin or Jupyter context).
> 3. Allow users to use jars from our nightly build (see my jupyter example)
> as well as released version (see my zeppelin example).
> 4. Tutorials for all our algorithms using real world dataset. Example:
> https://www.ibm.com/support/knowledgecenter/SSPT3X_2.1.2/com.ibm.swg.im.infosphere.biginsights.tut.doc/doc/tut_Mod_BigR.html
> .
> 5. DML Kernel for Zeppelin (see
> https://issues.apache.org/jira/browse/SYSTEMML-542).
> 6. Other hosting services such as AzureML.
> 7. Tutorial that shows SystemML's integration with MLPipeline.
>
> These missing items can be broken down into relatively small tasks with
> detailed specification that external contributors can work on. Any
> thoughts?
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
