systemml-dev mailing list archives

From "Niketan Pansare" <>
Subject Re: DML in Zeppelin
Date Tue, 08 Mar 2016 18:30:13 GMT

Hi Nakul,

This is good work!

My 2 cents: we should add the missing features (such as command-line
arguments), document the API for this POC, come up with examples for
existing algorithms using open-source datasets, and put them in

This way, people are encouraged to try out (and maybe even modify
on the fly) existing DML algorithms with specific datasets. Borrowing
an example from
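>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> digits = datasets.load_digits()
>>> from sklearn import svm
>>> clf = svm.SVC(gamma=0.001, C=100.)
>>> clf.fit(digits.data[:-1], digits.target[:-1])  # train on all but the last sample
>>> clf.predict(digits.data[-1:])  # predict the held-out last sample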

We can then put a link to the given example in


Niketan Pansare
IBM Almaden Research Center
E-mail: npansar At

From:	Nakul Jindal <>
Date:	03/06/2016 07:22 PM
Subject:	DML in Zeppelin


I've put together a proof of concept for having DML be a first class
citizen in Apache Zeppelin.

Brief intro to Zeppelin -
Zeppelin is a "notebook" interface to interact with Spark, Cassandra, Hive
and other projects. It can be thought of as a REPL in a browser.
Small units of code are put into "cells". These individual cells can then
be run interactively. There is also support for queuing up cells and
running them in parallel.
Cells are contained in notebooks. Notebooks can be exported and are
persistent between sessions.

One can type (Scala) Spark code in cell 1 and save a data frame object.
One can then type PySpark code in cell 2 and access the previously saved
data frame.
This is done by the Zeppelin runtime system by injecting a special variable
called "z" into the Spark and PySpark environments in Zeppelin. This "z" is
an object of type ZeppelinContext and makes available a "get" and a "put"
method.
DML in Spark mode can now access this feature as well.
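
To make the "put"/"get" exchange concrete, here is a minimal sketch in plain
Python. It does not use Zeppelin at all: FakeZeppelinContext is a hypothetical
stand-in written only for this illustration, a plain list stands in for the
data frame, and returning None for a missing name is an assumption of the
sketch. In a real notebook, Zeppelin injects "z" itself.

```python
# Hypothetical stand-in for ZeppelinContext, for illustration only.
# In a real notebook Zeppelin injects "z"; this class is NOT Zeppelin's code.
class FakeZeppelinContext:
    def __init__(self):
        self._pool = {}  # shared store mapping a name to an object

    def put(self, name, value):
        """Save an object under a name so that another cell can fetch it."""
        self._pool[name] = value

    def get(self, name):
        """Return the object saved under name (None if absent: an assumption)."""
        return self._pool.get(name)


z = FakeZeppelinContext()

# "Cell 1": one interpreter saves a result.
# A plain list of tuples stands in for a Spark data frame here.
rows = [("setosa", 5.1), ("virginica", 6.3)]
z.put("iris_rows", rows)

# "Cell 2": a different interpreter retrieves the same object by name.
shared = z.get("iris_rows")
print(shared)
```

The point of the pattern is that cells agree only on a name ("iris_rows"
above), not on a language, which is what would let a DML cell pick up data
prepared in a Scala or PySpark cell.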

In this POC, DML can operate in 2 modes - standalone and spark.

Screenshots of it working:

GIF of the screenshots:



Nakul Jindal
