systemml-dev mailing list archives

From Ethan Xu <ethan.yifa...@gmail.com>
Subject Re: Documents of SystemML Algorithms Reference
Date Mon, 01 May 2017 13:39:18 GMT
Thanks Niketan! This helps a lot.

Best,

Ethan

On Wed, Apr 19, 2017 at 5:21 PM, Niketan Pansare <npansar@us.ibm.com> wrote:

> Hi Ethan,
>
> Good points; the documentation is incomplete. The Arguments section only
> describes the arguments for command-line invocation, not for invocation via
> Python or Scala. This should be clearly marked to avoid confusion.
>
> The Python wrappers are implemented to be compatible with MLlib and
> scikit-learn.
>
> For training, you can pass features and labels as
> 1. Scikit-learn way: two Python objects (X_train, y_train), each a NumPy
> array, pandas DataFrame or SciPy sparse matrix.
> model.fit(X_train, y_train)
>
> OR
>
> 2. MLlib way: one LabeledPoint DataFrame with at least two columns:
> features (of type Vector) and labels.
> model.fit(X_df)
>
> For prediction, you can pass features as
> 1. Scikit-learn way: one Python object (X_test) that is a NumPy array,
> pandas DataFrame or SciPy sparse matrix.
> model.predict(X_test)
>
> OR
>
> 2. MLlib way: one LabeledPoint DataFrame (df_test) with at least one
> column: features (of type Vector).
> model.transform(df_test)
>
> The usage is briefly described in
> https://apache.github.io/incubator-systemml/beginners-guide-python.html#invoke-systemmls-algorithms
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>
> From: Ethan Xu <ethan.yifanxu@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 04/19/2017 02:07 PM
> Subject: Documents of SystemML Algorithms Reference
> ------------------------------
>
>
>
> Hello,
>
> I'm reading the documents on Multinomial Logistic Regression
> (https://apache.github.io/incubator-systemml/algorithms-classification.html#usage)
> with Scala API. It says
>
> val model = lr.fit(X_train_df)
> val prediction = model.transform(X_test_df)
>
>
> The "Arguments" section below it says:
>
> X: Location (on HDFS) to read the input matrix of feature vectors; each row
> constitutes one feature vector.
>
> Y: Location to read the input one-column matrix of category labels that
> correspond to feature vectors in X. Note the following:...
>
> The explanation of the arguments seems to correspond to the Hadoop and Spark
> API.
>
> Could someone please advise what the specifications of `X_train_df` and
> `X_test_df` are? Are they the same as specified in the Python API? i.e.:
>
> # X_train, y_train and X_test can be NumPy matrices or Pandas DataFrame or SciPy Sparse Matrix
> y_test = logistic.fit(X_train, y_train).predict(X_test)
> # df_train is DataFrame that contains two columns: "features" (of type Vector) and "label". df_test is a DataFrame that contains the column "features"
>
> The explanation of arguments for Python/Scala seems to be missing for other
> algorithms, too.
>
> Thanks a lot,
>
> Ethan
>
>
>
>
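
For readers following the thread, the two calling conventions described above can be sketched in plain Python. This is a toy illustration only, not SystemML code: the class name `MajorityClassSketch`, its trivial majority-label "model", and the use of lists of tuples in place of a Spark DataFrame are all assumptions made so the snippet runs without any dependencies.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


class MajorityClassSketch:
    """Hypothetical estimator (NOT SystemML); shows the two call styles only."""

    def fit(self, X, y: Optional[Sequence[int]] = None) -> "MajorityClassSketch":
        if y is None:
            # MLlib style: one table whose rows carry (features, label).
            labels = [label for _features, label in X]
        else:
            # scikit-learn style: features and labels passed separately.
            if len(X) != len(y):
                raise ValueError("X and y must have the same number of rows")
            labels = list(y)
        # Trivial stand-in for training: remember the most frequent label.
        self.majority_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, X_test) -> List[int]:
        # scikit-learn style: one predicted label per feature row.
        return [self.majority_ for _ in X_test]

    def transform(self, df_test) -> List[Tuple[Sequence[float], int]]:
        # MLlib style: each one-column (features,) row gains a prediction.
        return [(features, self.majority_) for (features,) in df_test]


# Made-up data, for illustration only.
X_train = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y_train = [1, 0, 1]
df_train = list(zip(X_train, y_train))  # rows of (features, label)

sk_style = MajorityClassSketch().fit(X_train, y_train).predict([[0.5, 0.5]])
ml_style = MajorityClassSketch().fit(df_train).transform([([0.5, 0.5],)])
```

The point is only the shape of the calls: `fit(X, y)` pairs with `predict(X_test)`, while `fit(df)` pairs with `transform(df_test)`.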
