spark-issues mailing list archives

From "Burak Yavuz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-7785) Add missing items to pyspark.mllib.linalg.Matrices
Date Thu, 21 May 2015 23:57:17 GMT

    [ https://issues.apache.org/jira/browse/SPARK-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555313#comment-14555313 ]

Burak Yavuz commented on SPARK-7785:
------------------------------------

My view of the Python linalg API so far has been that Python already has two excellent
libraries, numpy and scipy. Implementing wrappers is not worth the serialization/deserialization
overhead they incur, because numpy and scipy are backed by C and are already vectorized.

For linalg, we have been leveraging Breeze for a long time, adding our own implementations only
where they greatly improve performance. If we can get better performance from numpy and scipy,
then let's just use them. Most of these methods are already named very similarly to their numpy
and scipy counterparts anyway.
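
A minimal sketch of what relying on numpy and scipy directly could look like for the items
listed in the issue. The mapping from each requested name to a numpy/scipy call is an assumption
based on the names alone, not something taken from the MLlib code, and the variable names are
hypothetical.

```python
import numpy as np
import scipy.sparse as sp

# Dense constructors named in the issue, expressed with plain NumPy.
# (Assumed equivalents; the variable names are illustrative only.)
zeros = np.zeros((2, 3))
ones = np.ones((2, 3))
eye = np.eye(3)
diag = np.diag([1.0, 2.0, 3.0])

# Seeded generator so the random matrices are reproducible.
rng = np.random.default_rng(0)
rand = rng.random((2, 3))            # uniform on [0, 1)
randn = rng.standard_normal((2, 3))  # standard normal

# Sparse constructors, expressed with scipy.sparse in CSC format.
speye = sp.identity(3, format="csc")
spdiag = sp.diags([1.0, 2.0, 3.0], format="csc")
sprand = sp.random(4, 4, density=0.25, format="csc", random_state=0)

# Building a sparse matrix from coordinate (COO) triples.
rows = [0, 1, 2]
cols = [2, 0, 1]
values = [4.0, 5.0, 6.0]
from_coo = sp.coo_matrix((values, (rows, cols)), shape=(3, 3)).tocsc()

# Concatenation and transpose.
horz = np.hstack([ones, zeros])  # shape (2, 6)
vert = np.vstack([ones, zeros])  # shape (4, 3)
transposed = ones.T              # shape (3, 2)
```

All of these run locally in the Python process, so no data crosses the Python/JVM boundary.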

> Add missing items to pyspark.mllib.linalg.Matrices
> --------------------------------------------------
>
>                 Key: SPARK-7785
>                 URL: https://issues.apache.org/jira/browse/SPARK-7785
>             Project: Spark
>          Issue Type: Sub-task
>          Components: MLlib, PySpark
>            Reporter: Manoj Kumar
>
> For DenseMatrices
> Class Methods
> __str__, transpose
> Object Methods
> zeros, ones, eye, rand, randn, diag
> For SparseMatrices
> Class Methods
> __str__, transpose
> Object Methods
> fromCoo, speye, sprand, sprandn, spdiag
> Matrices Methods
> horzcat, vertcat



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

