systemml-dev mailing list archives

From Matthias Boehm <mboe...@googlemail.com>
Subject Re: Local versions of Linear Algebra Operators in DML
Date Fri, 21 Oct 2016 20:00:38 GMT
Thanks, Nakul, for reaching out before starting work on this. Actually,
the introduction of these CP-only builtin functions was a big mistake 
because (as you already mentioned) they mistakenly suggest that we 
provide distributed operations for them too. The intent was to support
them in later versions with our own local and distributed 
implementations. So far, this had low priority though because these 
O(n^3) operations are seldom used over large data. However, a while 
back, we lost potential users who were specifically interested in 
distributed eigen - so there are still use cases.

Despite the good intentions behind the renaming, I would strongly argue 
against it. First, it would unnecessarily lose compatibility with R 
syntax. Second, it would defeat our clean abstraction by exposing 
explicit local operations.

This leaves us with two options here: (1) you could use an external 
(Java-implemented) function, which gives you virtually the same runtime
behavior but a clear separation via an explicit registration, or (2) add 
it to the list of CP-only operations (with a plan to implement its 
distributed version) but name it 'svd' as in R.


Regards,
Matthias


On 10/21/2016 9:34 PM, Nakul Jindal wrote:
> Hi,
>
> Imran was planning on implementing a distributed SVD as a DML bodied
> function.
> The algorithm is described in the paper titled "A Distributed and
> Incremental SVD Algorithm for Agglomerative Data Analysis on Large
> Networks" available at https://arxiv.org/abs/1601.07010.
>
> This algorithm requires the availability of a local SVD function, which we
> currently do not have in SystemML.
> Seeing as how there are other linear algebra functions (eigen, lu, qr,
> cholesky) in DML that reroute to Apache Commons Math and only operate in
> standalone/CP mode, would it be ok to add "svd" to this set?
>
> Also, since these operations are local and not distributed and the
> documentation doesn't make it clear that these operations won't operate in
> distributed mode, would it make sense to rename them to "local_eigen",
> "local_qr", "local_cholesky", etc?
> Obviously, this change would go into the version after 0.11.
>
> I understand that the ideal solution to this problem is to have a
> distributed version of the aforementioned linear algebra routines, but for
> the time being, would it be ok to go ahead and do the rename, while also
> introducing a "local_svd" ?
>
>
> Niketan, Berthold, Matthias, Sasha - Any thoughts?
>
> Thanks,
> Nakul Jindal
>
