mahout-dev mailing list archives

From "Dmitriy Lyubimov (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (MAHOUT-1490) Data frame R-like bindings
Date Tue, 09 Sep 2014 17:30:29 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127267#comment-14127267
] 

Dmitriy Lyubimov edited comment on MAHOUT-1490 at 9/9/14 5:30 PM:
------------------------------------------------------------------

Just so we are clear, I don't see much value in this issue any longer. Whatever happens here
seems to be a dupe of ongoing efforts elsewhere.

(1) MLI -- well, this has not been moving anywhere for the past year, at least not publicly, and
the approach, honestly, is not what I've had in mind in terms of a DSL for data frames.

But: 
(2) Spark 1.0 and 1.1 made big progress in terms of Spark SQL, and in particular _language-integrated
queries_, which is very close to my idea of a DF DSL;
(3) There's a new project, DDF (distributed data frames) on Spark, which sounds awfully
close in ideas too.

So in my pragmatic view, any further work on this issue will just duplicate (2) and (3),
and therefore makes much less practical sense (especially given that (2) and (3) work in the Spark
environment directly, i.e. can co-exist with the Mahout Spark bindings literally elbow to elbow).
My professional strategic position is now re-oriented toward expecting either (2) or (3) to mature,
given they are receiving much more backing than this issue will likely ever have.
So I suggest the Mahout community just abandon this effort; it is in all likelihood just
a waste of time. For people who are interested in this capability in conjunction with the Scala
bindings, I'd rather recommend joining forces with (2) and/or (3) and perhaps bringing
them to fruition faster.





> Data frame R-like bindings
> --------------------------
>
>                 Key: MAHOUT-1490
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1490
>             Project: Mahout
>          Issue Type: New Feature
>            Reporter: Saikat Kanjilal
>            Assignee: Grant Ingersoll
>             Fix For: 1.0
>
>   Original Estimate: 20h
>  Remaining Estimate: 20h
>
> Create Data frame R-like bindings for spark



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
