incubator-crunch-dev mailing list archives

From Josh Wills <jwi...@cloudera.com>
Subject Re: Flume R -- any interest?
Date Wed, 17 Oct 2012 22:34:20 GMT
On Wed, Oct 17, 2012 at 3:07 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
> Yep, ok.
>
> I imagine it has to be an R module so I can set up a maven project
> with java/R code tree (I have been doing that a lot lately). Or if you
> have a template to look at, it would be useful i guess too.

No, please go right ahead.

>
>
> On Wed, Oct 17, 2012 at 3:02 PM, Josh Wills <josh.wills@gmail.com> wrote:
>> I'd like it to be separate at first, but I am happy to help. Github repo?
>> On Oct 17, 2012 2:57 PM, "Dmitriy Lyubimov" <dlieu.7@gmail.com> wrote:
>>
>>> Ok maybe there's a benefit to try a JRI/RJava prototype on top of
>>> Crunch for something simple. This should both save time and prove or
>>> disprove if Crunch via RJava integration is viable.
>>>
>>> On my part i can try to do it within Crunch framework or we can keep
>>> it completely separate.
>>>
>>> -d
>>>
>>> On Wed, Oct 17, 2012 at 2:08 PM, Josh Wills <jwills@cloudera.com> wrote:
>>> > I am an avid R user and would be into it-- who gave the talk? Was it
>>> > Murray Stokely?
>>> >
>>> > On Wed, Oct 17, 2012 at 2:05 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>>> >> Hello,
>>> >>
>>> >> I was pretty excited to learn of Google's experience with an R mapping
>>> >> of FlumeJava at one of the recent BARUG meetings. I think a lot of
>>> >> applications similar to what we do in Mahout could be prototyped using
>>> >> Flume R.
>>> >>
>>> >> I did not quite get the details of Google's implementation of the R
>>> >> mapping, but I am not sure that just a direct mapping from R to Crunch
>>> >> would be sufficient (or, for the most part, efficient). RJava/JRI and
>>> >> JNI seem to perform pretty terribly when used for that directly.
>>> >>
>>> >>
>>> >> On top of that, I am wondering whether this project could have a
>>> >> contributed adapter to Mahout's distributed matrices; that would be a
>>> >> very good synergy.
>>> >>
>>> >> Is anyone interested in contributing to or advising on an open source
>>> >> version of Flume R support? Just gauging interest; the Crunch list
>>> >> seems like a natural place to ask.
>>> >>
>>> >> Thanks.
>>> >>
>>> >> -Dmitriy
>>> >
>>> >
>>> >
>>> > --
>>> > Director of Data Science
>>> > Cloudera
>>> > Twitter: @josh_wills
>>>



-- 
Director of Data Science
Cloudera
Twitter: @josh_wills
