mahout-dev mailing list archives

From Andy Twigg <andy.tw...@gmail.com>
Subject Re: [jira] [Commented] (MAHOUT-1153) Implement streaming random forests
Date Mon, 03 Mar 2014 05:52:50 GMT
Yes, we could also consider committing it into the current mahout code
base. There are probably some advantages over the current impl. What
direction are you thinking?

On 2 March 2014 13:57, Suneel Marthi (JIRA) <jira@apache.org> wrote:
>
>     [ https://issues.apache.org/jira/browse/MAHOUT-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917581#comment-13917581 ]
>
> Suneel Marthi commented on MAHOUT-1153:
> ---------------------------------------
>
> [~andytwigg] I understand this has been implemented on Spark and that an implementation is
> available at http://featurestream.io. Do you think we should start the conversation about rolling
> this into Mahout?
>
>> Implement streaming random forests
>> ----------------------------------
>>
>>                 Key: MAHOUT-1153
>>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1153
>>             Project: Mahout
>>          Issue Type: New Feature
>>          Components: Classification
>>            Reporter: Andy Twigg
>>              Labels: features
>>             Fix For: Backlog
>>
>>
>> The current random forest implementations are in-core and not scalable. This issue
>> is to add an out-of-core, scalable, streaming implementation. Initially it could be based
>> on [1], and using mappers in a master-worker style.
>> [1] http://jmlr.csail.mit.edu/papers/volume11/ben-haim10a/ben-haim10a.pdf
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.2#6252)
