mahout-user mailing list archives

From Andy Twigg <andy.tw...@gmail.com>
Subject Re: Classification Algorithms in Mahout
Date Wed, 27 Mar 2013 22:12:06 GMT
Dear Ey-Chih,

What are your use cases for a better random forest?

On 27 March 2013 11:59, Yutaka Mandai <20525entradero@gmail.com> wrote:
> My understanding is that the current Random Forest includes some improvement for running
> on a Hadoop cluster, in how data splits are aligned, for better-balanced CPU utilization.
> Regards,
> Y.Mandai
>
> Sent from my iPhone
>
> On 2013/03/25, at 14:48, Ted Dunning <ted.dunning@gmail.com> wrote:
>
>> I think that there are some others who could say more.
>>
>> On Mon, Mar 25, 2013 at 6:01 AM, Ey-Chih chow <eychih@gmail.com> wrote:
>>
>>> On Mar 24, 2013, at 1:00 AM, Ted Dunning wrote:
>>>
>>>> - random forest, sequential and parallel implementations, new versions
>>>> are being developed, the current version may or may not be useful to you.
>>>>
>>> Can you elaborate the usefulness of the current version and features of
>>> the new versions?  Thanks.
>>>
>>> Ey-Chih Chow
>>>
>>>
>>> On Mar 24, 2013, at 1:00 AM, Ted Dunning wrote:
>>>
>>>> You are correct to suspect that this page is substantially out of date.
>>>>
>>>> Currently, Mahout has the following classifiers:
>>>>
>>>> - stochastic gradient descent for logistic regression (SGD) with L_1 or
>>>> L_2 regularization, sequential version only.  These classifiers can be
>>>> easily extended with other gradients and regularizers, which should make
>>>> linear SVMs easy to implement.
>>>>
>>>> - naive bayes, sequential and parallel implementations
>>>>
>>>> - random forest, sequential and parallel implementations, new versions
>>>> are being developed, the current version may or may not be useful to you.
>>>>
>>>> There are a variety of other classifiers which are in various states of
>>>> utility.
>>>>
>>>> On Mar 24, 2013, at 4:07 AM, Chidananda Sridhar wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am doing a class project on classification and want to use Mahout. I was
>>>>> searching for the classification algorithms already implemented in Mahout
>>>>> and came to this page:
>>>>> https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms
>>>>>
>>>>> The webpage says that Online Passive Aggressive
>>>>> <https://cwiki.apache.org/confluence/display/MAHOUT/Online+Passive+Aggressive>
>>>>> is integrated and the rest of the classification algorithms are open or
>>>>> awaiting commit. Does the webpage have the latest information, or is it yet
>>>>> to be updated? Is "Online Passive Aggressive" the only algorithm I can use
>>>>> for now? On the other hand, I see that most of the clustering algorithms
>>>>> have been integrated.
>>>>>
>>>>> Thanks,
>>>>> Chidananda
>>>>
>>>
>>>



--
Dr Andy Twigg
Junior Research Fellow, St Johns College, Oxford
Room 351, Department of Computer Science
http://www.cs.ox.ac.uk/people/andy.twigg/
andy.twigg@cs.ox.ac.uk | +447799647538
