hadoop-common-user mailing list archives

From Yexi Jiang <yexiji...@gmail.com>
Subject Re: Decision Tree Implementation in Hadoop MapReduce
Date Mon, 02 Dec 2013 04:24:55 GMT
What is your motivation for chaining jobs?


2013/12/1 unmesha sreeveni <unmeshabiju@gmail.com>

> Thanks, Yexi. That was a very nice explanation, simple enough for
> beginners to follow. Thanks a lot.
> I can go for chaining jobs, right?
>
>
>
>
>
> On Sun, Dec 1, 2013 at 8:55 PM, Yexi Jiang <yexijiang@gmail.com> wrote:
>
>> In my opinion:
>>
>> 1. Build the decision tree model with the training data.
>> 2. Store it somewhere.
>> 3. When the unlabeled data is available:
>>    3.1 If the unlabeled data is huge, write another mrjob to process
>>        it: load the model in the setup stage and use it to label the
>>        records one by one in the map stage. There is no need for a reducer.
>>    3.2 If the unlabeled data is small, it is trivial.
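
Steps 2 and 3.1 above could be sketched roughly as follows. This is only a minimal illustration in plain Java, not code from the thread: the model file format ("S <featureIndex> <threshold>" for a split, "L <label>" for a leaf, in pre-order), the class names StoredTree and LabelingMapper, and the comma-separated numeric records are all assumptions made for the example. LabelingMapper is a stand-in with no Hadoop dependency; in a real job the same logic would presumably sit in the setup() and map() methods of org.apache.hadoop.mapreduce.Mapper, with the reducer disabled through Job.setNumReduceTasks(0).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stored model: a binary tree over numeric features.
final class StoredTree {
    private final String label;      // non-null only for leaves
    private final int feature;       // feature index tested at a split
    private final double threshold;  // go left when value <= threshold
    private StoredTree left, right;

    private StoredTree(String label, int feature, double threshold) {
        this.label = label;
        this.feature = feature;
        this.threshold = threshold;
    }

    // Step 2: flatten the tree to text lines (pre-order) so it can be saved.
    void write(List<String> out) {
        if (label != null) {
            out.add("L " + label);
            return;
        }
        out.add("S " + feature + " " + threshold);
        left.write(out);
        right.write(out);
    }

    // Rebuild the tree from the lines produced by write().
    static StoredTree parse(Iterator<String> lines) {
        String[] t = lines.next().trim().split("\\s+");
        if (t[0].equals("L")) {
            return new StoredTree(t[1], -1, 0.0);
        }
        StoredTree node = new StoredTree(null, Integer.parseInt(t[1]), Double.parseDouble(t[2]));
        node.left = parse(lines);
        node.right = parse(lines);
        return node;
    }

    String predict(double[] x) {
        StoredTree n = this;
        while (n.label == null) {
            n = x[n.feature] <= n.threshold ? n.left : n.right;
        }
        return n.label;
    }
}

// Stand-in for a map-only task: setup() runs once, map() runs per record.
final class LabelingMapper {
    private StoredTree model;

    // Step 3.1: load the stored model once, before any record is seen.
    void setup(List<String> modelLines) {
        model = StoredTree.parse(modelLines.iterator());
    }

    // Label one unlabeled record; no reducer is needed afterwards.
    String map(String record) {
        String[] parts = record.split(",");
        double[] x = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            x[i] = Double.parseDouble(parts[i].trim());
        }
        return record + "," + model.predict(x);
    }

    public static void main(String[] args) {
        List<String> modelLines = Arrays.asList("S 0 5.0", "L no", "S 1 2.5", "L maybe", "L yes");
        LabelingMapper mapper = new LabelingMapper();
        mapper.setup(modelLines);
        List<String> labeled = new ArrayList<>();
        for (String record : Arrays.asList("3.0,1.0", "7.0,4.0")) {
            labeled.add(mapper.map(record));
        }
        System.out.println(labeled);
    }
}
```

Because every record is labeled independently of the others, nothing has to be aggregated across mappers, which is why the reducer can be dropped.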
>>
>>
>>
>>
>> 2013/12/1 unmesha sreeveni <unmeshabiju@gmail.com>
>>
>>> Thanks Yexi ,
>>>
>>> But how can it be accomplished?
>>> The input to the Decision Tree MR job will be a set of data. But for
>>> prediction, the input will be a single line of data without a class
>>> label, right?
>>> So what changes are needed in the mrjob? Should we design it like this:
>>> 1. When a set of data comes in, build the decision tree.
>>> 2. Otherwise, when a single line of data comes in, check the output of
>>> the decision tree (the one generated by the MR job) and predict the
>>> class label.
>>>
>>> -------
>>>
>>> M1_train - dataset for training.
>>> M1_test - test data for prediction.
>>> 1. Will the input for prediction be a single record, or a set of
>>> records given at once?
>>> 2. We also need to ensure in our program that M1_test belongs to
>>> M1_train only. We should check that too, right? If M1_test is given to
>>> M2_train, it should show an error, shouldn't it?
>>>
>>> Please tell me if my thinking is wrong.
>>>
>>> On 11/30/13, Yexi Jiang <yexijiang@gmail.com> wrote:
>>> > I watched the video there, but I cannot access its source code due to
>>> > a permission issue.
>>> > In my opinion, once the decision tree model is built, it is small
>>> > enough to be loaded into memory and used directly, without another
>>> > mrjob for prediction. The prediction can be conducted in a streaming
>>> > way.
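
A rough sketch of what such streaming prediction might look like is given below, again in plain Java with no MapReduce involved. The names TreeNode and StreamingPredictor, the threshold-based binary tree, and the comma-separated numeric input are assumptions made for illustration; the thread does not say how the tree or the records are represented. Only the tree is held in memory, and each record is labeled and handed to a sink as soon as it is read.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical in-memory decision tree over numeric features.
final class TreeNode {
    final int feature;       // feature index tested here (unused in leaves)
    final double threshold;  // go left when value <= threshold
    final TreeNode left, right;
    final String label;      // non-null only for leaves

    private TreeNode(int feature, double threshold, TreeNode left, TreeNode right, String label) {
        this.feature = feature;
        this.threshold = threshold;
        this.left = left;
        this.right = right;
        this.label = label;
    }

    static TreeNode leaf(String label) {
        return new TreeNode(-1, 0.0, null, null, label);
    }

    static TreeNode split(int feature, double threshold, TreeNode left, TreeNode right) {
        return new TreeNode(feature, threshold, left, right, null);
    }

    // Walk from the root to a leaf and return that leaf's label.
    String predict(double[] x) {
        TreeNode n = this;
        while (n.label == null) {
            n = x[n.feature] <= n.threshold ? n.left : n.right;
        }
        return n.label;
    }
}

final class StreamingPredictor {
    // Reads one record per line, labels it, and passes it on immediately,
    // so the unlabeled data never has to fit in memory.
    static void labelAll(TreeNode model, BufferedReader in, Consumer<String> sink) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.split(",");
            double[] x = new double[parts.length];
            for (int i = 0; i < parts.length; i++) {
                x[i] = Double.parseDouble(parts[i].trim());
            }
            sink.accept(line + "," + model.predict(x));
        }
    }

    public static void main(String[] args) throws IOException {
        TreeNode model = TreeNode.split(0, 5.0,
                TreeNode.leaf("no"),
                TreeNode.split(1, 2.5, TreeNode.leaf("maybe"), TreeNode.leaf("yes")));
        List<String> labeled = new ArrayList<>();
        labelAll(model, new BufferedReader(new StringReader("3.0,1.0\n7.0,4.0\n")), labeled::add);
        System.out.println(labeled);
    }
}
```

In practice the reader would wrap a file or standard input rather than a string, and the sink would write to an output stream.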
>>> >
>>> >
>>> > 2013/11/30 unmesha sreeveni <unmeshabiju@gmail.com>
>>> >
>>> >> I have gone through a MapReduce implementation of C4.5 at
>>> >> http://btechfreakz.blogspot.in/2013/04/implementation-of-c45-algorithm-using.html
>>> >>
>>> >> There, a decision tree is built. So my question is:
>>> >> can we also include the prediction along with that?
>>> >>
>>> >>
>>> >> On Tue, Nov 26, 2013 at 8:52 AM, Yexi Jiang <yexijiang@gmail.com> wrote:
>>> >>
>>> >>> You are welcome :)
>>> >>>
>>> >>>
>>> >>> 2013/11/25 unmesha sreeveni <unmeshabiju@gmail.com>
>>> >>>
>>> >>>> ok . Thx Yexi
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Nov 26, 2013 at 1:41 AM, Yexi Jiang <yexijiang@gmail.com> wrote:
>>> >>>>
>>> >>>>> As far as I know, there is no ID3 implementation in mahout
>>> >>>>> currently, but you can use the decision forest instead.
>>> >>>>> https://cwiki.apache.org/confluence/display/MAHOUT/Breiman+Example
>>> >>>>>
>>> >>>>>
>>> >>>>> 2013/11/25 unmesha sreeveni <unmeshabiju@gmail.com>
>>> >>>>>
>>> >>>>>> Is that ID3 classification?
>>> >>>>>> Does it include prediction as well?
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> On Sat, Nov 23, 2013 at 9:01 PM, Yexi Jiang <yexijiang@gmail.com> wrote:
>>> >>>>>>
>>> >>>>>>> You can find it directly at https://github.com/apache/mahout, or
>>> >>>>>>> you can check it out from svn by following
>>> >>>>>>> https://cwiki.apache.org/confluence/display/MAHOUT/Version+Control.
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> 2013/11/23 unmesha sreeveni <unmeshabiju@gmail.com>
>>> >>>>>>>
>>> >>>>>>>> I want to go through the decision tree implementation in Mahout.
>>> >>>>>>>> I referred to Apache Mahout <http://mahout.apache.org/>:
>>> >>>>>>>>
>>> >>>>>>>> 6 Feb 2012 - Apache Mahout 0.6 released
>>> >>>>>>>> Apache Mahout has reached version 0.6. All developers are
>>> >>>>>>>> encouraged to begin using version 0.6. Highlights include:
>>> >>>>>>>> Improved Decision Tree performance and added support for
>>> >>>>>>>> regression problems
>>> >>>>>>>>
>>> >>>>>>>> Where can I find its source code and documentation?
>>> >>>>>>>>
>>> >>>>>>>> Should I download Mahout?
>>> >>>>>>>>
>>> >>>>>>>> --
>>> >>>>>>>> *Thanks & Regards*
>>> >>>>>>>>
>>> >>>>>>>> Unmesha Sreeveni U.B
>>> >>>>>>>>
>>> >>>>>>>> *Junior Developer*
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> --
>>> >>>>>>> ------
>>> >>>>>>> Yexi Jiang,
>>> >>>>>>> ECS 251,  yjian004@cs.fiu.edu
>>> >>>>>>> School of Computer and Information Science,
>>> >>>>>>> Florida International University
>>> >>>>>>> Homepage: http://users.cis.fiu.edu/~yjian004/


-- 
------
Yexi Jiang,
ECS 251,  yjian004@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/
