hadoop-mapreduce-user mailing list archives

From unmesha sreeveni <unmeshab...@gmail.com>
Subject Re: How to partition a file to smaller size for performing KNN in hadoop mapreduce
Date Thu, 15 Jan 2015 09:05:39 GMT
Is there any way?
Waiting for a reply. I have posted the question everywhere, but no one is
responding.
I feel this is the right place to ask, since some of you may have come
across the same issue and got stuck.

On Thu, Jan 15, 2015 at 12:34 PM, unmesha sreeveni <unmeshabiju@gmail.com>
wrote:

> Yes, one of my friends is implementing the same. I know global sharing of
> data is not possible across Hadoop MapReduce, but I need to check whether
> it can be done somehow in Hadoop MapReduce, because I found some papers
> on KNN in Hadoop as well.
> I am also trying to compare the performance.
>
> Hope some pointers can help me.
>
>
> On Thu, Jan 15, 2015 at 12:17 PM, Ted Dunning <ted.dunning@gmail.com>
> wrote:
>
>>
>> Have you considered implementing this using something like Spark?  That
>> could be much easier than raw MapReduce.
>>
>> On Wed, Jan 14, 2015 at 10:06 PM, unmesha sreeveni <unmeshabiju@gmail.com
>> > wrote:
>>
>>> In a KNN-like algorithm we need to load the model data into a cache for
>>> predicting the records.
>>>
>>> Here is the example for KNN.
>>>
>>>
>>> [image: Inline image 1]
>>>
>>> So if the model is a large file, say 1 or 2 GB, we will not be able to
>>> load it into the distributed cache.
>>>
>>> One way is to split/partition the model result into several files,
>>> perform the distance calculation for all records in each file, and then
>>> find the minimum distance and the most frequent class label to predict
>>> the outcome.
>>>
>>> How can we partition the file and perform the operation on these
>>> partitions?
>>>
>>> ie  1st record <Distance> partition1, partition2, ...
>>>      2nd record <Distance> partition1, partition2, ...
>>>
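A minimal sketch of the merge step described above (all names here are hypothetical, not from the original mail): each model partition reports, for a given test record, its local k nearest neighbours as (distance, label) pairs; merging the per-partition candidates by distance and taking a majority vote over the global top-k gives the same prediction as scanning the whole model at once.

```java
import java.util.*;

public class PartitionedKnnMerge {

    // One candidate neighbour emitted by a partition for a test record.
    record Neighbour(double distance, String label) {}

    static String predict(List<List<Neighbour>> perPartition, int k) {
        // Collect every partition's local candidates, sort globally by distance.
        List<Neighbour> all = new ArrayList<>();
        perPartition.forEach(all::addAll);
        all.sort(Comparator.comparingDouble(Neighbour::distance));

        // Majority vote over the global k nearest neighbours.
        Map<String, Integer> votes = new HashMap<>();
        for (Neighbour n : all.subList(0, Math.min(k, all.size()))) {
            votes.merge(n.label(), 1, Integer::sum);
        }
        return Collections.max(votes.entrySet(),
                Map.Entry.comparingByValue()).getKey();
    }

    public static void main(String[] args) {
        // Two partitions, each contributing its local nearest neighbours.
        List<List<Neighbour>> parts = List.of(
                List.of(new Neighbour(0.5, "A"), new Neighbour(2.0, "B")),
                List.of(new Neighbour(0.7, "A"), new Neighbour(1.1, "B")));
        // Top 3 by distance: (0.5, A), (0.7, A), (1.1, B) -> majority is A.
        System.out.println(predict(parts, 3));
    }
}
```

In a real job this merge would sit in the reducer, keyed by the test-record id, with each mapper handling one model partition.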
>>> This is what came to my mind.
>>>
>>> Is there any better way?
>>>
>>> Any pointers would help me.
>>>
>>
>
>


-- 
*Thanks & Regards *


*Unmesha Sreeveni U.B*
*Hadoop, Bigdata Developer*
*Centre for Cyber Security | Amrita Vishwa Vidyapeetham*
http://www.unmeshasreeveni.blogspot.in/
