hadoop-common-user mailing list archives

From Mohit Anchlia <mohitanch...@gmail.com>
Subject Re: WholeFileInputFormat in hadoop
Date Mon, 30 Jun 2014 20:14:58 GMT
Have you looked at this post:

http://stackoverflow.com/questions/15863566/need-assistance-with-implementing-dbscan-on-map-reduce/15863699#15863699
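
For reference, the distance step described in the quoted message below can be sketched in a few lines of plain Python (no Hadoop involved). This is only a brute-force illustration on the four sample points. The `eps` value and the `region_query` name are assumptions, since the thread gives neither, and this is not necessarily the approach taken in the linked post.

```python
import math

def region_query(points, index, eps):
    """Return indices of all points within eps of points[index] (itself included)."""
    px, py = points[index]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]

# Sample input from the quoted message.
points = [(5, 6), (8, 2), (4, 5), (4, 6)]
eps = 1.5  # assumed radius; the thread does not specify one

# Epsilon-neighbourhood of every point: O(n^2) comparisons, all in memory.
neighbours = {i: region_query(points, i, eps) for i in range(len(points))}
```

With these assumed values, (5,6), (4,5) and (4,6) are mutual neighbours and (8,2) has only itself. The quadratic, all-in-memory comparison is what makes the whole-file-in-one-mapper approach run out of heap on large inputs.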

On Sun, Jun 29, 2014 at 9:01 PM, unmesha sreeveni <unmeshabiju@gmail.com>
wrote:

> I am trying to implement the DBSCAN algorithm. I referred to the algorithm
> in "Data Mining - Concepts and Techniques (3rd Ed)", chapter 10, page 474.
> In this algorithm we need to find the distance between each pair of points.
> Say my sample input is
> 5,6
> 8,2
> 4,5
> 4,6
>
> So in DBSCAN we have to pick one element and then find the distance between
> it and all the others.
>
> While implementing this, I am not able to get the whole file in map
> in order to find the distances.
> I tried these approaches:
> 1. Used WholeFileInputFormat and did the entire algorithm in the map itself.
> I don't think this is a good approach (and it ended up with a heap space
> error).
> 2. This one is not implemented, as I thought it was not feasible:
>   - Read one line of the input data set in the driver and write it to a new
> file (say, centroid).
>  - This centroid can be read in setup. The map calculates the distance
> and emits the data that satisfies the DBSCAN condition as
> map(id, epsilonneighbr). In the reducer we can aggregate all the
> epsilon neighbours of (5,6) that come from different maps, and then
> find the neighbours of each epsilon neighbour.
>  - The next iteration has to do the same again: read the input file and
> find a node which is not visited....
> If the input is a 1GB file, the MR job executes as many times as the total
> number of records.
>
>
> Can anyone suggest a better way to do this?
>
> I hope the use case is understandable. If not, please tell me and I will
> explain further.
>
>
> --
> *Thanks & Regards *
>
>
> *Unmesha Sreeveni U.B*
> *Hadoop, Bigdata Developer*
> *Center for Cyber Security | Amrita Vishwa Vidyapeetham*
> http://www.unmeshasreeveni.blogspot.in/
>
>
>
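
One common way to avoid both the whole-file mapper and the one-job-per-record loop is to partition the points into grid cells of side `eps` and copy each point to its eight adjacent cells, so that each reducer sees every candidate neighbour for the points it owns. The sketch below simulates that in plain Python. `map_point`, `reduce_cell` and `run` are hypothetical names, the `eps` value is assumed, and the code covers only the neighbour-finding step, not the cluster expansion of DBSCAN.

```python
import math
from collections import defaultdict

def cell_of(point, eps):
    """Grid cell (side length eps) that contains the point."""
    return (math.floor(point[0] / eps), math.floor(point[1] / eps))

def map_point(point, eps):
    """Emit the point to its own cell and the 8 adjacent cells."""
    cx, cy = cell_of(point, eps)
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            is_home = (dx == 0 and dy == 0)
            yield (cx + dx, cy + dy), (point, is_home)

def reduce_cell(values, eps):
    """For each point owned by this cell, list the received points within eps."""
    home = [p for p, is_home in values if is_home]
    everyone = [p for p, _ in values]
    for p in home:
        yield p, sorted(q for q in everyone
                        if q != p and math.hypot(p[0] - q[0], p[1] - q[1]) <= eps)

def run(points, eps):
    groups = defaultdict(list)  # stands in for the shuffle phase
    for p in points:
        for key, value in map_point(p, eps):
            groups[key].append(value)
    result = {}
    for values in groups.values():
        for p, nbrs in reduce_cell(values, eps):
            result[p] = nbrs
    return result

# Sample input from the quoted message; eps is an assumed value.
result = run([(5, 6), (8, 2), (4, 5), (4, 6)], 1.5)
```

Because any two points within `eps` of each other lie in the same or adjacent cells, a single pass finds all epsilon-neighbourhoods. The cost is that each point is shuffled nine times, and duplicate coordinates are treated as one point in this sketch.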
