hadoop-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Harsh J <ha...@cloudera.com>
Subject Re: mapper combiner and partitioner for particular dataset
Date Sun, 03 Mar 2013 11:28:38 GMT
The MultipleInputs class only supports mapper configuration per dataset. It
does not let you specify a partitioner and combiner as well. You will need
a custom written "high level" partitioner and combiner that can create
multiple instances of sub-partitioners/combiners and use the most likely
one based on their input's characteristics (such as instance type, some
tag, config., etc.).


On Sun, Mar 3, 2013 at 4:07 PM, Vikas Jadhav <vikascjadhav87@gmail.com>wrote:

>
>
>
>
> Hello
>
> 1)  I have multiple types of datasets as input to my hadoop job
>
> i want write my own inputformat (Exa. MyTableInputformat)
>   and how to specify mapper partitioner combiner per dataset manner
>  I know MultiFileInputFormat class but if i want to asscoite combiner and
> partitioner class
> it wont help. it only sets mapper class for per dataset manner.
>
> 2)  Also i am looking MapTask.java file from source code
>
> just want to know where does mapper partitioner and combiner classes are
> set for particular filesplit
> while executing job
>
> Thank You
>
> --
> *
> *
> *
>
>  Thanx and Regards*
> * Vikas Jadhav*
>
>
>
> --
> *
> *
> *
>
> Thanx and Regards*
> * Vikas Jadhav*
>



-- 
Harsh J

Mime
View raw message