hadoop-user mailing list archives

From "Oliver B. Fischer" <mails...@swe-blog.net>
Subject Re: Programming Question / Joining Dataset
Date Wed, 26 Sep 2012 13:39:42 GMT
Yes, I know Hive and also Pig. Both are suitable for my problem, but
before starting with one of them I would simply like to know how to do
it with pure MR. ;-)

Bye,

Oliver

On 09/26/2012 03:36 PM, bharath vissapragada wrote:
> Have you seen Hive[1]? It can join datasets over MapReduce. You can
> also provide custom SerDes to read your file format (avoiding
> pre-processing) and create your own data types (e.g. map of maps,
> arrays, etc.).
>
> [1] https://cwiki.apache.org/Hive/home.html
>
> On Wed, Sep 26, 2012 at 6:49 PM, Oliver B. Fischer
> <mailsink@swe-blog.net> wrote:
>
>     Hi all,
>
>     I have to join two large datasets A and B. I preprocess both datasets
>     by parsing the source text files and creating custom datatypes ADT
>     and BDT out of them.
>
>     Now I have to join these data. Both datasets A' and B' already
>     have the same datatype as key. But how can I pass both custom
>     datatypes ADT and BDT to the same reducer instance for joining?
>
>     Bye,
>
>     Oliver
>
>
>
>
> --
> Regards,
> Bharath .V
> w: http://researchweb.iiit.ac.in/~bharath.v
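
For the pure-MR question quoted above, one common approach is a reduce-side join: every mapper wraps its value in a single tagged wrapper type, so ADT and BDT records with the same key arrive at the same reduce call, which splits them by tag again and pairs them up. The following is only a minimal sketch of that idea in plain Java, with no Hadoop classes. The names ReduceSideJoinSketch, Tagged, mapA, mapB and run are hypothetical, the String keys and payloads are stand-ins for the real key type and the ADT/BDT fields, and a TreeMap plays the role of the shuffle. In an actual Hadoop job the wrapper would presumably have to implement Writable (Hadoop's GenericWritable is one possible base class), and MultipleInputs could be used to attach one mapper per dataset; treat both as assumptions to check against your Hadoop version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of a reduce-side join; no Hadoop classes are used.
class ReduceSideJoinSketch {

    // Hypothetical stand-ins for the custom datatypes ADT and BDT.
    static final class ADT { final String payload; ADT(String p) { payload = p; } }
    static final class BDT { final String payload; BDT(String p) { payload = p; } }

    // Tagged wrapper: exactly one of a / b is non-null.
    // This is the single value type that both "mappers" emit.
    static final class Tagged {
        final ADT a;
        final BDT b;
        private Tagged(ADT a, BDT b) { this.a = a; this.b = b; }
        static Tagged ofA(ADT a) { return new Tagged(a, null); }
        static Tagged ofB(BDT b) { return new Tagged(null, b); }
    }

    // Stand-in for the shuffle: values grouped by key, keys sorted.
    private final Map<String, List<Tagged>> grouped = new TreeMap<>();

    // "Mapper" for dataset A': wrap the value and emit it under its key.
    void mapA(String key, ADT value) { emit(key, Tagged.ofA(value)); }

    // "Mapper" for dataset B': same key type, same wrapper type.
    void mapB(String key, BDT value) { emit(key, Tagged.ofB(value)); }

    private void emit(String key, Tagged value) {
        grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // "Reducer": split the values by tag, then emit the inner-join
    // cross product of the A and B records that share this key.
    static List<String> reduce(String key, List<Tagged> values) {
        List<ADT> as = new ArrayList<>();
        List<BDT> bs = new ArrayList<>();
        for (Tagged t : values) {
            if (t.a != null) as.add(t.a); else bs.add(t.b);
        }
        List<String> out = new ArrayList<>();
        for (ADT x : as) {
            for (BDT y : bs) {
                out.add(key + "\t" + x.payload + "\t" + y.payload);
            }
        }
        return out;
    }

    // Run the "reduce phase" over every key group.
    List<String> run() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<Tagged>> e : grouped.entrySet()) {
            out.addAll(reduce(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        ReduceSideJoinSketch job = new ReduceSideJoinSketch();
        job.mapA("k1", new ADT("a1"));
        job.mapA("k2", new ADT("a2"));
        job.mapB("k1", new BDT("b1"));
        job.mapB("k3", new BDT("b3"));
        // Only k1 occurs in both datasets, so only k1 yields a joined row.
        job.run().forEach(System.out::println);
    }
}
```

Note that this reducer buffers all values of one key in memory before pairing them. For keys with very many records, a secondary sort on the tag (so that all A records arrive before the B records) would avoid buffering both sides, at the cost of a composite key and a custom grouping comparator.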
