hadoop-user mailing list archives

From Bertrand Dechoux <decho...@gmail.com>
Subject Re: Programming Question / Joining Dataset
Date Wed, 26 Sep 2012 13:42:55 GMT
Use a container type, with generics to help developers: a single value
class that can wrap either an ADT or a BDT, so both reach the same reducer.
It works, but it is code that you would rather not write and maintain.
That's why solutions with a higher level of abstraction are needed.
As always, "premature optimization is the root of all evil", and in most
cases the safer bet is Hive, Pig, Cascading...
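
A minimal sketch of such a container, in plain Java without any Hadoop
dependency. The names Either and JoinSketch and the fields of ADT and BDT
are hypothetical, made up for illustration. In a real job the container
would presumably also implement Hadoop's Writable and serialize a tag ahead
of the payload so the reducer can tell the two sides apart; that part is
left out here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-ins for the custom datatypes from the question.
final class ADT {
    final String name;
    ADT(String name) { this.name = name; }
}

final class BDT {
    final int amount;
    BDT(int amount) { this.amount = amount; }
}

// Container holding exactly one of two types, so that the mappers for both
// datasets can emit the same value class.
final class Either<A, B> {
    private final A left;
    private final B right;

    private Either(A left, B right) {
        this.left = left;
        this.right = right;
    }

    static <A, B> Either<A, B> left(A a) {
        return new Either<>(Objects.requireNonNull(a), null);
    }

    static <A, B> Either<A, B> right(B b) {
        return new Either<>(null, Objects.requireNonNull(b));
    }

    boolean isLeft() { return left != null; }
    A getLeft() { return left; }
    B getRight() { return right; }
}

final class JoinSketch {
    // What a reducer could do for one key: split the values by side, then
    // pair every ADT with every BDT (an inner join on that key).
    // Assumption: both sides fit in memory for a single key.
    static List<String> joinForKey(String key, Iterable<Either<ADT, BDT>> values) {
        List<ADT> as = new ArrayList<>();
        List<BDT> bs = new ArrayList<>();
        for (Either<ADT, BDT> v : values) {
            if (v.isLeft()) {
                as.add(v.getLeft());
            } else {
                bs.add(v.getRight());
            }
        }
        List<String> out = new ArrayList<>();
        for (ADT a : as) {
            for (BDT b : bs) {
                out.add(key + "\t" + a.name + "\t" + b.amount);
            }
        }
        return out;
    }
}
```

The mapper for A' would wrap its values with Either.left(...) and the mapper
for B' with Either.right(...); since both emit the same key type, the
framework groups them into the same reduce call.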

Regards

Bertrand

On Wed, Sep 26, 2012 at 3:39 PM, Oliver B. Fischer <mailsink@swe-blog.net> wrote:

> Yes, I know Hive and also Pig. Both are suitable for my problem, but before
> starting with one of them I would simply like to know how to do it with
> pure MR. ;-)
>
> Bye,
>
> Oliver
>
>
> On 09/26/2012 03:36 PM, bharath vissapragada wrote:
>
>> Have you seen Hive [1]? It can join datasets over MapReduce. You can
>> also provide custom SerDes to read your file format (to avoid
>> pre-processing) and create your own data types (e.g. maps of
>> maps, arrays, etc.).
>>
>> [1] https://cwiki.apache.org/Hive/home.html
>>
>> On Wed, Sep 26, 2012 at 6:49 PM, Oliver B. Fischer
>> <mailsink@swe-blog.net> wrote:
>>
>>     Hi all,
>>
>>     I have to join two large datasets A and B. I preprocess both datasets
>>     by parsing the source text files and creating custom datatypes ADT
>>     and BDT out of them.
>>
>>     Now I have to join these data. Both datasets A' and B' already
>>     have the same datatype as key. But how can I pass both custom
>>     datatypes ADT and BDT to the same reducer instance for joining?
>>
>>     Bye,
>>
>>     Oliver
>>
>>
>>
>>
>> --
>> Regards,
>> Bharath .V
>> w: http://researchweb.iiit.ac.in/~bharath.v
>>
>


-- 
Bertrand Dechoux
