Subject: Re: Programming Question / Joining Dataset
From: Kai Voigt <k@123.org>
Date: Wed, 26 Sep 2012 15:42:06 +0200
Cc: bharath vissapragada <bharathvissapragada1990@gmail.com>
To: user@hadoop.apache.org

The design pattern for this is called "Reduce-side Join". Enter it into
Google and you will get a lot of details.
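
To make the idea concrete, here is a minimal in-memory sketch of a
reduce-side join. It is plain Python, not Hadoop API code: the function
names (map_a, map_b, shuffle, reduce_join) and the sample records are
made up for illustration. Each mapper tags its output with the dataset it
came from, the shuffle groups all tagged values by the shared key, and the
reducer separates the values by tag and pairs them up (an inner join here).

```python
from collections import defaultdict
from itertools import product


def map_a(record):
    """Mapper for dataset A: emit (join key, value tagged with its source)."""
    key, adt = record
    return key, ("A", adt)


def map_b(record):
    """Mapper for dataset B: same key type, different tag."""
    key, bdt = record
    return key, ("B", bdt)


def shuffle(pairs):
    """Group mapper output by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, tagged in pairs:
        groups[key].append(tagged)
    return groups


def reduce_join(key, tagged_values):
    """Reducer: split the values by tag, then pair every A with every B."""
    a_side = [value for tag, value in tagged_values if tag == "A"]
    b_side = [value for tag, value in tagged_values if tag == "B"]
    return [(key, a, b) for a, b in product(a_side, b_side)]


def reduce_side_join(dataset_a, dataset_b):
    """Run map, shuffle and reduce over both datasets (inner join)."""
    mapped = [map_a(r) for r in dataset_a] + [map_b(r) for r in dataset_b]
    joined = []
    for key, values in sorted(shuffle(mapped).items()):
        joined.extend(reduce_join(key, values))
    return joined


# Hypothetical sample data: (key, payload) pairs standing in for ADT and BDT.
dataset_a = [(1, "adt-1"), (2, "adt-2"), (3, "adt-3")]
dataset_b = [(1, "bdt-1"), (1, "bdt-1b"), (3, "bdt-3"), (4, "bdt-4")]
result = reduce_side_join(dataset_a, dataset_b)
```

In a real MapReduce job the tag would have to travel inside the value
type that both mappers emit, for example a custom Writable wrapping either
an ADT or a BDT; that part is left out of this sketch.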

Kai

On 26.09.2012 at 15:39, "Oliver B. Fischer" <mailsink@swe-blog.net> wrote:

> Yes, I know Hive and also Pig. Both are suitable for my problem, but
> before starting with one of them I would simply like to know how to do
> it with pure MR. ;-)
>
> Bye,
>
> Oliver
>
> On 09/26/2012 03:36 PM, bharath vissapragada wrote:
>> Have you seen Hive [1]? It can join datasets over MapReduce. You can
>> also provide your own custom SerDes to read your file format (to avoid
>> pre-processing) and create your own data types (e.g. maps of maps,
>> arrays, etc.).
>>
>> [1] https://cwiki.apache.org/Hive/home.html
>>
>> On Wed, Sep 26, 2012 at 6:49 PM, Oliver B. Fischer
>> <mailsink@swe-blog.net> wrote:
>>
>>    Hi all,
>>
>>    I have to join two large datasets, A and B. I preprocess both
>>    datasets by parsing the source text files and creating custom
>>    datatypes ADT and BDT out of them.
>>
>>    Now I have to join these data. Both datasets A' and B' already
>>    have the same datatype as key. But how can I pass both custom
>>    datatypes ADT and BDT to the same reducer instance for joining?
>>
>>    Bye,
>>
>>    Oliver
>>
>> --
>> Regards,
>> Bharath .V
>> w:http://researchweb.iiit.ac.in/~bharath.v
>

-- 
Kai Voigt
k@123.org