hadoop-common-user mailing list archives

From Bejoy Ks <bejoy.had...@gmail.com>
Subject Re: Multiple Mappers for Multiple Tables
Date Mon, 05 Dec 2011 20:06:15 GMT
If I understand your requirement correctly, you need to pull data from
multiple RDBMS sources and join it, possibly with some further custom
operations on top. You don't need to write custom MapReduce code for
this unless you really have to. You can achieve the same in two easy
steps:
- Import the data from each RDBMS into Hive using Sqoop (import)
- Use Hive to do the join and any further processing on this data
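
As a rough sketch of those two steps, the Python snippet below only
builds the Sqoop command lines and the Hive join query as strings; it
does not run anything. The JDBC URLs, user name, table names and join
key are all made-up placeholders, and the helper functions are
hypothetical names for illustration. The Sqoop options used (--connect,
--username, -P, --table, --hive-import, --hive-table) are the standard
import options, but please check them against your Sqoop version.

```python
import shlex


def sqoop_import_args(jdbc_url, username, table, hive_table):
    """Build the argument list for one `sqoop import` into Hive.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,       # JDBC URL of the source database
        "--username", username,
        "-P",                        # prompt for the password on the console
        "--table", table,            # source table in the RDBMS
        "--hive-import",             # load the imported data into Hive
        "--hive-table", hive_table,  # target Hive table name
    ]


def hive_join_query(left, right, key):
    """Build a simple HiveQL equi-join over two imported tables."""
    return f"SELECT * FROM {left} a JOIN {right} b ON (a.{key} = b.{key})"


# Placeholder sources: two tables living in two different databases.
sources = [
    ("jdbc:mysql://dbhost1/sales", "etl", "orders", "orders"),
    ("jdbc:postgresql://dbhost2/crm", "etl", "customers", "customers"),
]

# Step 1: one Sqoop import per source table.
commands = [shlex.join(sqoop_import_args(*src)) for src in sources]

# Step 2: join the imported tables in Hive.
query = hive_join_query("orders", "customers", "customer_id")

for command in commands:
    print(command)
print(query)
```

Each source table gets its own import, so tables from different
databases simply become separate Hive tables, and the join itself is
left to Hive rather than to hand-written mappers.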

Hope it helps!


On Tue, Dec 6, 2011 at 12:13 AM, Justin Vincent <justinvf@gmail.com> wrote:

> I would like to join some db tables, possibly from different databases, in a
> MR job.
> I would essentially like to use MultipleInputs, but that seems file
> oriented. I need a different mapper for each db table.
> Suggestions?
> Thanks!
> Justin Vincent
