hadoop-common-user mailing list archives

From Ted Dunning <tdunn...@veoh.com>
Subject Re: help on hadoop
Date Wed, 25 Jul 2007 16:19:14 GMT

Remember that you can do more than one map/reduce step.

Suppose that you want to implement something that looks like this:

Select f(x), g(y), z from table1 join table2 using (j1, j2) where z > 0

Also assume that table1 and table2 have lots of columns besides x, y and z.

You can implement this with a single map/reduce step in which the map step
reads both table1 and table2 as inputs.  The map step emits nothing for a
record with z <= 0; otherwise it emits (j1, j2) as the key and whichever of
f(x), g(y) and z that record can supply as the value.  The reduce function
receives records from table1 and table2 all mixed together, but grouped
according to the join key, so it can combine them into the desired output.
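
The join described above can be sketched in plain Python, with no Hadoop
API involved.  This is only an in-memory illustration under assumptions that
are not in the query itself: x is taken to live in table1, y and z in table2,
f and g are made-up placeholder functions, and the names map_record, shuffle,
reduce_group and run_join are hypothetical.

```python
from collections import defaultdict

def f(x):
    # Hypothetical stand-in for f in the query.
    return x * 2

def g(y):
    # Hypothetical stand-in for g in the query.
    return y.upper()

def map_record(source, row):
    """Emit (join key, tagged value) pairs for one input row."""
    key = (row["j1"], row["j2"])
    if source == "table1":
        # Assumption: x is a table1 column. Unused columns are dropped here.
        yield key, ("table1", f(row["x"]))
    elif row["z"] > 0:
        # Assumption: y and z are table2 columns, so the WHERE test runs here.
        yield key, ("table2", g(row["y"]), row["z"])

def shuffle(pairs):
    """Group map output by key, as the sort phase would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_group(key, values):
    """Pair every table1 value with every table2 value sharing the join key."""
    left = [v[1] for v in values if v[0] == "table1"]
    right = [(v[1], v[2]) for v in values if v[0] == "table2"]
    for fx in left:
        for gy, z in right:
            yield fx, gy, z

def run_join(table1, table2):
    pairs = []
    for row in table1:
        pairs.extend(map_record("table1", row))
    for row in table2:
        pairs.extend(map_record("table2", row))
    output = []
    for key, values in sorted(shuffle(pairs).items()):
        output.extend(reduce_group(key, values))
    return output
```

A join key whose table2 rows all fail z > 0 produces no output, because the
reducer for that key sees only table1 values.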

If you add a "group by y, z" clause, then f has to be an aggregate function
over a set of values of x (like max or average, but you get to write it).
You should change the map function so that the key is now (j1, j2, y, z) and
the value is x, g(y), z.  Then change the reduce function to collect the
values of x and compute f over them (passing g(y) and z through unchanged).
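
Here is a minimal sketch of the group-by variant, again in plain Python
rather than the Hadoop API.  It assumes the input rows already carry j1, j2,
x, y and z together (for example, the output of an earlier join step), that f
is max, and that g upper-cases a string; those choices and the names map_row,
reduce_group and run_group_by are made up for illustration.

```python
from collections import defaultdict

def g(y):
    # Hypothetical stand-in for g in the query.
    return y.upper()

def f(xs):
    # Hypothetical aggregate: f is assumed to be max over the grouped x values.
    return max(xs)

def map_row(row):
    """Emit ((j1, j2, y, z), (x, g(y), z)) for rows that pass the WHERE test."""
    if row["z"] > 0:
        key = (row["j1"], row["j2"], row["y"], row["z"])
        yield key, (row["x"], g(row["y"]), row["z"])

def reduce_group(key, values):
    """Aggregate x over the group; g(y) and z are the same for every value."""
    xs = [v[0] for v in values]
    _, gy, z = values[0]
    return f(xs), gy, z

def run_group_by(rows):
    groups = defaultdict(list)
    for row in rows:
        for key, value in map_row(row):
            groups[key].append(value)
    # Sorting the keys mimics the ordered delivery of groups to the reducer.
    return [reduce_group(key, values) for key, values in sorted(groups.items())]
```

Because y and z are part of the key, every value in a group carries the same
g(y) and z, so the reducer can take them from the first value.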

Hope this helps.

The key here is that the map output can be polymorphic (records from
different tables can share the same key space), so you can use the sort
phase between map and reduce to do the join.
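
To make the role of the sort phase concrete, here is a small plain-Python
sketch (not Hadoop code) in which map output from two tables is tagged with
its source, sorted by key, and grouped.  The tag strings and the function
name sort_and_group are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def sort_and_group(map_output):
    """Sort (key, value) pairs by key and collect the values for each key."""
    # sorted() is stable, so values keep their emission order within a key.
    ordered = sorted(map_output, key=itemgetter(0))
    return [
        (key, [value for _, value in group])
        for key, group in groupby(ordered, key=itemgetter(0))
    ]
```

After this step, differently shaped values from both tables sit side by side
under each join key, which is what the reduce function relies on.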

On 7/25/07 4:05 AM, "meda vijendharreddy" <medavijju@yahoo.co.in> wrote:

> Hi,
>    I am new to Hadoop and want to use Hadoop in my
> application.
> Currently I want to simulate something like
> FieldSelectionMapReduce  Class can be used to reduce
> the number of columns (which is like  SELECT blah blah )
> FROM , if I have more than 2 tables, then I can do
> join on those and if it is single table then I can
> achieve  easily.
> How can I achieve the where condition functionality?
> Please help me on this. I have no clues at this
> moment.
> Thanks in Advance,
> -----
> Thanks
> Vijen
