hadoop-mapreduce-user mailing list archives

From Steve Lewis <lordjoe2...@gmail.com>
Subject Algorithm for cross product
Date Wed, 22 Jun 2011 22:16:02 GMT
Assume I have two data sources, A and B, and an input format that can
generate key/value pairs for both.
I want an algorithm that generates, for each key K, the cross product of
all values in A having key K with all values in B having key K.
Currently I use a mapper to generate key/value pairs for A, and have the
reducer get all values in B with key K and hold them in memory.
It works, but it might not scale when a single key has many values in B.
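
To make the current approach concrete, here is a minimal single-process sketch in Python rather than actual Hadoop code. The function name cross_product_by_key and the representation of records as (key, value) tuples are assumptions made for illustration. It buffers the B values for each key in memory and streams the A values past them, which is where the memory concern comes from.

```python
from collections import defaultdict


def cross_product_by_key(a_records, b_records):
    """Yield (key, a_value, b_value) for every A/B pair that shares a key.

    Hypothetical helper: both inputs are iterables of (key, value) tuples.
    """
    # Hold every B value in memory, grouped by key. This buffer is the part
    # that may not scale when one key has a very large number of B values.
    b_by_key = defaultdict(list)
    for key, b_value in b_records:
        b_by_key[key].append(b_value)

    # Stream A once; pair each A value with the buffered B values for its key.
    # Keys that appear in only one source produce no output.
    for key, a_value in a_records:
        for b_value in b_by_key.get(key, ()):
            yield key, a_value, b_value


if __name__ == "__main__":
    a = [("k1", "a1"), ("k1", "a2"), ("k2", "a3")]
    b = [("k1", "b1"), ("k1", "b2"), ("k3", "b3")]
    for row in cross_product_by_key(a, b):
        print(row)
```

Only the B side is buffered here; the A side is consumed one record at a time, so memory use grows with the number of B values rather than with the size of the output.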

Any bright ideas?

Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com
