hadoop-general mailing list archives

From MATTHEW DOERING <doer...@optonline.net>
Subject RE: Applications creates bigger output than input?
Date Fri, 29 Apr 2011 14:51:23 GMT
If you are looking for exponential output volumes, then run a market-basket
test. You will need only a few thousand baskets with 100 items or so. Set
minimum support very low (5%) and confidence at 80%, and you should get a
flood of data. Any frequent itemset (FIS) algorithm should do the job.
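
A minimal sketch of why low support inflates the output, assuming a naive level-wise (Apriori-style) search over synthetic baskets. The function names make_baskets and frequent_itemsets, the random data generator, and the basket dimensions in the demo are hypothetical choices for illustration; only the 5% support figure comes from the message above. The sketch stops at counting frequent itemsets and does not generate the confidence-filtered rules, which would add further output on top.

```python
import random
from collections import Counter
from itertools import combinations


def make_baskets(n_baskets, n_items, basket_size, seed=0):
    """Hypothetical generator: each basket is a random set of item ids."""
    rng = random.Random(seed)
    return [frozenset(rng.sample(range(n_items), basket_size))
            for _ in range(n_baskets)]


def frequent_itemsets(baskets, min_support, max_size):
    """Return {itemset: count} for itemsets of up to max_size items that
    appear in at least min_support (a fraction) of the baskets."""
    min_count = min_support * len(baskets)
    result = {}

    # Level 1: count single items.
    item_counts = Counter(item for basket in baskets for item in basket)
    current = set()
    for item, count in item_counts.items():
        if count >= min_count:
            itemset = frozenset([item])
            current.add(itemset)
            result[itemset] = count

    size = 1
    while current and size < max_size:
        size += 1
        frequent_items = set().union(*current)
        counts = Counter()
        for basket in baskets:
            kept = sorted(basket & frequent_items)
            for combo in combinations(kept, size):
                candidate = frozenset(combo)
                # Apriori pruning: every (size - 1)-subset must be frequent.
                if all(candidate - {x} in current for x in combo):
                    counts[candidate] += 1
        current = {s for s, c in counts.items() if c >= min_count}
        for itemset in current:
            result[itemset] = counts[itemset]
    return result


if __name__ == "__main__":
    # Illustrative sizes only, kept small so the naive search finishes fast.
    baskets = make_baskets(n_baskets=200, n_items=25, basket_size=20)
    found = frequent_itemsets(baskets, min_support=0.05, max_size=3)
    print(len(baskets), "baskets ->", len(found), "frequent itemsets")
```

Raising max_size or lowering min_support in the demo lets more candidate itemsets survive each level, which is the growth the suggestion above relies on.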


On Fri, Apr 29, 2011 at 8:02 AM, elton sky wrote:

> One of the assumptions MapReduce makes, I think, is that the size of a map's
> output is smaller than its input, although many applications, such as sort
> and merge, produce output the same size as their input.
> For benchmarking purposes, I am looking for some non-trivial, real-life
> applications that create *bigger* output than their input. A trivial example
> I can think of is a cross join...
>
> I would really appreciate it if you shared your knowledge with me.
