hadoop-user mailing list archives

From Majid Azimi <majid.merk...@gmail.com>
Subject When reduce function is used as combiner?
Date Fri, 07 Dec 2012 14:01:31 GMT
Hi guys,

When is the reduce function used as a combiner? Is it used as a combiner when
the iterable passed to the reduce function is large? Is that correct?

Is there a maximum size for that iterable? For example, if the iterable
holds more than 1000 values, will the reduce function be called more than
once for that key?

Another question: when the reduce function is used as a combiner, its input
key and value types must be the same as its output key and value types,
correct? If they differ, what happens? Is an exception thrown at runtime?

Fourth question: say the iterable is very large, so Hadoop adds the output
of reduce back to the iterable and passes it to reduce again, together with
the values that have not been processed yet. How does Hadoop know at which
point the output of the reduce function should be written to HDFS as the
real output? Is it when there are no more values to put into that iterable?
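
The scenario described in the questions above can be sketched in plain
Python. This is not Hadoop API code: the names reduce_fn,
reduce_with_combining and chunk_size are hypothetical, and splitting the
values into chunks of 1000 is only the assumption made in the question, not
confirmed Hadoop behaviour. The sketch shows why the output type would have
to match the input value type if the same function is applied more than once.

```python
from typing import Callable, Iterable, List


def reduce_fn(key: str, values: Iterable[int]) -> int:
    """Sum reducer: its output type (int) matches the value type it consumes."""
    return sum(values)


def reduce_with_combining(
    key: str,
    values: List[int],
    fn: Callable[[str, Iterable[int]], int],
    chunk_size: int,
) -> int:
    """Apply fn to fixed-size chunks, then once more to the partial results."""
    partials = [
        fn(key, values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]
    # The partial outputs are fed back in as inputs, so fn's output type
    # must be usable as an input value for the second pass to work.
    return fn(key, partials)


values = list(range(1, 2501))  # 2500 values for a single key
direct = reduce_fn("k", values)
combined = reduce_with_combining("k", values, reduce_fn, chunk_size=1000)
print(direct == combined)
```

With a sum, the chunked result agrees with the single-pass result. A
function such as a mean would not in general give the same answer after the
extra pass, because the mean of per-chunk means differs from the overall
mean when the chunks have unequal sizes.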
