hadoop-common-user mailing list archives

From "Runping Qi" <runp...@yahoo-inc.com>
Subject anybody use stream combiner feature?
Date Fri, 20 Apr 2007 23:08:59 GMT

In the current framework, each mapper task creates one combiner object
per partition per spill.

This is very costly: each time a combiner is created, a new process is
actually created to run the combiner executable. I suspect a job with a
stream combiner may not run much faster than one without it, and it may
even be slower. Thus, I doubt the value of supporting such a feature.
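To make the cost concrete, here is a rough back-of-the-envelope sketch in Python. It assumes exactly one combiner process per partition per spill, as described above. The function name, the job sizes, and the 50 ms startup cost are all hypothetical values chosen for illustration, not measurements.

```python
def combiner_process_launches(map_tasks, spills_per_task, partitions):
    """Count combiner processes started across a job, assuming one
    process per partition per spill in every map task."""
    return map_tasks * spills_per_task * partitions


# Hypothetical job: 100 map tasks, 4 spills each, 50 partitions.
launches = combiner_process_launches(100, 4, 50)

# Hypothetical fixed startup cost per process, in milliseconds.
startup_ms = 50
total_startup_ms = launches * startup_ms

print(launches)          # 20000
print(total_startup_ms)  # 1000000
```

Under these assumed numbers the job would pay for 20,000 process launches before any combining work is done, which is the overhead that could cancel out the savings in shuffle volume.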

I would like to know who uses stream combiners in real applications, and
how they use them.

Could these uses be satisfied by the framework providing a set of
generic combiners (such as Abacus)?
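As an illustration of what a generic combiner might look like, here is a minimal Python sketch of a sum-by-key aggregator that runs in-process, so no external executable is launched per partition per spill. This is not Abacus's actual API. The function name and the (key, value) record shape are assumptions made for the example.

```python
from collections import defaultdict


def sum_combiner(records):
    """Combine (key, value) pairs by summing the values for each key.

    Hypothetical generic combiner: it runs inside the calling process
    instead of starting a separate combiner executable.
    """
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    # Sort by key so the output order is deterministic.
    return sorted(totals.items())


# One spill's worth of map output for a single partition.
spill = [("apple", 1), ("pear", 2), ("apple", 3)]
print(sum_combiner(spill))  # [('apple', 4), ('pear', 2)]
```

If most stream combiners in practice are simple aggregations like this (sum, count, min, max), a small built-in set could cover them without the per-spill process cost.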





