hadoop-common-dev mailing list archives

From kenyh <ken.yihan1...@gmail.com>
Subject MultithreadedMapper
Date Thu, 26 Jul 2012 05:47:18 GMT

Multithreaded MapReduce introduces multithreaded execution within a map
task. In Hadoop 1.0.2, MultithreadedMapper runs the map function on multiple
threads. However, I found that synchronization is needed both for record
reading (reading the input key and value) and for result output. This
contention adds heavy overhead: it increases the execution time of a 50 MB
wordcount job from 40 seconds to 1 minute. Are there any optimizations for
the multithreaded mapper that reduce contention on input reading and output?
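
To make the pattern concrete, here is a minimal stand-alone sketch of the
locking described above. It is a model only, not the Hadoop source: the class
name SharedReaderWordCount and the method count are hypothetical, a plain
Iterator stands in for the record reader, a HashMap stands in for the output
collector, and the use of one lock for both reads and writes is an assumption
based on my reading of MultithreadedMapper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Stand-alone model of the locking pattern; no Hadoop classes are used.
class SharedReaderWordCount {

    // Runs `threads` workers over `lines`. Every record read and every
    // emitted (word, 1) pair takes the same lock.
    static Map<String, Integer> count(List<String> lines, int threads)
            throws InterruptedException {
        final Iterator<String> reader = lines.iterator();     // shared input
        final Map<String, Integer> output = new HashMap<>();  // shared output
        final Object lock = new Object();

        Runnable worker = () -> {
            while (true) {
                String line;
                synchronized (lock) {        // contended: one lock per record read
                    if (!reader.hasNext()) {
                        return;
                    }
                    line = reader.next();
                }
                // The "map" work itself runs outside the lock.
                for (String word : line.trim().split("\\s+")) {
                    if (word.isEmpty()) {
                        continue;
                    }
                    synchronized (lock) {    // contended: one lock per output pair
                        output.merge(word, 1, Integer::sum);
                    }
                }
            }
        };

        List<Thread> pool = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(worker);
            pool.add(t);
            t.start();
        }
        for (Thread t : pool) {
            t.join();  // after join, reading `output` from this thread is safe
        }
        return output;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> lines = Arrays.asList("a b a", "b c", "a");
        System.out.println(count(lines, 4));
    }
}
```

For wordcount the work done outside the lock is tiny compared with the number
of lock acquisitions (one per line read plus one per word written), which
would be consistent with the slowdown I measured.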
-- 
View this message in context: http://old.nabble.com/MultithreadedMapper-tp34213805p34213805.html
Sent from the Hadoop core-dev mailing list archive at Nabble.com.

