hadoop-common-user mailing list archives

From Da Zheng <zhengda1...@gmail.com>
Subject monitor the hadoop cluster
Date Thu, 11 Nov 2010 19:52:23 GMT

I wrote a MapReduce program and ran it on a 3-node Hadoop cluster, but
its running time varies a lot, from 2 minutes to 3 minutes. I want to
understand how much time is spent in the map phase and in the reduce
phase, so that I can find where the performance can be improved.

Also, the current input data is sorted, so I wrote a custom
partitioner to reduce the data shuffled across the network. I need some
way to observe the data movement.

I know the Hadoop community developed Chukwa for monitoring, but it
seems very immature right now. I wonder how people currently monitor
their Hadoop clusters. Is there a good way to solve the problems listed
above?

