hadoop-common-user mailing list archives

From Luke Lu <...@vicaya.com>
Subject Re: monitor the hadoop cluster
Date Thu, 11 Nov 2010 20:31:46 GMT
The job detail page from the jobtracker shows a lot of information
about any given job: the start/finish times of each task and various
counters (such as time spent in each phase, input/output
bytes/records, etc.)

For monitoring the aggregate performance of a cluster, the Hadoop
metrics system can send a lot of information to standard monitoring
tools (Ganglia etc.) to graph and monitor various aggregate metrics,
such as running/waiting maps and reduces.
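As a concrete sketch, hooking the metrics system up to Ganglia in this era of Hadoop is typically done in conf/hadoop-metrics.properties; the gmond host and port below are placeholders, not values from this thread:

```properties
# Send dfs and mapred metrics to a Ganglia gmond every 10 seconds.
# Replace gmond.example.com:8649 with your collector's address.
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
dfs.period=10
dfs.servers=gmond.example.com:8649

mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
mapred.period=10
mapred.servers=gmond.example.com:8649
```

After restarting the daemons, the aggregate map/reduce metrics show up in Ganglia's graphs alongside the usual host-level metrics.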


On Thu, Nov 11, 2010 at 12:23 PM, Da Zheng <zhengda1936@gmail.com> wrote:
> Hello,
> On 11/11/2010 03:00 PM, David Rosenstrauch wrote:
>> On 11/11/2010 02:52 PM, Da Zheng wrote:
>>> Hello,
>>> I wrote a MapReduce program and ran it on a 3-node hadoop cluster, but
>>> its running time varies a lot, from 2 minutes to 3 minutes. I want to
>>> understand how time is used by the map phase and the reduce phase, and
>>> hope to find the place to improve the performance.
>>> Also the current input data is sorted, so I wrote a customized
>>> partitioner to reduce the data shuffling across the network. I need some
>>> means to help me observe the data movement.
>>> I know hadoop community developed chukwa for monitoring, but it seems
>>> very immature right now. I wonder how people monitor hadoop cluster
>>> right now. Is there a good way to solve my problems listed above?
>>> Thanks,
>>> Da
>> Just my $0.02, but IMO you're working on some faulty assumptions here.
>> Hadoop is explicitly *not* a real-time system, and so it's not reasonable
>> for you to expect to have such fine-grained control over its processing
>> speed.  It's a distributed system, where many things can affect how long a
>> job takes, such as:  how many nodes in the cluster, how many other jobs are
>> running, the technical specs of each node, whether/how Hadoop implements
>> "speculative execution" during your job, whether your job has any task
>> failures/retries, whether you have any hardware failures during your job,
>> ......
>> You can have control over performance on a Hadoop cluster, via things like
>> adding nodes, tweaking some config params, etc.  But you're much more likely
>> to be able to make performance improvements like cutting a job down from 3
>> hours to 2 hours, not from 3 minutes to 2 minutes. You're just not going to
>> get that kind of fine-grained control with Hadoop.  Nor should you be
>> looking for it, IMO.  If that's what you want, then Hadoop is probably the
>> wrong tool for your job.
> I don't really try to cut the time from 3 minutes to 2 minutes. I was asking
> whether I can have some tools to monitor the hadoop cluster and possibly
> find the spot for performance improvement. I'm very new to hadoop, and I
> hope to have a good view of how time is used by each mapper and reducer, so
> I'll have more confidence to run it on a much larger dataset.
> More importantly, I want to see how much data shuffling can be saved if I
> use the customized partitioner.
> Best,
> Da
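For the customized-partitioner question above, the core decision is just mapping a key to a reducer index. A range-based version for pre-sorted keys can be sketched in plain Java; the Hadoop Partitioner superclass is omitted so this stands alone, and the split points are hypothetical examples, not values from this thread:

```java
// Sketch of range-based partitioning logic for sorted keys.
// In a real Hadoop job this logic would live inside a class extending
// org.apache.hadoop.mapreduce.Partitioner; the split points here are
// hypothetical, chosen only to illustrate the idea.
public class RangePartitionSketch {
    // Upper bounds (exclusive) for each partition except the last.
    static final int[] SPLIT_POINTS = {100, 200, 300};

    // Mirrors Partitioner.getPartition(key, value, numPartitions):
    // keys below 100 go to reducer 0, keys in [100, 200) to reducer 1,
    // and so on; anything past the last split goes to the last reducer.
    static int getPartition(int key, int numPartitions) {
        for (int i = 0; i < SPLIT_POINTS.length && i < numPartitions - 1; i++) {
            if (key < SPLIT_POINTS[i]) {
                return i;
            }
        }
        return numPartitions - 1;
    }

    public static void main(String[] args) {
        System.out.println(getPartition(42, 4));   // partition 0
        System.out.println(getPartition(150, 4));  // partition 1
        System.out.println(getPartition(999, 4));  // partition 3
    }
}
```

Because the input is already sorted, aligning the split points with the input's key ranges keeps each key range on (mostly) one reducer, which is what cuts down the shuffle traffic.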
