hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Stats to look out for while running mapreduce jobs with HBase
Date Fri, 12 Nov 2010 18:56:14 GMT
The most important:

 - no swap, as in zero, none, nada
 - near 0 io wait

Then it's about making sure that you can drive your user CPU to near
100%. If you can't, then you have a bottleneck somewhere and there's
no magical way of finding it. It usually starts with understanding
what you're doing (is your job mostly just mapping, or is it inserting
aggressively?) and then figuring out, via debugging or log reading,
what seems to be the holdup.
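
For what it's worth, here is a rough sketch of how those three numbers
(swap, io wait, user CPU) could be checked by hand on a Linux node when
ganglia isn't showing them. It assumes the standard /proc/stat and
/proc/vmstat files; the function names are made up for illustration and
are not part of Hadoop, HBase or ganglia.

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat (values are jiffies).
CPU_FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into named counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    values = [int(v) for v in fields[1:1 + len(CPU_FIELDS)]]
    return dict(zip(CPU_FIELDS, values))


def cpu_percentages(before, after):
    """Percentage of time spent in each CPU state between two samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    delta = {name: b[name] - a[name] for name in a}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {name: 100.0 * value / total for name, value in delta.items()}


def swap_pages(vmstat_text):
    """Return (pswpin, pswpout): pages swapped in/out since boot."""
    counters = dict(line.split() for line in vmstat_text.splitlines() if line.strip())
    return int(counters.get("pswpin", 0)), int(counters.get("pswpout", 0))


def read_first_line(path):
    with open(path) as f:
        return f.readline()


def sample(interval=5.0):
    """Hypothetical helper: sample the local node over `interval` seconds."""
    cpu_before = read_first_line("/proc/stat")
    with open("/proc/vmstat") as f:
        swap_before = swap_pages(f.read())
    time.sleep(interval)
    cpu_after = read_first_line("/proc/stat")
    with open("/proc/vmstat") as f:
        swap_after = swap_pages(f.read())
    cpu = cpu_percentages(cpu_before, cpu_after)
    # The swap counters are cumulative, so only their growth matters here.
    swapped = (swap_after[0] - swap_before[0], swap_after[1] - swap_before[1])
    return cpu["user"], cpu["iowait"], swapped
```

Calling sample() on a node while the job runs returns the user CPU
percentage, the io wait percentage and the pages swapped in/out during
the interval. What you want to see is the first near 100, the second
near 0 and the last at (0, 0).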

J-D

On Thu, Nov 11, 2010 at 9:05 PM, Hari Sreekumar
<hsreekumar@clickable.com> wrote:
> Hi,
>
>       I am quite new to hadoop and hbase, and I am having a hard time here
> figuring out some issues with my cluster, and I am pretty sure many of you
> have gone through many of the problems I am facing right now. I need some
> help in figuring out what exactly are the bottlenecks in my system. I have
> set up regular ganglia on my cluster (simple ganglia, not able to track
> hadoop/hbase metrics yet.. that's another issue). What are the stats that
> matter the most? How to go about making inferences from these reports? I
> know that swapping is a very important parameter to monitor. What are the
> other important parameters, what is their significance, and what should
> their values ideally be, approximately? Mainly memory cached, cpu loads,
> memory buffered, Total memory, Network usage etc. and also any other
> parameter that you found to be useful in these cases. I think this would be
> very helpful for many people in figuring out many issues. Thanks a ton,
>
> hari
>
