hadoop-mapreduce-user mailing list archives

From Tim Chou <timchou....@gmail.com>
Subject Re: How can I get the memory usage in Namenode and Datanode?
Date Sun, 22 Feb 2015 06:57:34 GMT
Hi Jonathan,

Very useful information. I will look into Ganglia.

However, I do not have administrative privileges on the cluster, so I
don't know whether I can install Ganglia there.

Thank you for your information.

Best,
Tim

2015-02-22 0:53 GMT-06:00 Jonathan Aquilina <jaquilina@eagleeyet.net>:

>  Where I work, we are building a transient (temporary) cluster on
> Amazon EMR. When I was reading up on how things work, the suggestion for
> monitoring was to use Ganglia to track memory usage, network usage, etc.
> That way, depending on how things are set up (for example, pulling data
> directly from an Amazon S3 bucket into the cluster), the network link
> will always be saturated to ensure a constant flow of data.
>
> What I am suggesting is to consider looking at Ganglia.
>
>
>
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
>
>  On 2015-02-22 07:42, Fang Zhou wrote:
>
> Hi Jonathan,
>
> Thank you.
>
> The number of files has an impact on memory usage in the Namenode.
>
> I just want to get the real memory usage of the Namenode.
>
> The amount of heap in use changes constantly, so I have no idea which
> value is the right one.
>
> Thanks,
> Tim
>
>  On Feb 22, 2015, at 12:22 AM, Jonathan Aquilina <jaquilina@eagleeyet.net>
> wrote:
>
>  I am rather new to Hadoop, but wouldn't the difference potentially come
> from how the files are split in terms of size?
>
>
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
>
>  On 2015-02-21 21:54, Fang Zhou wrote:
>
> Hi All,
>
> I want to measure the memory usage on the Namenode and Datanode.
>
> I have tried jmap, jstat, /proc/pid/stat, top, ps aux, and the Hadoop web interface to
> check the memory.
> The values I get from them are different. I also found that the memory usage changes
> periodically.
> This is the first thing that confused me.
>
> I thought that the more files are stored in the Namenode, the more memory the Namenode
> and Datanode would use.
> I also thought the memory used by the Namenode should be larger than the memory used by
> each Datanode.
> However, some results show that my assumptions are wrong.
> For example, I tested the memory usage of the Namenode with 6000 and 1000 files.
> According to jmap, the memory used with 6000 files is less than with 1000 files.
> I also found that the memory usage of the Datanode is larger than that of the Namenode.
>
> I really don't know how to get the memory usage in Namenode and Datanode.
>
> Can anyone give me some advice?
>
> Thanks,
> Tim
>
>
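
On the question in the thread of which heap value is "the right one": a minimal sketch, assuming the daemon's web interface exposes the standard Hadoop `/jmx` servlet and that it can be reached without installing anything on the cluster. It reads the JVM's `java.lang:type=Memory` bean and separates `used` from `committed` and `max`. The function names are hypothetical, and taking the lowest `used` value over several samples is only a rough heuristic for the memory still held after garbage collection, not an exact measurement.

```python
import json
from urllib.request import urlopen

# Standard JVM memory bean; Hadoop daemons publish it through their /jmx servlet.
MEMORY_BEAN = "java.lang:type=Memory"


def heap_usage(jmx_doc):
    """Return the HeapMemoryUsage dict (values in bytes) from a parsed /jmx response."""
    for bean in jmx_doc.get("beans", []):
        if bean.get("name") == MEMORY_BEAN:
            return bean["HeapMemoryUsage"]
    raise KeyError(MEMORY_BEAN + " not found in JMX response")


def fetch_heap_usage(base_url, timeout=5):
    """Fetch heap usage from a daemon's web UI (base_url is an assumption, see below)."""
    url = base_url.rstrip("/") + "/jmx?qry=" + MEMORY_BEAN
    with urlopen(url, timeout=timeout) as resp:
        return heap_usage(json.load(resp))


def live_heap_estimate(samples):
    """Lowest 'used' value across samples.

    'used' rises as garbage accumulates and drops after each collection,
    so the minimum over a window is a rough proxy for live data.
    """
    return min(sample["used"] for sample in samples)
```

For example, `fetch_heap_usage("http://namenode-host:50070")` could be called once a minute and the results passed to `live_heap_estimate`. The host name is a placeholder; 50070 was the default Namenode HTTP port in Hadoop 2.x, and your cluster may use a different one. Values from `top` or `ps` will still differ from these numbers, because they report the resident memory of the whole process rather than the Java heap alone.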
