hadoop-user mailing list archives

From Jasson Chenwei <ynjassionc...@gmail.com>
Subject Re: How to monitor YARN application memory per container?
Date Thu, 22 Jun 2017 17:21:31 GMT
Hi,

Please take a look at Timeline Server v2, which supports aggregating
NodeManager-side information into HBase.
This includes both node-level information (e.g., node memory usage and
CPU usage) and container-level information (e.g., container memory usage and
container CPU usage). I am currently trying to set it up, and I do find
container-related information stored in HBase.


Wei Chen

On Thu, Jun 22, 2017 at 8:12 AM, Shmuel Blitz <shmuel.blitz@similarweb.com>
wrote:

> Hi,
>
> Thanks for your response.
>
> We are using CDH, and our version doesn't support the solutions above.
> Also, ATS is not relevant for us now.
>
> We have decided to turn on JMX for all our jobs (spark/hadoop map-reduce)
> and use jmap to collect the data and send it to datadog.
>
> Shmuel
>
>
>
> On Thu, Jun 15, 2017 at 9:39 PM, Naganarasimha Garla <
> naganarasimha_gr@apache.org> wrote:
>
>> Container resource usage has been put into the ATS v2 metrics system. But if
>> you do not want the heavy ATS v2 subsystem, then I am not sure that any of the
>> current interfaces exposes the actual resource usage of the container in a
>> way that solves your problem.
>> I can probably think of extending this feature in
>> *ContainerManagementProtocol.getContainerStatuses*, so that at least the AM
>> can be aware of the actual container resource usage.
>> Thoughts?
>>
>> On Thu, Jun 15, 2017 at 7:29 PM, Sunil G <sunilg@apache.org> wrote:
>>
>>> And adding to that, we have aggregated container usage per node. I don't
>>> think you'll have per-container real memory usage recorded by YARN.
>>> You'll have these 2 entries in ideal cases.
>>>
>>> Resource Utilization by Node :
>>> Resource Utilization by Containers : PMem:0 MB, VMem:0 MB, VCores:0.0
>>>
>>> Thanks
>>> Sunil
>>>
>>> On Thu, Jun 15, 2017 at 6:56 AM Sunil G <sunilg@apache.org> wrote:
>>>
>>>> Hi Shmuel
>>>>
>>>> This feature is available in the Hadoop 2.8+ release lines, or in the
>>>> Hadoop 3 alphas.
>>>>
>>>> Thanks
>>>> Sunil
>>>>
>>>> On Wed, Jun 14, 2017 at 6:31 AM Shmuel Blitz <
>>>> shmuel.blitz@similarweb.com> wrote:
>>>>
>>>>> Hi Sunil,
>>>>>
>>>>> Thanks for your response.
>>>>>
>>>>> Here is the response I get when running "yarn node -status {nodeId}":
>>>>>
>>>>> Node Report :
>>>>>         Node-Id : myNode:4545
>>>>>         Rack : /default
>>>>>         Node-State : RUNNING
>>>>>         Node-Http-Address : muNode:8042
>>>>>         Last-Health-Update : Wed 14/Jun/17 08:25:43:261EST
>>>>>         Health-Report :
>>>>>         Containers : 7
>>>>>         Memory-Used : 44032MB
>>>>>         Memory-Capacity : 49152MB
>>>>>         CPU-Used : 16 vcores
>>>>>         CPU-Capacity : 48 vcores
>>>>>         Node-Labels :
>>>>>
>>>>> However, this is information regarding the entire node, containing all
>>>>> containers.
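A minimal sketch of turning the node report quoted above into numbers, assuming the "Key : value" layout shown in this message. The function names are hypothetical, and the sample is trimmed to the non-empty fields. As far as I understand, Memory-Used here reflects what the node has handed out to containers rather than what the processes actually consume, so the ratio is only a node-level indicator and does not answer the per-container question.

```python
import re

# Sample copied (non-empty fields only) from the node report quoted above.
SAMPLE_REPORT = """\
Node Report :
        Node-Id : myNode:4545
        Rack : /default
        Node-State : RUNNING
        Containers : 7
        Memory-Used : 44032MB
        Memory-Capacity : 49152MB
        CPU-Used : 16 vcores
        CPU-Capacity : 48 vcores
"""


def parse_node_report(text):
    """Split each 'Key : value' line at its first colon into a dict entry."""
    report = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            report[key.strip()] = value.strip()
    return report


def leading_int(value):
    """Return the integer prefix of strings such as '44032MB' or '16 vcores'."""
    match = re.match(r"\d+", value)
    if match is None:
        raise ValueError("no leading integer in {!r}".format(value))
    return int(match.group())


def memory_ratio(report):
    """Memory-Used divided by Memory-Capacity for one node report."""
    used = leading_int(report["Memory-Used"])
    capacity = leading_int(report["Memory-Capacity"])
    return used / capacity


if __name__ == "__main__":
    report = parse_node_report(SAMPLE_REPORT)
    print("{:.1%} of node memory is in use".format(memory_ratio(report)))
```

Splitting at the first colon keeps values such as "myNode:4545" intact, because the keys in this report contain no colons themselves.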
>>>>>
>>>>> I have no way of using this to see whether the value I give to
>>>>> 'spark.executor.memory' makes sense.
>>>>>
>>>>> I'm looking for memory usage/allocated information *per-container*.
>>>>>
>>>>> Shmuel
>>>>>
>>>>> On Wed, Jun 14, 2017 at 4:04 PM, Sunil G <sunilg@apache.org> wrote:
>>>>>
>>>>>> Hi Shmuel
>>>>>>
>>>>>> In the Hadoop 2.8 release line, you could check the "yarn node -status
>>>>>> {nodeId}" CLI command or the "http://<rm http
>>>>>> address:port>/ws/v1/cluster/nodes/{nodeid}" REST endpoint to get the
>>>>>> containers' actual resource usage per node. You could also check the same
>>>>>> in any of the Hadoop 3.0 alpha releases.
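A rough sketch of calling the REST endpoint mentioned above from Python. The URL path is the one given in this message; the host "rm.example.com:8088", the function names, and the field names under "resourceUtilization" are assumptions (the field names are recalled from the ResourceManager REST documentation for 2.8+ and should be checked against your cluster). The values are aggregated over all containers on the node, not broken down per container.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def node_url(rm_address, node_id):
    """Build the node URL from the RM http address (host:port) and a node id."""
    return "http://{}/ws/v1/cluster/nodes/{}".format(
        rm_address, quote(node_id, safe=":")
    )


def fetch_node(rm_address, node_id, timeout=10.0):
    """GET the node resource and return the decoded JSON document."""
    request = Request(
        node_url(rm_address, node_id),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


def container_utilization(payload):
    """Pick the aggregated container usage out of a node payload.

    The key names below are assumptions; missing keys yield None.
    """
    util = payload.get("node", {}).get("resourceUtilization") or {}
    return {
        "pmem_mb": util.get("aggregatedContainersPhysicalMemoryMB"),
        "vmem_mb": util.get("aggregatedContainersVirtualMemoryMB"),
        "vcores": util.get("containersCPUUsage"),
    }


if __name__ == "__main__":
    # Hypothetical payload shaped like the assumed response; no network call.
    sample = {
        "node": {
            "id": "myNode:4545",
            "resourceUtilization": {
                "aggregatedContainersPhysicalMemoryMB": 0,
                "aggregatedContainersVirtualMemoryMB": 0,
                "containersCPUUsage": 0.0,
            },
        }
    }
    print(node_url("rm.example.com:8088", "myNode:4545"))
    print(container_utilization(sample))
```

Keeping the extraction separate from the HTTP call makes it easy to adjust the key names once the real response has been inspected.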
>>>>>>
>>>>>> Thanks
>>>>>> Sunil
>>>>>>
>>>>>> On Tue, Jun 13, 2017 at 11:29 PM Shmuel Blitz <
>>>>>> shmuel.blitz@similarweb.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Thanks for your response.
>>>>>>>
>>>>>>> The /metrics API returns a blank page on our RM.
>>>>>>>
>>>>>>> The /jmx API has some metrics, but these are the same metrics we are
>>>>>>> already loading into Datadog.
>>>>>>> That's not good enough, because it doesn't break down the memory use
>>>>>>> by container.
>>>>>>>
>>>>>>> I need the by-container breakdown because resource allocation is per
>>>>>>> container, and I would like to see if my job is really using up all the
>>>>>>> allocated memory.
>>>>>>>
>>>>>>> Shmuel
>>>>>>>
>>>>>>> On Tue, Jun 13, 2017 at 6:05 PM, Sidharth Kumar <
>>>>>>> sidharthkumar2707@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I guess you can get it from http://<resourcemanager-host>:<rm-port>/jmx
>>>>>>>> or /metrics
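A small sketch of reading the /jmx endpoint suggested above, assuming it returns a JSON document with a top-level "beans" list and accepts a "qry" parameter to narrow the result, as the Hadoop JMX servlet generally does. The host, the bean names in the sample, and the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def jmx_url(host_port, query=None):
    """Build the /jmx URL, optionally narrowed with a 'qry' bean-name pattern."""
    base = "http://{}/jmx".format(host_port)
    if query:
        return base + "?" + urlencode({"qry": query})
    return base


def fetch_beans(host_port, query=None, timeout=10.0):
    """GET the JMX document and return its list of beans."""
    with urlopen(jmx_url(host_port, query), timeout=timeout) as response:
        return json.load(response).get("beans", [])


def select_beans(beans, name_fragment):
    """Keep only beans whose 'name' contains the given fragment."""
    return [bean for bean in beans if name_fragment in bean.get("name", "")]


if __name__ == "__main__":
    # Hypothetical beans used instead of a live call.
    sample = [
        {"name": "Hadoop:service=ResourceManager,name=ClusterMetrics"},
        {"name": "java.lang:type=Memory"},
    ]
    print(jmx_url("rm.example.com:8088"))
    print(select_beans(sample, "service=ResourceManager"))
```

Whether any bean carries per-container figures depends on the daemon and the Hadoop version, so it is worth listing the bean names first before building on this.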
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Sidharth
>>>>>>>> LinkedIn: www.linkedin.com/in/sidharthkumar2792
>>>>>>>>
>>>>>>>> On 13-Jun-2017 6:26 PM, "Shmuel Blitz" <shmuel.blitz@similarweb.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> (This question has also been published on Stack Overflow
>>>>>>>>> <https://stackoverflow.com/q/44484940/416300>)
>>>>>>>>>
>>>>>>>>> I am looking for a way to monitor memory usage of YARN containers
>>>>>>>>> over time.
>>>>>>>>>
>>>>>>>>> Specifically: given a YARN application-id, how can you get a
>>>>>>>>> graph showing the memory usage of each of its containers over time?
>>>>>>>>>
>>>>>>>>> The main goal is to better fit memory allocation requirements for
>>>>>>>>> our YARN applications (Spark / Map-Reduce), to avoid over-allocation and
>>>>>>>>> cluster resource waste. A side goal would be the ability to debug memory
>>>>>>>>> issues when developing our jobs and attempting to pick reasonable resource
>>>>>>>>> allocations.
>>>>>>>>>
>>>>>>>>> We've tried using the Datadog integration, but it doesn't break
>>>>>>>>> down the metrics by container.
>>>>>>>>>
>>>>>>>>> Another approach was to parse the hadoop-yarn logs. These logs
>>>>>>>>> have messages like:
>>>>>>>>>
>>>>>>>>> Memory usage of ProcessTree 57251 for container-id
>>>>>>>>> container_e116_1495951495692_35134_01_000001: 1.9 GB of 11 GB
>>>>>>>>> physical memory used; 14.4 GB of 23.1 GB virtual memory used
>>>>>>>>>
>>>>>>>>> Parsing the logs correctly can yield data that can be used to plot
>>>>>>>>> a graph of memory usage over time.
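A sketch of the log-parsing approach described above, assuming the message wording matches the sample quoted in this message once it is joined onto one line; other Hadoop versions may phrase it differently. The names ContainerUsage, to_mb and parse_usage_line are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Pattern modeled on the sample message in this thread (an assumption about
# the exact wording emitted by the NodeManager's container monitor).
USAGE_RE = re.compile(
    r"Memory usage of ProcessTree (?P<pid>\d+) for container-id "
    r"(?P<container>container_\w+): "
    r"(?P<pmem>[\d.]+) (?P<pmem_unit>[KMGT]?B) of "
    r"(?P<plimit>[\d.]+) (?P<plimit_unit>[KMGT]?B) physical memory used; "
    r"(?P<vmem>[\d.]+) (?P<vmem_unit>[KMGT]?B) of "
    r"(?P<vlimit>[\d.]+) (?P<vlimit_unit>[KMGT]?B) virtual memory used"
)

# Conversion factors to megabytes, using binary multiples.
UNIT_TO_MB = {
    "B": 1.0 / (1024 * 1024),
    "KB": 1.0 / 1024,
    "MB": 1.0,
    "GB": 1024.0,
    "TB": 1024.0 * 1024,
}


class ContainerUsage(NamedTuple):
    container_id: str
    pmem_used_mb: float
    pmem_limit_mb: float
    vmem_used_mb: float
    vmem_limit_mb: float


def to_mb(value: str, unit: str) -> float:
    """Convert a number and unit such as ('1.9', 'GB') to megabytes."""
    return float(value) * UNIT_TO_MB[unit]


def parse_usage_line(line: str) -> Optional[ContainerUsage]:
    """Return the usage recorded in one log line, or None if it does not match."""
    match = USAGE_RE.search(line)
    if match is None:
        return None
    return ContainerUsage(
        container_id=match.group("container"),
        pmem_used_mb=to_mb(match.group("pmem"), match.group("pmem_unit")),
        pmem_limit_mb=to_mb(match.group("plimit"), match.group("plimit_unit")),
        vmem_used_mb=to_mb(match.group("vmem"), match.group("vmem_unit")),
        vmem_limit_mb=to_mb(match.group("vlimit"), match.group("vlimit_unit")),
    )


if __name__ == "__main__":
    sample = (
        "Memory usage of ProcessTree 57251 for container-id "
        "container_e116_1495951495692_35134_01_000001: 1.9 GB of 11 GB "
        "physical memory used; 14.4 GB of 23.1 GB virtual memory used"
    )
    print(parse_usage_line(sample))
```

Pairing each parsed record with the timestamp at the start of the log line (not shown in the sample, so not parsed here) would give the per-container time series needed for a graph.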
>>>>>>>>>
>>>>>>>>> That's exactly what we want, but there are two downsides:
>>>>>>>>>
>>>>>>>>> 1. It involves reading human-readable log lines and parsing them into
>>>>>>>>>    numeric data. We'd love to avoid that.
>>>>>>>>> 2. If this data can be consumed otherwise, we're hoping it'll have
>>>>>>>>>    more information that might interest us in the future. We wouldn't
>>>>>>>>>    want to put the time into parsing the logs just to realize we need
>>>>>>>>>    something else.
>>>>>>>>>
>>>>>>>>> Is there any other way to extract these metrics, either by
>>>>>>>>> plugging in to an existing producer or by writing a simple listener?
>>>>>>>>>
>>>>>>>>> Perhaps a whole other approach?
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Shmuel Blitz
>>>>>>>>> *Big Data Developer*
>>>>>>>>> www.similarweb.com
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>
>
>
