Date: Fri, 1 Sep 2017 13:37:00 +0000 (UTC)
From: "Rohith Sharma K S (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Created] (YARN-7147) ATS1.5 crash due to OOM

Rohith Sharma K S created YARN-7147:
---------------------------------------

             Summary: ATS1.5 crash due to OOM
                 Key: YARN-7147
                 URL: https://issues.apache.org/jira/browse/YARN-7147
             Project: Hadoop YARN
          Issue Type: Bug
          Components: timelineserver
            Reporter: Rohith Sharma K S
            Assignee: Rohith Sharma K S

In a production cluster, the ATS server goes down with OOM even though _app-cache-size_ is set to a minimal value, i.e. less than 5. The _entity-group-fs-store.cache-store-class_ is configured with MemoryTimelineStore, which is the default. The heap size configured for the ATS daemon is 8GB.

This happens because ATS parses the entity log file per domain and caches it. If the domain has a lot of entity information, the in-memory cache store loads all of it, which causes the OOM. After a restart, ATS caches the same domain again and goes OOM again.

Possible ways to handle this are:
# Threshold the number of entities loaded into the in-memory cache. This can still lead to OOM if the data size is huge.
# Threshold based on the data size in the store.

We hit the first case, where the number of entities is very large.
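The two thresholds listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the ATS 1.5 code: the class BoundedEntityLoader, its load method, the LoadResult type and both limit parameters are hypothetical names, and entities are modelled as plain strings rather than real timeline entities.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/**
 * Hypothetical sketch: stops loading entities into an in-memory cache once
 * either an entity-count limit or a byte-size limit is reached.
 * Not part of Hadoop; all names here are assumptions for illustration.
 */
final class BoundedEntityLoader {

    /** Entities that were kept, plus a flag telling whether loading was cut short. */
    static final class LoadResult {
        public final List<String> entities;
        public final boolean truncated;

        LoadResult(List<String> entities, boolean truncated) {
            this.entities = entities;
            this.truncated = truncated;
        }
    }

    /**
     * Reads entities from the source until it is exhausted or a limit is hit.
     *
     * @param source      serialized entities (stand-in for a parsed entity log)
     * @param maxEntities option 1: cap on the number of cached entities
     * @param maxBytes    option 2: cap on the total serialized size in bytes
     */
    static LoadResult load(Iterator<String> source, int maxEntities, long maxBytes) {
        List<String> loaded = new ArrayList<>();
        long bytes = 0;
        while (source.hasNext()) {
            // Count threshold: more data is available but the cache is full.
            if (loaded.size() >= maxEntities) {
                return new LoadResult(loaded, true);
            }
            String entity = source.next();
            long size = entity.getBytes(StandardCharsets.UTF_8).length;
            // Size threshold: this entity would push the cache over budget.
            if (bytes + size > maxBytes) {
                return new LoadResult(loaded, true);
            }
            loaded.add(entity);
            bytes += size;
        }
        return new LoadResult(loaded, false);
    }
}
```

What the server should do when the result is truncated (serve partial data, fall back to another store, or reject the request) is a separate design decision that this sketch leaves open.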