Date: Tue, 23 May 2017 19:10:04 +0000 (UTC)
From: "Wei-Chiu Chuang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-11661) GetContentSummary uses excessive amounts of memory
Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
archived-at: Tue, 23 May 2017 19:10:11 -0000

    [ https://issues.apache.org/jira/browse/HDFS-11661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16021680#comment-16021680 ]

Wei-Chiu Chuang commented on HDFS-11661:
----------------------------------------

Heads up: I am going to revert both commits at EOD. I believe I've got a sufficient number of upvotes.

> GetContentSummary uses excessive amounts of memory
> --------------------------------------------------
>
>                 Key: HDFS-11661
>                 URL: https://issues.apache.org/jira/browse/HDFS-11661
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.8.0, 3.0.0-alpha2
>            Reporter: Nathan Roberts
>            Assignee: Wei-Chiu Chuang
>            Priority: Blocker
>         Attachments: HDFS-11661.001.patch, HDFs-11661.002.patch, Heap growth.png
>
>
> ContentSummaryComputationContext::nodeIncluded() is being used to keep track of all INodes visited during the current content summary calculation. This can be all of the INodes in the filesystem, making for a VERY large hash table. This simply won't work on large filesystems.
> We noticed this after an upgrade, when a namenode with ~100 million filesystem objects was spending significantly more time in GC. Fortunately this system had some memory breathing room; other clusters we have will not run with this additional demand on memory.
> This was added as part of HDFS-10797 as a way of keeping track of INodes that have already been accounted for - to avoid double counting.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org
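
The quoted description attributes the memory growth to a per-call record of every INode visited, kept to avoid double counting. The following is a minimal sketch of that pattern, not the actual Hadoop code: `Node`, `SummaryContext`, `summarize`, and `run` are hypothetical names, and the method named `nodeIncluded` here only imitates the behaviour the description attributes to `ContentSummaryComputationContext::nodeIncluded()`. It assumes a flat tree in which one file is reachable twice.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class VisitedSetSketch {

    /** Hypothetical stand-in for an INode: an id, a length, and children. */
    static final class Node {
        final long id;
        final long length;
        final List<Node> children = new ArrayList<>();

        Node(long id, long length) {
            this.id = id;
            this.length = length;
        }
    }

    /** Hypothetical stand-in for the per-call computation context. */
    static final class SummaryContext {
        // One entry per distinct node visited; this is the structure that grows.
        final Set<Long> included = new HashSet<>();
        long totalLength = 0;

        /** Returns true if the node was already counted; records it otherwise. */
        boolean nodeIncluded(Node node) {
            return !included.add(node.id);
        }
    }

    /** Depth-first walk that skips nodes already recorded in the context. */
    static void summarize(Node node, SummaryContext ctx) {
        if (ctx.nodeIncluded(node)) {
            return; // already accounted for via another path
        }
        ctx.totalLength += node.length;
        for (Node child : node.children) {
            summarize(child, ctx);
        }
    }

    /** Builds a root with fileCount (>= 1) files, one of which is linked twice. */
    static SummaryContext run(int fileCount) {
        Node root = new Node(0, 0);
        for (int i = 1; i <= fileCount; i++) {
            root.children.add(new Node(i, 10));
        }
        // The same node reachable twice: the case the set exists to de-duplicate.
        root.children.add(root.children.get(0));
        SummaryContext ctx = new SummaryContext();
        summarize(root, ctx);
        return ctx;
    }

    public static void main(String[] args) {
        SummaryContext ctx = run(1000);
        System.out.println("total length = " + ctx.totalLength);
        System.out.println("tracked ids  = " + ctx.included.size());
    }
}
```

In this sketch the duplicate link is counted once, but the set ends up holding one entry for every distinct node walked. That is the trade-off the description objects to: correctness for the rare shared node is bought with memory proportional to the whole subtree being summarized.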