Mailing-List: contact dev-help@hive.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hive.apache.org
Date: Wed, 1 Aug 2012 18:22:03 +0000 (UTC)
From: "Mithun Radhakrishnan (JIRA)" <jira@apache.org>
To: hive-dev@hadoop.apache.org
Message-ID: <470068233.1621.1343845323938.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <1368534161.44457.1339007783219.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Updated] (HIVE-3098) Memory leak from large number of
 FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mithun Radhakrishnan updated HIVE-3098:
---------------------------------------

    Status: Open  (was: Patch Available)
    
> Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-3098
>                 URL: https://issues.apache.org/jira/browse/HIVE-3098
>             Project: Hive
>          Issue Type: Bug
>          Components: Shims
>    Affects Versions: 0.9.0
>         Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security turned on.
>            Reporter: Mithun Radhakrishnan
>            Assignee: Mithun Radhakrishnan
>         Attachments: Hive_3098.patch
>
>
> The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing the Oracle backend).
> The HCatalog server ran out of memory (-Xmx2048m) in under 24 hours when pounded by 60 threads. The heap dump indicated that hadoop::FileSystem.CACHE held 1000000 instances of FileSystem, whose combined retained memory consumed the entire heap.
> It boiled down to hadoop::UserGroupInformation::equals() being implemented such that the "Subject" member is compared by reference identity ("==") rather than by equivalence (".equals()"). As a result, equivalent UGI instances compare as unequal, so each lookup in FileSystem.CACHE misses and a new FileSystem instance is created and cached.
> UGI.equals() is implemented this way, incidentally, as a fix for another problem (HADOOP-6670), so it is unlikely that the implementation can be changed.
> The solution for this is to check for UGI equivalence in HCatalog (i.e. in the Hive metastore), using a cache for UGI instances in the shims.
> I have a patch to fix this. I'll upload it shortly. I just ran an overnight test to confirm that the memory-leak has been arrested.
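
For readers following the description above, here is a minimal, self-contained sketch of the failure mode and of the proposed remedy. It is not the attached patch and does not use the Hadoop API: FakeSubject, FakeUgi, and both maps are hypothetical stand-ins, and keying the UGI cache by user name alone is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class UgiCacheSketch {

    /** Hypothetical stand-in for the "Subject" member of a UGI. */
    static final class FakeSubject {
        final String principal;
        FakeSubject(String principal) { this.principal = principal; }
    }

    /** Hypothetical stand-in for a UGI whose equals() compares the subject with "==". */
    static final class FakeUgi {
        final FakeSubject subject;
        FakeUgi(String user) { this.subject = new FakeSubject(user); }

        @Override
        public boolean equals(Object o) {
            // Reference identity, not equivalence: two UGIs for the same user differ.
            return o instanceof FakeUgi && ((FakeUgi) o).subject == subject;
        }

        @Override
        public int hashCode() {
            return System.identityHashCode(subject);
        }
    }

    /** Every request builds a fresh UGI, so every lookup in the FS cache misses. */
    static int fsInstancesWithoutUgiCache(int requests) {
        Map<FakeUgi, Object> fsCache = new HashMap<>(); // stand-in for FileSystem.CACHE
        for (int i = 0; i < requests; i++) {
            FakeUgi ugi = new FakeUgi("alice");
            fsCache.computeIfAbsent(ugi, k -> new Object()); // a new "FileSystem" each time
        }
        return fsCache.size();
    }

    /** Requests reuse one cached UGI per user, so the FS cache is hit after the first call. */
    static int fsInstancesWithUgiCache(int requests) {
        Map<String, FakeUgi> ugiCache = new ConcurrentHashMap<>(); // assumed key: user name
        Map<FakeUgi, Object> fsCache = new HashMap<>();
        for (int i = 0; i < requests; i++) {
            FakeUgi ugi = ugiCache.computeIfAbsent("alice", FakeUgi::new);
            fsCache.computeIfAbsent(ugi, k -> new Object());
        }
        return fsCache.size();
    }

    public static void main(String[] args) {
        System.out.println("without UGI cache: " + fsInstancesWithoutUgiCache(1000));
        System.out.println("with UGI cache:    " + fsInstancesWithUgiCache(1000));
    }
}
```

In this sketch the first method ends with one cached entry per request, while the second ends with a single entry per user. A real UGI cache would also need an eviction or expiry policy, which is omitted here.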
