hadoop-common-issues mailing list archives

From "Xinli Shang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12707) key of FileSystem inner class Cache contains UGI.hascode which uses the defualt hascode method, leading to the memory leak
Date Fri, 01 Jun 2018 18:19:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498370#comment-16498370 ]

Xinli Shang commented on HADOOP-12707:
--------------------------------------

We have hit this issue as well, and it impacts us significantly. Our Hadoop cluster is large
and Hadoop security is a major part of it, so please consider this high priority.

Yes, disabling the cache or calling closeAll() would prevent the leak, but then we lose the
benefit of the cache. We would like this fixed so that we can run a performant service.

Our use case is that we have to create a proxy user and get the FileSystem inside doAs().
The code is below.

UserGroupInformation ugi = UserGroupInformation.createProxyUser(proxyUser, UserGroupInformation.getCurrentUser());

fs = ugi.doAs((PrivilegedExceptionAction<FileSystem>) () -> FileSystem.get(conf));

Because ugi is a different object on every call, even for the same proxy user, the
FileSystem#Cache#Key is also different each time, so every call adds a new cache entry.
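
To illustrate the effect without a Hadoop dependency, here is a minimal sketch. FakeUgi and
Key are hypothetical stand-ins, not the real UserGroupInformation or FileSystem.Cache.Key
classes. The sketch assumes identity-based hashCode()/equals() on the UGI stand-in and leaves
out the "unique" field of the real key. It only shows why a cache keyed this way grows by one
entry per call when a fresh object is created each time.

```java
import java.util.HashMap;
import java.util.Map;

class IdentityKeyCacheDemo {

    /** Hypothetical stand-in for a UGI whose equality is object identity. */
    static final class FakeUgi {
        final String user;
        FakeUgi(String user) { this.user = user; }
        @Override public int hashCode() { return System.identityHashCode(this); }
        @Override public boolean equals(Object o) { return this == o; }
    }

    /** Hypothetical stand-in for the cache key (the "unique" field is omitted). */
    static final class Key {
        final String scheme;
        final String authority;
        final FakeUgi ugi;

        Key(String scheme, String authority, FakeUgi ugi) {
            this.scheme = scheme;
            this.authority = authority;
            this.ugi = ugi;
        }

        @Override public int hashCode() {
            return (scheme + authority).hashCode() + ugi.hashCode();
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return scheme.equals(other.scheme)
                && authority.equals(other.authority)
                && ugi.equals(other.ugi);
        }
    }

    /** A new FakeUgi per call, as when a proxy user is created per request. */
    static int cacheSizeWithFreshUgi(int calls) {
        Map<Key, Object> cache = new HashMap<>();
        for (int i = 0; i < calls; i++) {
            FakeUgi ugi = new FakeUgi("alice");
            cache.computeIfAbsent(new Key("hdfs", "nn1:8020", ugi), k -> new Object());
        }
        return cache.size();
    }

    /** One FakeUgi reused across calls: every lookup after the first is a hit. */
    static int cacheSizeWithSharedUgi(int calls) {
        Map<Key, Object> cache = new HashMap<>();
        FakeUgi ugi = new FakeUgi("alice");
        for (int i = 0; i < calls; i++) {
            cache.computeIfAbsent(new Key("hdfs", "nn1:8020", ugi), k -> new Object());
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("fresh ugi per call: " + cacheSizeWithFreshUgi(1000));
        System.out.println("shared ugi:         " + cacheSizeWithSharedUgi(1000));
    }
}
```

With a fresh object per call the map ends up with one entry per call, while reusing a single
object keeps it at one entry. That is the difference between a leak and a working cache.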

It would be great to fix this. HADOOP-6670 does have a valid reason (the object is mutable),
but simply using identityHashCode() is a bold decision, and it impacts how it can be used.

 

> key of FileSystem inner class Cache contains UGI.hascode which uses the defualt hascode method, leading to the memory leak
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-12707
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12707
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.7.1
>            Reporter: sunhaitao
>            Assignee: sunhaitao
>            Priority: Major
>
> By default, FileSystem.get(conf) gets the fs object from CACHE, but the key of the CACHE
> contains ugi.hashCode, which uses the default hashCode method of the subject instead of
> the hashCode method overridden by the subject.
>    @Override
>       public int hashCode() {
>         return (scheme + authority).hashCode() + ugi.hashCode() + (int)unique;
>       }
> In this case, even for the same user, if FileSystem.get(conf) is called twice, two
> different keys will be created. Over a long duration, this will lead to a memory leak.


