hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7456) De-duplicate AclFeature instances with same AclEntries do reduce memory footprint of NameNode
Date Fri, 05 Dec 2014 19:34:13 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14235966#comment-14235966

Chris Nauroth commented on HDFS-7456:

I think this needs a rebase and some additional changes after HDFS-7454 went in.  Here is
a review of the current patch:
# {{AclFeature}}: Now that {{entries}} is an {{int[]}}, we need to update {{equals}} and {{hashCode}}
to use {{Arrays#equals}} and {{Arrays#hashCode}}.
# {{Interner}}: There are 2 calls to {{WeakReference#get}}.  The first is done for a null
check, and the second is to return the referent.  If there is a garbage collection in between
the 2 calls, then this could cause the method to return {{null}}, and the caller likely would
hit a {{NullPointerException}} some time later.
# I'm looking back at the code for the Guava interner implementation that I was using in the
original HDFS-5620 patch that we abandoned.  That code is shown here: https://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/common/collect/Interners.java
.  I had been critical of the spinlock, but I now realize that this implementation has another
nice benefit.  It uses a dummy value in the map instead of the key wrapped in a {{WeakReference}}.
 This saves on memory, because it avoids the need for the extra layer of pointers going through
{{WeakReference}}.  As a side effect, it needs a spinlock, but in our usage, I expect little
contention on the spin lock.  We would only spin in the presence of multiple concurrent ACL
modification operations.  Reads wouldn't cause a spin.
# The new tests create a few new ACLs and assert that we only create a new instance if needed.
 This is great.  I'd also like to add tests that remove all references to an ACL and then
assert that the instance count goes down.  Of course, that could be a flaky test if we're
relying on the GC, so that leads me to...

In general, my current thinking on this is that we should avoid implementing an interner at
all and instead implement it as a reference-counted set, i.e. {{Map<AclEntry, Long>}}.
 The value type would need to be {{Long}} for agreement with the definition of inode ID. 
Unlike an interner, we'd need to hook code into both creation and removal instead of relying
on weak references and GC to cover removal for us.  Fortunately, that's a very easy thing
for ACLs, because we already have the hook points we need.  {{INodeWithAdditionalFields#addAclFeature}}
can do the add, and {{INodeWithAdditionalFields#removeAclFeature}} can do the remove.  We
know both operations are done under the namesystem write lock, so I don't see any tricky concurrency
challenges.  This might end up being a larger total patch compared to the interner, but I
think it's simpler code overall compared to reasoning through weak references and GC.  As
another benefit, we'll be able to write reliable tests of the removal case, because we'll
be able to get a deterministic count out of the {{Map}}.

Let me know your thoughts on this.  Thanks again for all of your recent ACL patches!

> De-duplicate AclFeature instances with same AclEntries do reduce memory footprint of
> ---------------------------------------------------------------------------------------------
>                 Key: HDFS-7456
>                 URL: https://issues.apache.org/jira/browse/HDFS-7456
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>         Attachments: HDFS-7456-001.patch, HDFS-7456-002.patch
> As discussed  in HDFS-7454 [here|https://issues.apache.org/jira/browse/HDFS-7454?focusedCommentId=14229454&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14229454],
de-duplication of {{AclFeature}} helps in reducing the memory footprint of the namenode.

This message was sent by Atlassian JIRA

View raw message