From "Uma Maheswara Rao G (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285
Date Thu, 14 Dec 2017 17:29:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16291186#comment-16291186 ]

Uma Maheswara Rao G edited comment on HDFS-12911 at 12/14/17 5:28 PM:
----------------------------------------------------------------------

[~chris.douglas], thank you for all the valuable suggestions. We are working on reducing the
lock hold time and on the other optimizations. For the lock improvement, [~rakeshr] is almost
ready with a patch; he will post it once it is tested.

For the above memory optimizations, I am trying to profile them.
Some updates:
For UMA.2: We actually store the XAttrs in the feature as a byte array. So, making the SPS
XAttr object static and reusing it will not save heap, though it could reduce the number of
short-lived objects for GC. When a new XAttr is added, the existing feature is removed and
added again; XAttrs are stored as separate objects only if the XAttr value length is greater
than the threshold, i.e. 1024. However, shortening the XAttr key name will reduce the space
occupied.
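
For illustration, here is a simplified sketch of that behaviour (class and field names are
made up for the example; this is not the actual HDFS XAttrFeature/XAttrFormat code):

    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    // Illustrative only -- small XAttrs are packed into a single byte[]; only
    // values longer than the threshold (1024, as noted above) are kept as
    // separate objects.
    class PackedXAttrFeatureSketch {

      static final class XAttr {
        final String name;
        final byte[] value;
        XAttr(String name, byte[] value) { this.name = name; this.value = value; }
      }

      static final int PACK_THRESHOLD = 1024;

      private byte[] packed = new byte[0];               // small attrs live here
      private final List<XAttr> largeAttrs = new ArrayList<>();

      void addXAttr(XAttr attr) {
        if (attr.value.length > PACK_THRESHOLD) {
          largeAttrs.add(attr);                          // kept as an object
        } else {
          // The packed array is rebuilt with the new attr appended. Even a
          // statically shared XAttr instance would be copied into this byte[],
          // so reusing one static object saves no per-inode heap; it only
          // avoids a short-lived allocation before packing.
          byte[] name = attr.name.getBytes(StandardCharsets.UTF_8);
          byte[] merged = Arrays.copyOf(packed,
              packed.length + name.length + attr.value.length);
          System.arraycopy(name, 0, merged, packed.length, name.length);
          System.arraycopy(attr.value, 0, merged,
              packed.length + name.length, attr.value.length);
          packed = merged;
        }
      }
    }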

Overall, my local perf numbers show no significant growth because of SPS. For 5 million
queued-up SPS calls, it takes approximately 240MB (for the front Q and XAttrs). After a few of
the above improvements, it could come down further. Also, IMO, 5 million directories could
cover a whole cluster, as users will most likely call SPS on the directories where the storage
policy is set. Consuming from the front Q is throttled, so the memory used by that part of the
processing stays bounded.
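
As a rough back-of-envelope check (using the per-entry estimates from the issue description
below, not measured values):

    240 MB / 5,000,000 tracked inodes  ~= 48 bytes per inode on average
      ~= 16 bytes for the front Q entry (8-byte obj ref + 8-byte long value)
       + ~32 bytes left for the SPS XAttr key/value bytes and queue node overhead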


was (Author: umamaheswararao):
[~chris.douglas], thank you for all the valuable suggestions. We are working on reducing the
lock hold time and on the other optimizations. For the lock improvement, [~rakeshr] is almost
ready with a patch; he will post it once it is tested.

For the above memory optimizations, I am trying to profile them.
Some updates:
For UMA.2: We actually store the XAttrs in the feature as a byte array. So, making the SPS
XAttr object static and reusing it will not save heap, though it could reduce the number of
short-lived objects for GC. Another improvement could be creating a buffer pool and reusing
it, since when a new XAttr is added, the existing feature is removed and added again. However,
shortening the XAttr key name will reduce the space occupied.

Overall, my local perf numbers show no significant growth because of SPS. For 5 million
queued-up SPS calls, it takes approximately 240MB (for the front Q and XAttrs). After a few of
the above improvements, it could come down further. Also, IMO, 5 million directories could
cover a whole cluster, as users will most likely call SPS on the directories where the storage
policy is set. Consuming from the front Q is throttled, so the memory used by that part of the
processing stays bounded.

> [SPS]: Fix review comments from discussions in HDFS-10285
> ---------------------------------------------------------
>
>                 Key: HDFS-12911
>                 URL: https://issues.apache.org/jira/browse/HDFS-12911
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>            Reporter: Uma Maheswara Rao G
>            Assignee: Rakesh R
>         Attachments: HDFS-12911.00.patch
>
>
> This is the JIRA for tracking the possible improvements or issues discussed in the main JIRA.
> Comments to handle so far:
> Daryn:
>  # The lock should not be kept while executing the placement policy.
>  # While starting up the NN, SPS XAttr checks happen even if the feature is disabled. This
> could potentially impact startup speed.
> UMA:
> # I am adding one more possible improvement to reduce XAttr objects significantly.
>  The SPS XAttr is a constant object. So, we create one XAttr deduplication object once,
> statically, and use the same object reference whenever the SPS XAttr has to be added to an
> Inode. The additional bytes required for storing the SPS XAttr would then come down to a
> single object reference (i.e. 4 bytes on 32-bit). So the XAttr overhead should come down
> significantly, IMO. Let's explore the feasibility of this option.
> The XAttr list Feature will not be specially created for SPS; that list would already have
> been created by SetStoragePolicy on the same directory. So, there is no extra Feature
> creation because of SPS alone.
> # Currently SPS puts Long id objects in a Q for tracking the Inodes on which SPS was called.
> So, an additional object is created, and its size would be (obj ref + value) = (8 + 8) bytes
> [ignoring alignment for the time being].
> The possible improvement here is, instead of creating a new Long object, we can keep the
> existing Inode object for tracking. The advantage is that the Inode object is already
> maintained in the NN, so no new object creation is needed; we just need to maintain one
> object reference. The above two points should significantly reduce the memory requirements
> of SPS. So, per SPS call: 8 bytes for tracking the called Inode + 8 bytes for the XAttr ref.
> # Use LightWeightLinkedSet instead of LinkedList for the front Q. This will reduce the
> unnecessary Node creations inside LinkedList (see the sketch after this quoted description).
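
For illustration, a minimal sketch of the tracking changes suggested in points 2 and 3 above
(class names are made up for the example; LinkedHashSet only stands in for the intended
set-style structure, and the real LightWeightLinkedSet API is not reproduced here):

    import java.util.LinkedHashSet;
    import java.util.LinkedList;
    import java.util.Set;

    // INodeStub stands in for the INode objects the NameNode already keeps in memory.
    class SpsTrackingSketch {

      static final class INodeStub {
        final long id;
        INodeStub(long id) { this.id = id; }
      }

      // Before: each SPS call boxes the inode id into a new Long, and LinkedList
      // wraps it in a Node, so every tracked inode costs a boxed Long (ref + value)
      // plus a list Node.
      private final LinkedList<Long> idQueue = new LinkedList<>();

      // After: keep a reference to the INode object the NN already maintains, so no
      // new Long is allocated per tracked inode, and repeated SPS calls on the same
      // inode are deduplicated by the set.
      private final Set<INodeStub> trackedInodes = new LinkedHashSet<>();

      void trackById(INodeStub inode)  { idQueue.add(inode.id); }     // old approach
      void trackByRef(INodeStub inode) { trackedInodes.add(inode); }  // new approach
    }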


