hadoop-hdfs-issues mailing list archives

From "Shashikant Banerjee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-13102) Implement SnapshotSkipList class to store Multi level DirectoryDiffs
Date Tue, 20 Feb 2018 18:18:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-13102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16370391#comment-16370391 ]

Shashikant Banerjee commented on HDFS-13102:
--------------------------------------------

Thanks Nicholas for the Review. 

There are some issues for which I feel we should maintain a list of the skip indices.
I think it's better to have a call sometime tomorrow.

If we keep the skipIndices in a list, the power logic will also work.

I do agree that the addFirst method won't work in the current scenario, and I would like to
discuss with you how to handle it, since it will be called when the NameNode
starts up. So it may require different handling.

Let me know if you are available any time tomorrow.

Thanks
Shashi

On 2/20/18, 11:36 PM, "Tsz Wo Nicholas Sze (JIRA)" <jira@apache.org> wrote:

    
        [ https://issues.apache.org/jira/browse/HDFS-13102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16370378#comment-16370378 ]
    
    Tsz Wo Nicholas Sze commented on HDFS-13102:
    --------------------------------------------
    
    Some more comments:
    
    - There seems to be a bug in addFirst -- it should add at index 0, i.e. skipNodeList.add(0,
node).  Then, checkAndPromoteIfNeeded() won't work for it.
    
    - With remove, we cannot use power to determine the skip indices.  I understand that remove()
is not implemented here.  Are you going to change the computation in combineDiffs() when adding
remove()?
    {code}
    //combineDiffs()
          // At each level, the number of entries to be combined to promote to a
          // higher level will be equal to the skip interval, e.g. assuming a skip
          // interval of 4, at level 0, s0, s1, s2 and s3 will be combined to form s0-3.
          // Similarly, s4-7, s8-11 and s12-15 will be constructed at level 1.
          // At level 1, s0-3, s4-7, s8-11, s12-15 will be combined to construct
          // s0-15 and so on.
          Double power = Math.pow(skipInterval, levelIterator);
    {code}
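
To make the power-based promotion rule in the combineDiffs() comment concrete, here is a minimal sketch, not the patch's actual code. It assumes that the node at 1-based position p is promoted one level for every power of the skip interval that divides p. The class SkipLevelSketch and the method topLevel are hypothetical names used only for illustration, and integer arithmetic stands in for Math.pow.

```java
public final class SkipLevelSketch {
    private SkipLevelSketch() {}

    /**
     * Returns the highest level (0 = base level) at which the node at the
     * given 1-based position appears, assuming promotion is decided purely
     * by powers of skipInterval. Hypothetical helper, not from the patch.
     */
    static int topLevel(int position, int skipInterval, int maxLevel) {
        if (position < 1 || skipInterval < 2) {
            throw new IllegalArgumentException("position >= 1 and skipInterval >= 2 required");
        }
        int level = 0;
        long power = skipInterval; // skipInterval^(level + 1)
        while (level < maxLevel && position % power == 0) {
            level++;
            power *= skipInterval;
        }
        return level;
    }

    public static void main(String[] args) {
        int skipInterval = 4;
        // Positions 1..16 correspond to s0..s15 in the comment above.
        for (int pos = 1; pos <= 16; pos++) {
            System.out.println("s" + (pos - 1) + " -> top level "
                + topLevel(pos, skipInterval, 3));
        }
    }
}
```

Under this assumption, s3 (position 4) reaches level 1 and s15 (position 16) reaches level 2. It also illustrates the concern about remove(): deleting one base entry shifts every later position by one, so an entry that sat at position 8 (top level 1) moves to position 7 (top level 0), and levels derived from position alone would no longer match the stored combined diffs.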
    
    
    > Implement SnapshotSkipList class to store Multi level DirectoryDiffs
    > --------------------------------------------------------------------
    >
    >                 Key: HDFS-13102
    >                 URL: https://issues.apache.org/jira/browse/HDFS-13102
    >             Project: Hadoop HDFS
    >          Issue Type: Improvement
    >            Reporter: Shashikant Banerjee
    >            Assignee: Shashikant Banerjee
    >            Priority: Major
    >         Attachments: HDFS-13102.001.patch, HDFS-13102.002.patch, HDFS-13102.003.patch
    >
    >
    > HDFS-11225 explains an issue where deletion of older snapshots can take a very long
time when the number of snapshot diffs is large for directories. For any directory under
a snapshot, to construct the children list, it needs to combine all the diffs from that particular
snapshot to the last snapshotDiff record and reverseApply them to the current children list of
the directory on the live fs. This can take a significant time if the number of snapshot diffs is
large and the changes per diff are significant.
    > This Jira proposes to store the Directory diffs in a SnapshotSkipList, where we
store multi-level DirectoryDiffs. At each level, the Directory Diff will be the cumulative diff
of k snapshot diffs,
    > where k is the level of a node in the list. 
    >  
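
As a rough illustration of why multi-level diffs help, the sketch below counts how many diff records must be combined to go from a given snapshot diff to the last one. It is a simplified model, not the proposed SnapshotSkipList: it assumes a node at level L covers an aligned block of skipInterval^L base diffs and that a lookup greedily takes the widest block available. DiffHopSketch and hops are hypothetical names.

```java
public final class DiffHopSketch {
    private DiffHopSketch() {}

    /**
     * Counts the combined-diff records visited when walking from base index
     * `from` (0-based, inclusive) to `total` (exclusive), always taking the
     * widest aligned block that fits. Hypothetical model for illustration.
     */
    static int hops(int from, int total, int skipInterval) {
        if (from < 0 || skipInterval < 2) {
            throw new IllegalArgumentException("from >= 0 and skipInterval >= 2 required");
        }
        int hops = 0;
        long i = from;
        while (i < total) {
            long span = 1; // number of base diffs covered by the chosen record
            while (i % (span * skipInterval) == 0
                    && i + span * skipInterval <= total) {
                span *= skipInterval;
            }
            i += span;
            hops++;
        }
        return hops;
    }

    public static void main(String[] args) {
        int total = 16;
        int skipInterval = 4;
        int from = 5;
        System.out.println("linear combine steps: " + (total - from));
        System.out.println("skip-list combine steps: " + hops(from, total, skipInterval));
    }
}
```

With 16 base diffs and a skip interval of 4, starting at index 5 takes 5 records in this model (s5, s6, s7, s8-11, s12-15) instead of 11 in a purely linear walk, and starting at index 0 takes a single record (s0-15).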
    
    
    
    --
    This message was sent by Atlassian JIRA
    (v7.6.3#76005)




