hadoop-hdfs-issues mailing list archives

From "Manoj Govindassamy (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-12544) SnapshotDiff - support diff generation on any snapshot root descendant directory
Date Wed, 11 Oct 2017 00:01:32 GMT

     [ https://issues.apache.org/jira/browse/HDFS-12544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-12544:
    Attachment: HDFS-12544.03.patch

Thanks for the review [~yzhangal]. Attached v03 patch to address the following comments. Can
you please review the latest patch?

bq. It seems to make sense to include a new field snapshotDiffScopeDir in the SnapshotDiffInfo
class, and initialize it in the constructor.

bq. suggest to move the checking from SnapshotManager%getSnapshottableAncestorDir to its caller,

bq. suggest to remove the method SnapshotManager%setSnapshotDiffAllowSnapRootDescendant, and
use the config property to pass on the value to the cluster..

bq. Nit. In SnapshotManager.java, change "directories" to "directory" in the following text...

> SnapshotDiff - support diff generation on any snapshot root descendant directory
> --------------------------------------------------------------------------------
>                 Key: HDFS-12544
>                 URL: https://issues.apache.org/jira/browse/HDFS-12544
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.0.0-beta1
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>         Attachments: HDFS-12544.01.patch, HDFS-12544.02.patch, HDFS-12544.03.patch
> {noformat}
> # hdfs snapshotDiff <snapshot_root_path> <from_snapshot_name> <to_snapshot_name>
> {noformat}
> Using the snapshot diff command, we can generate a diff report between any two given snapshots
under a snapshot root directory. Today the command only accepts a path that is a snapshot
root. In many deployments the snapshot root is configured at a high-level
directory, but the diff report is needed only for a specific directory under the snapshot root.
In these cases, the diff report can be filtered for changes pertaining to the directory of
interest. But when the snapshot root directory is very large, diff report
generation can take minutes even if we only want to know the changes in a small
directory. So it would be much faster if the diff calculation could be limited
to the sub-directory of interest instead of covering the whole snapshot root.
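
For illustration, the client-side filtering workaround mentioned above might look roughly like
the sketch below. It is not code from the patch. The class and method names (SubtreeDiffFilter,
filterToSubtree) are hypothetical, and the sketch assumes each report line has the form
"<type><TAB><path relative to the snapshot root>", with renames written as "source -> target".
Note that this only trims the output: the full diff over the snapshot root has still been
computed, which is the cost this issue aims to avoid.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: keeps only the diff report lines that touch one sub-directory.
public class SubtreeDiffFilter {

    // True if path is dir itself or lies beneath it (both relative, e.g. "./logs").
    static boolean isUnder(String path, String dir) {
        return path.equals(dir) || path.startsWith(dir + "/");
    }

    // Assumed line format: "<type>\t<path>" or "R\t<source> -> <target>".
    static List<String> filterToSubtree(List<String> reportLines, String dir) {
        List<String> kept = new ArrayList<>();
        for (String line : reportLines) {
            int tab = line.indexOf('\t');
            if (tab < 0) {
                continue; // header or blank line, no path to inspect
            }
            // A rename line carries two paths; keep it if either one is in scope.
            for (String path : line.substring(tab + 1).split(" -> ")) {
                if (isUnder(path, dir)) {
                    kept.add(line);
                    break;
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample entries, for illustration only.
        List<String> report = List.of(
            "M\t./logs",
            "+\t./logs/app.log",
            "-\t./data/old.csv",
            "R\t./data/tmp -> ./logs/archived");
        filterToSubtree(report, "./logs").forEach(System.out::println);
    }
}
```

Matching on "dir + /" rather than a bare prefix keeps a sibling such as ./logs2 out of the
result for ./logs.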

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org
