hadoop-hdfs-issues mailing list archives

From "Tsz Wo Nicholas Sze (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12594) SnapshotDiff - snapshotDiff fails if the snapshotDiff report exceeds the RPC response limit
Date Tue, 21 Nov 2017 21:15:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261492#comment-16261492 ]

Tsz Wo Nicholas Sze commented on HDFS-12594:

Thanks for the update.  Some more comments:

- In DiffReportListingEntry,
-* DiffReportListingEntry(byte[][] sourcePath) is not used, please remove it.
-* In getTargetPath(), the null check is redundant; just return targetPath.

- startPath in SnapshotDiffListingInfo and SnapshotDiffReportListing is the start path of
the next report, not of the current report.  It is actually the last path of this report.
 Let's rename it to lastPath.
-* Similarly, index is actually lastIndex.

- The following is the same as {{Snapshot.getSnapshotId(laterSnapshot)}}:
laterSnapshot == null ?  Snapshot.getSnapshotId(null) : laterSnapshot.getId())
- Please update hadoop-hdfs-project/hadoop-hdfs-client/dev-support/findbugsExcludeFile.xml to
exclude the findbugs warnings.
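
To illustrate the point about {{Snapshot.getSnapshotId(laterSnapshot)}}, here is a minimal sketch of why the explicit null check adds nothing. SnapshotSketch is a hypothetical stand-in, not the real HDFS Snapshot class; it assumes the static accessor already maps null to a "current state" sentinel, and the sentinel value used here is illustrative only.

```java
// Hypothetical stand-in for the HDFS Snapshot class, reduced to what the
// review comment needs. Names and the sentinel value are assumptions.
final class SnapshotSketch {
    // Assumed sentinel meaning "no snapshot, use current state".
    static final int CURRENT_STATE_ID = Integer.MAX_VALUE - 1;

    private final int id;

    SnapshotSketch(int id) {
        this.id = id;
    }

    int getId() {
        return id;
    }

    // Null-safe accessor: a null snapshot maps to the sentinel.
    static int getSnapshotId(SnapshotSketch s) {
        return s == null ? CURRENT_STATE_ID : s.getId();
    }

    // The expression flagged in the review, kept only for comparison.
    static int verbose(SnapshotSketch later) {
        return later == null ? getSnapshotId(null) : later.getId();
    }

    public static void main(String[] args) {
        SnapshotSketch later = new SnapshotSketch(7);
        // Both forms agree for a non-null and for a null snapshot.
        System.out.println(verbose(later) == getSnapshotId(later));
        System.out.println(verbose(null) == getSnapshotId(null));
    }
}
```

Since the static accessor performs the same branch internally, calling it directly is shorter and keeps the null handling in one place.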

> SnapshotDiff - snapshotDiff fails if the snapshotDiff report exceeds the RPC response limit
> -------------------------------------------------------------------------------------------
>                 Key: HDFS-12594
>                 URL: https://issues.apache.org/jira/browse/HDFS-12594
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>         Attachments: HDFS-12594.001.patch, HDFS-12594.002.patch, HDFS-12594.003.patch,
HDFS-12594.004.patch, HDFS-12594.005.patch, HDFS-12594.006.patch, HDFS-12594.007.patch, HDFS-12594.008.patch,
SnapshotDiff_Improvemnets .pdf
> The snapshotDiff command fails if the snapshotDiff report size is larger than the configuration
value of ipc.maximum.response.length, which is 128 MB by default.
> In the worst case, with all rename ops in snapshots, each with source and target names equal
to MAX_PATH_LEN (8k characters), the limit would be reached at 8192 renames.
> SnapshotDiff is currently used by distcp to optimize copy operations, and if the
diff report exceeds the limit, it fails with the below exception:
> Test set: org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport
> -------------------------------------------------------------------------------
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 112.095 sec <<<
FAILURE! - in org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport
> testDiffReportWithMillionFiles(org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport)
 Time elapsed: 111.906 sec  <<< ERROR!
> java.io.IOException: Failed on local exception: org.apache.hadoop.ipc.RpcException: RPC
response exceeds maximum data length; Host Details : local host is: "hw15685.local/";
destination host is: "localhost":59808;
> Attached is the proposal for the changes required.
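
The 8192 figure in the description can be reproduced with a quick back-of-the-envelope calculation. The sketch below assumes one byte per path character, takes "8k" to mean 8192, and ignores any per-entry serialization overhead; the class and method names are made up for illustration and are not part of HDFS.

```java
// Rough worst-case estimate: how many rename entries fit in one RPC
// response if every entry carries a maximum-length source and target path.
final class WorstCaseRenames {
    static long maxRenames(long responseLimitBytes, long maxPathLen) {
        // Each rename entry carries a source path and a target path.
        long bytesPerRename = 2 * maxPathLen;
        return responseLimitBytes / bytesPerRename;
    }

    public static void main(String[] args) {
        long limit = 128L * 1024 * 1024; // default ipc.maximum.response.length, 128 MB
        long maxPathLen = 8192;          // MAX_PATH_LEN, assumed 8k = 8192 characters
        System.out.println(maxRenames(limit, maxPathLen));
    }
}
```

Under these assumptions, 128 MB divided by 16 KB per rename gives 8192 entries, matching the number quoted in the description.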

