hadoop-common-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [hadoop] steveloughran commented on a change in pull request #1229: HADOOP-16490. Improve S3Guard handling of FNFEs in copy
Date Tue, 20 Aug 2019 23:05:11 GMT
steveloughran commented on a change in pull request #1229: HADOOP-16490. Improve S3Guard handling
of FNFEs in copy
URL: https://github.com/apache/hadoop/pull/1229#discussion_r315939643

 File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
 @@ -2603,6 +2607,30 @@ S3AFileStatus innerGetFileStatus(final Path f,
     final Path path = qualify(f);
+    return resolveFileStatus(path, needEmptyDirectoryFlag, false);
+  }
+  /**
+   * Get the status of a file or directory, first through S3Guard and then
+   * through S3.
+   * The S3 probes can leave 404 responses in the S3 load balancers; if
+   * a check is only needed for a directory, declaring this saves time and
+   * avoids creating one for the object.
+   * When only probing for directories, if an entry for a file is found in
+   * S3Guard it is returned, but checks for updated values are skipped.
+   * @param path fully qualified path
+   * @param needEmptyDirectoryFlag if true, implementation will calculate
+   *        a TRUE or FALSE value for {@link S3AFileStatus#isEmptyDirectory()}
+   * @param onlyProbeForDirectory skip the simple object probe
+   * @return a S3AFileStatus object
+   * @throws FileNotFoundException when the path does not exist
+   * @throws IOException on other problems.
+   */
+  private S3AFileStatus resolveFileStatus(final Path path,
 Review comment:
   I'm avoiding that for now, since this is really a bug fix, and especially as we need a good
layering design first. Yes, the s3* calls are clearly at the bottom, but unwinding them will take
more effort than I want to spend here. I've put the new probe enum into the .impl package, though.
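
For readers following the thread, the lookup order described in the Javadoc above (metadata store first, then the object store, with an option to skip the simple object probe) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the pull request: the class ProbeSketch, the enum Probe and its constants, and the in-memory map and set standing in for S3Guard and S3 are all hypothetical. The sketch returns an empty Optional where the real method is documented to throw FileNotFoundException.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Illustrative only: every name here is hypothetical, not taken from the patch. */
class ProbeSketch {

  /** Kinds of store probe a caller may allow. */
  enum Probe { OBJECT, DIR_MARKER, LIST }

  /** All probes: the path may be a file or a directory. */
  static final Set<Probe> ALL = EnumSet.allOf(Probe.class);

  /** Skip the simple object probe when only a directory is of interest. */
  static final Set<Probe> DIRECTORIES_ONLY =
      EnumSet.of(Probe.DIR_MARKER, Probe.LIST);

  /** Stand-in for the metadata store: path to "file" or "dir". */
  private final Map<String, String> metadataStore;

  /** Stand-in for the object store: the set of existing keys. */
  private final Set<String> objectKeys;

  ProbeSketch(Map<String, String> metadataStore, Set<String> objectKeys) {
    this.metadataStore = metadataStore;
    this.objectKeys = objectKeys;
  }

  /**
   * Resolve a path to "file" or "dir", or empty if nothing is found.
   * A metadata store entry is returned as-is, even when it is a file
   * and only directory probes were requested.
   */
  Optional<String> resolve(String path, Set<Probe> probes) {
    String cached = metadataStore.get(path);
    if (cached != null) {
      return Optional.of(cached);
    }
    // Simple object probe: only issued when the caller allows it.
    if (probes.contains(Probe.OBJECT) && objectKeys.contains(path)) {
      return Optional.of("file");
    }
    // Directory marker probe: a key ending in "/".
    String prefix = path + "/";
    if (probes.contains(Probe.DIR_MARKER) && objectKeys.contains(prefix)) {
      return Optional.of("dir");
    }
    // Listing probe: any key under the prefix implies a directory.
    if (probes.contains(Probe.LIST)) {
      for (String key : objectKeys) {
        if (key.startsWith(prefix)) {
          return Optional.of("dir");
        }
      }
    }
    return Optional.empty();
  }
}
```

In this sketch, a caller that only cares about directories passes DIRECTORIES_ONLY, so no probe is ever issued for the plain object key; that is the behaviour the Javadoc attributes to onlyProbeForDirectory.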
