hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From GitBox <...@apache.org>
Subject [GitHub] [hadoop] ThomasMarquardt commented on a change in pull request #2246: Hadoop-17215. ABFS: Disable default create overwrite
Date Wed, 16 Sep 2020 18:35:43 GMT

ThomasMarquardt commented on a change in pull request #2246:
URL: https://github.com/apache/hadoop/pull/2246#discussion_r489647598



##########
File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
##########
@@ -263,10 +265,102 @@ public AbfsRestOperation deleteFilesystem() throws AzureBlobFileSystemException
     return op;
   }
 
-  public AbfsRestOperation createPath(final String path, final boolean isFile, final boolean
overwrite,
-                                      final String permission, final String umask,
-                                      final boolean isAppendBlob) throws AzureBlobFileSystemException
{
+  public AbfsRestOperation createPath(final String path,
+      final boolean isFile,
+      final boolean overwrite,
+      final String permission,
+      final String umask,
+      final boolean isAppendBlob) throws AzureBlobFileSystemException {
+    String operation = isFile
+        ? SASTokenProvider.CREATE_FILE_OPERATION
+        : SASTokenProvider.CREATE_DIRECTORY_OPERATION;
+
+    // if "fs.azure.enable.conditional.create.overwrite" is enabled,
+    // trigger a create with overwrite=false first so that eTag fetch can be
+    // avoided for cases when no pre-existing file is present (which is the
+    // case with most part of create traffic)
+    boolean isFirstAttemptToCreateWithoutOverwrite = false;
+    if (isFile && overwrite
+        && abfsConfiguration.isConditionalCreateOverwriteEnabled()) {
+      isFirstAttemptToCreateWithoutOverwrite = true;
+    }
+
+    AbfsRestOperation op = null;
+    // Query builder
+    final AbfsUriQueryBuilder abfsUriQueryBuilder = createDefaultUriQueryBuilder();
+    abfsUriQueryBuilder.addQuery(QUERY_PARAM_RESOURCE,
+        operation.equals(SASTokenProvider.CREATE_FILE_OPERATION)
+            ? FILE
+            : DIRECTORY);
+
+    if (isAppendBlob) {
+      abfsUriQueryBuilder.addQuery(QUERY_PARAM_BLOBTYPE, APPEND_BLOB_TYPE);
+    }
+
+    appendSASTokenToQuery(path, operation, abfsUriQueryBuilder);
+
+    try {
+      op = createPathImpl(path, abfsUriQueryBuilder,
+          (isFirstAttemptToCreateWithoutOverwrite ? false : overwrite),
+          permission, umask, null);
+    } catch (AbfsRestOperationException e) {
+      if ((e.getStatusCode() == HttpURLConnection.HTTP_CONFLICT)
+          && isFirstAttemptToCreateWithoutOverwrite) {
+        // Was the first attempt made to create file without overwrite which
+        // failed because there is a pre-existing file.
+        // resetting the first attempt flag for readabiltiy
+        isFirstAttemptToCreateWithoutOverwrite = false;
+
+        // Fetch eTag
+        try {
+          op = getPathStatus(path, false);
+        } catch (AbfsRestOperationException ex) {
+          if (ex.getStatusCode() == HttpURLConnection.HTTP_NOT_FOUND) {
+            // Is a parallel access case, as file which was found to be
+            // present went missing by this request.
+            throw new ConcurrentWriteOperationDetectedException(
+                "Parallel access to the create path detected. Failing request "
+                    + "to honor single writer semantics");
+          } else {
+            throw ex;
+          }
+        }

Review comment:
       The original design was such that AbfsClient is a thin client over the REST API, and
AzureBlobFileSystemStore is where the app logic lived, such as handling continuation tokens
or making multiple calls such as getting the etag to use it in a conditional request.  I think
we should stick to the original design and move this fancy logic to AzureBlobFileSystemStore
and expose an option on AbfsClient to take an optional condition (the etag) when creating
a file.  In this way, the update to the AbfsClient would be a single line to add the optional
"If-Match: E-Tag" request header, but there would be a new method in AzureBlobFileSystemStore
that implements the new conditional overwrite behavior.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


Mime
View raw message