hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13912) S3a Multipart Committer (avoid rename)
Date Tue, 03 Jan 2017 13:38:58 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-13912:
------------------------------------
    Issue Type: New Feature  (was: Sub-task)
        Parent:     (was: HADOOP-13204)

> S3a Multipart Committer (avoid rename)
> --------------------------------------
>
>                 Key: HADOOP-13912
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13912
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3
>            Reporter: Thomas Demoor
>            Assignee: Thomas Demoor
>
> Object stores do not have an efficient rename operation, yet the Hadoop FileOutputCommitter
relies on rename to atomically promote the "winning" attempt out of the multiple (speculative)
attempts to the final path. These slow job commits are one of the main friction points when
using object stores in Hadoop. There have been several attempts at resolving this (HADOOP-9565,
Apache Spark DirectOutputCommitters, ...), but they have proven not to be robust in the face of
adversity (network partitions, ...).
> This ticket proposes to do the atomic commit using the S3 Multipart API, which allows
multiple concurrent uploads to the same object name, each in its own "temporary space"
identified by the UploadId returned in the response to InitiateMultipartUpload. Every
attempt writes directly to the final {{outputPath}}. Data is uploaded using Put Part, and the
ETag returned for each part is stored. The CompleteMultipartUpload call is postponed.
Instead, we persist the UploadId and the ETags (in a _temporary subdir or elsewhere). When
a certain "job" wins, {{CompleteMultipartUpload}} is called for each of its files using the
proper list of part ETags.
> Completing a MultipartUpload is a metadata-only operation (internally in S3) and is thus
orders of magnitude faster than the rename-based approach, which moves all the data.
> Required work: 
> * Expose the multipart initiate and complete calls in S3AOutputStream to S3AFileSystem

> * Use these multipart calls in a custom committer as described above. I propose to build
on the S3ACommitter that [~stevel@apache.org] is developing for HADOOP-13786.
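
The commit protocol described in the ticket can be sketched with a toy in-memory model. This
is only an illustration under stated assumptions, not the S3A implementation and not the AWS
SDK: ToyObjectStore, run_attempt and commit_job are hypothetical names, the "ETag" is just a
SHA-256 digest of the part, and aborting the losing uploads is an added assumption that the
ticket itself does not mention.

```python
import hashlib
import uuid


class ToyObjectStore:
    """In-memory stand-in for an object store with multipart uploads (not the S3 API)."""

    def __init__(self):
        self.objects = {}  # key -> bytes; only completed uploads are visible here
        self.uploads = {}  # upload_id -> (key, {part_number: bytes})

    def initiate_multipart_upload(self, key):
        # Each upload gets its own "temporary space", identified by the upload id.
        upload_id = uuid.uuid4().hex
        self.uploads[upload_id] = (key, {})
        return upload_id

    def upload_part(self, upload_id, part_number, data):
        _, parts = self.uploads[upload_id]
        parts[part_number] = data
        return hashlib.sha256(data).hexdigest()  # toy ETag for this part

    def complete_multipart_upload(self, upload_id, part_etags):
        key, parts = self.uploads[upload_id]
        # Check the caller's ETag list before making anything visible.
        for number, etag in part_etags:
            if hashlib.sha256(parts[number]).hexdigest() != etag:
                raise ValueError(f"ETag mismatch for part {number}")
        # No data is moved or re-uploaded here: the stored parts simply become the object.
        self.objects[key] = b"".join(parts[n] for n, _ in sorted(part_etags))
        del self.uploads[upload_id]

    def abort_multipart_upload(self, upload_id):
        self.uploads.pop(upload_id, None)


def run_attempt(store, key, chunks):
    """Upload one attempt's data to the final key, but postpone completion.

    Returns the record a committer would persist: the upload id and part ETags.
    """
    upload_id = store.initiate_multipart_upload(key)
    parts = [
        (number, store.upload_part(upload_id, number, chunk))
        for number, chunk in enumerate(chunks, start=1)
    ]
    return {"key": key, "upload_id": upload_id, "parts": parts}


def commit_job(store, winner, losers):
    """Complete the winning attempt's upload and discard the others."""
    store.complete_multipart_upload(winner["upload_id"], winner["parts"])
    for pending in losers:
        store.abort_multipart_upload(pending["upload_id"])  # assumption: clean up losers


store = ToyObjectStore()
attempt_a = run_attempt(store, "out/part-0000", [b"hello ", b"world"])
attempt_b = run_attempt(store, "out/part-0000", [b"HELLO ", b"WORLD"])
# Both attempts targeted the same final key, yet nothing is visible until commit.
commit_job(store, winner=attempt_a, losers=[attempt_b])
```

In this sketch the two speculative attempts never interfere with each other, because each
writes under its own upload id, and the final object appears only when commit_job completes
the winner's upload.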




