Mailing-List: contact common-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Date: Tue, 29 Nov 2016 19:51:58 +0000 (UTC)
From: "Steve Loughran (JIRA)" <jira@apache.org>
To: common-issues@hadoop.apache.org
Message-ID: <JIRA.13017346.1478110604000.393327.1480449118662@Atlassian.JIRA>
In-Reply-To: <JIRA.13017346.1478110604000@Atlassian.JIRA>
References: <JIRA.13017346.1478110604000@Atlassian.JIRA> <JIRA.13017346.1478110604662@arcas>
Subject: [jira] [Commented] (HADOOP-13786) add output committer which uses
 s3guard for consistent O(1) commits to S3
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Tue, 29 Nov 2016 19:52:00 -0000


    [ https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706351#comment-15706351 ] 

Steve Loughran commented on HADOOP-13786:
-----------------------------------------

For the curious, my WiP, in the s3guard branch, is here: https://github.com/steveloughran/hadoop/tree/s3guard/HADOOP-13786-committer

I've been tweaking the mapreduce code to add ability to switch to a subclass of FileOutputCommitter in FileOutputFormat (And transitively, all children), by way of a factory. The S3a factory dynamically chooses the committer based on the destination FS. There's currently no difference in the codebase apart from logging of operation durations. This means that we can switch the committer behind all output formats to using the s3a one (and also allowing anyone else to do the same with their own FOF committer subclass). 

I can see that ParquetOutputFormat has its own committer, {{ParquetOutputCommitter}}, as used by {{ParquetOutputFormat}} and so, I believe spark's {{dataframe.parquet.write()}} pipeline; my IDE isn't highlighting much else.

Now, could I get away with just modifying the base FOF committer?


for exception reporting it could use {{FileSystem.rename(final Path src, final Path dst, final Rename... options)}}; S3A to impl that with exceptions. That's the "transitory" method for use between FS and FileContext, which does raise exceptions; the default is non-atomic and eventually gets to {{rename(src, dest)}}, except for HDFS which does a full implementation. But: what if the semantics of that rename() (not yet in the FS spec, AFAIK) are different from what callers expect in some of the corner cases? And so commits everywhere break? That would not be good. And it would remove the ability to use any private s3a/s3guard calls if we wanted some (e.g. get a lock on a directory), unless they went in to the standard FS APIs.

Making the output factory pluggable would avoid such problems, though logging duration might be nice all round. That said, given the time to rename, its somewhat unimportant when using real filesystems.


> add output committer which uses s3guard for consistent O(1) commits to S3
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-13786
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13786
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>
> A goal of this code is "support O(1) commits to S3 repositories in the presence of failures". Implement it, including whatever is needed to demonstrate the correctness of the algorithm. (that is, assuming that s3guard provides a consistent view of the presence/absence of blobs, show that we can commit directly).
> I consider ourselves free to expose the blobstore-ness of the s3 output streams (ie. not visible until the close()), if we need to use that to allow us to abort commit operations.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org