hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13560) S3ABlockOutputStream to support huge (many GB) file writes
Date Tue, 04 Oct 2016 20:10:20 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546509#comment-15546509
] 

Steve Loughran commented on HADOOP-13560:
-----------------------------------------

think I've addressed most of them. Realised that commenting in github makes replying to them
one by one easily.

anyway, to most of the anwers are "yes, you are write, fixed"

one exception, why {{ should DiskBlock#startUpload wrap the returned stream in a BufferedInputStream)}}.


no, because if there is a transient PUT failure and AWS tries to replay the PUT, it can't
roll back the buffered input stream back to the beginning of the file.; only back the beginning
of the last buffered area. This is something that surfaced in the huge file tests.

> S3ABlockOutputStream to support huge (many GB) file writes
> ----------------------------------------------------------
>
>                 Key: HADOOP-13560
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13560
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.9.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>         Attachments: HADOOP-13560-branch-2-001.patch, HADOOP-13560-branch-2-002.patch,
HADOOP-13560-branch-2-003.patch, HADOOP-13560-branch-2-004.patch
>
>
> An AWS SDK [issue|https://github.com/aws/aws-sdk-java/issues/367] highlights that metadata
isn't copied on large copies.
> 1. Add a test to do that large copy/rname and verify that the copy really works
> 2. Verify that metadata makes it over.
> Verifying large file rename is important on its own, as it is needed for very large commit
operations for committers using rename



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


Mime
View raw message