hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13560) S3A to support huge file writes and operations -with tests
Date Thu, 08 Sep 2016 12:42:20 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15473767#comment-15473767 ]

Steve Loughran commented on HADOOP-13560:

New code adds stats too; here is the tail end of a 2000 MB upload (2097152000 bytes in 400 blocks of 5 MB):
Statistics: OutputStreamStatistics{blocksSubmitted=400, blocksInQueue=0, blocksActive=0,
blockUploadsCompleted=400, blockUploadsFailed=0, bytesPendingUpload=0, bytesUploaded=2097152000,
transferDuration=13308123 ms, queueDuration=26935221 ms, averageQueueTime=67338 ms,
totalUploadDuration=40243344 ms, effectiveBandwidth=52111.77281887907 bytes/s}

That bandwidth measure is low because it includes queue time, and when there are many blocks from
the same stream, the aggregate queue time is pretty high. The real bandwidth here is:
2016-09-08 13:35:52,185 [JUnit] INFO  scale.AbstractSTestS3AHugeFiles (AbstractSTestS3AHugeFiles.java:test_010_CreateHugeFile(157))
- Time per MB to write = 669,477,018 nS
2016-09-08 13:35:52,186 [JUnit] INFO  scale.AbstractSTestS3AHugeFiles (AbstractSTestS3AHugeFiles.java:test_010_CreateHugeFile(159))
- Effective Bandwidth: 1.5662613821040523 MB/s
2016-09-08 13:35:52,186 [JUnit] INFO  scale.AbstractSTestS3AHugeFiles (AbstractSTestS3AHugeFiles.java:test_010_CreateHugeFile(162))
- PUT 2097152000 bytes in 400 operations; 5 MB/operation
2016-09-08 13:35:52,186 [JUnit] INFO  scale.AbstractSTestS3AHugeFiles (AbstractSTestS3AHugeFiles.java:test_010_CreateHugeFile(165))
- Time per PUT 3,347,385,091 nS
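
As a sanity check on the figures above, here is a minimal sketch that recomputes both bandwidth numbers from the quoted values. It is not part of the patch: the class and method names are hypothetical, the wall-clock time is assumed to be the time per PUT multiplied by the number of PUTs, and "MB/s" in the test log is assumed to mean 10^6 bytes per second, since that reading reproduces the logged value.

```java
/** Cross-check of the figures quoted above; not part of the HADOOP-13560 patch. */
final class UploadBandwidthCheck {

    // Values copied from the statistics dump and the test log.
    static final long BYTES_UPLOADED = 2_097_152_000L;
    static final int PUT_OPERATIONS = 400;
    static final long TRANSFER_DURATION_MS = 13_308_123L;
    static final long QUEUE_DURATION_MS = 26_935_221L;
    static final long TOTAL_UPLOAD_DURATION_MS = 40_243_344L;
    static final long NANOS_PER_PUT = 3_347_385_091L;

    /** Bytes/s over the summed per-block time (transfer plus queue), as in the statistics. */
    static double aggregateBytesPerSecond() {
        return BYTES_UPLOADED * 1000.0 / TOTAL_UPLOAD_DURATION_MS;
    }

    /** Assumption: wall-clock time is (time per PUT) x (number of PUTs). */
    static double wallClockSeconds() {
        return PUT_OPERATIONS * (NANOS_PER_PUT / 1e9);
    }

    /** Bandwidth over wall-clock time, in units of 10^6 bytes per second. */
    static double wallClockMegabytesPerSecond() {
        return BYTES_UPLOADED / wallClockSeconds() / 1e6;
    }

    public static void main(String[] args) {
        // Transfer and queue durations add up to the reported total.
        System.out.println("transfer + queue (ms) = "
            + (TRANSFER_DURATION_MS + QUEUE_DURATION_MS));
        System.out.println("aggregate bytes/s     = " + aggregateBytesPerSecond());
        System.out.println("wall-clock seconds    = " + wallClockSeconds());
        System.out.println("wall-clock MB/s       = " + wallClockMegabytesPerSecond());
    }
}
```

The summed per-block durations come to roughly 40,000 seconds, while the wall-clock estimate is roughly 1,340 seconds. The blocks queue and upload in parallel, so dividing by the summed time understates the throughput of the stream as a whole.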
The queue duration also includes the time that the thread generating the output is blocked awaiting
submission of work. As that submission is happening in a synchronized block, I worry that this
blocking will make this output stream (and the FastOutputStream) something that can't be interrupted
easily. Does that matter?
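
To illustrate the general JVM behaviour behind that worry, here is a minimal sketch. It uses no S3A classes, and every name in it is hypothetical. A thread waiting to enter a `synchronized` block does not react to `Thread.interrupt()`, whereas a thread waiting in `ReentrantLock.lockInterruptibly()` does. Whether the thread that is actually blocked on submission can be interrupted depends on how the submission itself blocks, which this sketch does not model.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Illustration only: compares interrupt handling of monitor entry and lockInterruptibly(). */
final class InterruptibilitySketch {

    /** Returns true if a thread waiting to enter a synchronized block got in after an interrupt. */
    static boolean monitorWaiterRespondsToInterrupt() {
        Object monitor = new Object();
        CountDownLatch entered = new CountDownLatch(1);
        Thread waiter = new Thread(() -> {
            synchronized (monitor) {      // blocks while the caller holds the monitor
                entered.countDown();
            }
        });
        try {
            boolean woke;
            synchronized (monitor) {      // stand-in for a submitter stuck inside its sync block
                waiter.start();
                waiter.interrupt();       // only sets the flag; monitor entry ignores it
                woke = entered.await(500, TimeUnit.MILLISECONDS);
            }
            waiter.join();                // monitor released, so the waiter can now finish
            return woke;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if a thread waiting in lockInterruptibly() saw an InterruptedException. */
    static boolean lockWaiterRespondsToInterrupt() {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch interrupted = new CountDownLatch(1);
        Thread waiter = new Thread(() -> {
            try {
                lock.lockInterruptibly(); // throws if interrupted before or during the wait
                lock.unlock();
            } catch (InterruptedException e) {
                interrupted.countDown();
            }
        });
        try {
            boolean woke;
            lock.lock();
            try {
                waiter.start();
                waiter.interrupt();
                woke = interrupted.await(5, TimeUnit.SECONDS);
            } finally {
                lock.unlock();
            }
            waiter.join();
            return woke;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("synchronized waiter woke on interrupt: "
            + monitorWaiterRespondsToInterrupt());
        System.out.println("lockInterruptibly waiter woke on interrupt: "
            + lockWaiterRespondsToInterrupt());
    }
}
```

If interruptibility does turn out to matter, one option under these assumptions would be to guard the submission with an explicit lock acquired interruptibly rather than with a monitor, at the cost of more verbose locking code.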

> S3A to support huge file writes and operations -with tests
> ----------------------------------------------------------
>                 Key: HADOOP-13560
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13560
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.9.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>         Attachments: HADOOP-13560-branch-2-001.patch, HADOOP-13560-branch-2-002.patch
> An AWS SDK [issue|https://github.com/aws/aws-sdk-java/issues/367] highlights that metadata
> isn't copied on large copies.
> 1. Add a test to do that large copy/rename and verify that the copy really works.
> 2. Verify that metadata makes it over.
> Verifying large file rename is important on its own, as it is needed for very large commit
> operations for committers using rename.
