hadoop-common-issues mailing list archives

From "Rajesh Balamohan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14081) S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)
Date Fri, 17 Feb 2017 11:27:41 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871711#comment-15871711 ]

Rajesh Balamohan commented on HADOOP-14081:

Thanks [~stevel@apache.org].  Here are the test results (region: S3 bucket in US East; tests
were run from my laptop). The errors are due to socket timeouts (180 seconds). I also checked the
error in ITestS3AContractGetFileStatus.teardown, which was again due to a socket timeout.

Results :

Tests in error:
» SocketTimeout
» PathIO
» SocketTimeout

Tests run: 454, Failures: 0, Errors: 4, Skipped: 56

[INFO] Total time: 02:11 h 

> S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)
> --------------------------------------------------------------------------
>                 Key: HADOOP-14081
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14081
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>            Priority: Minor
>         Attachments: HADOOP-14081.001.patch
> In {{S3ADataBlocks::ByteArrayBlock}}, data is copied whenever {{startUpload}} is called.
> It might be possible to directly access the byte[] array from ByteArrayOutputStream.
> Might have to extend ByteArrayOutputStream and create a method like getInputStream()
> which can return ByteArrayInputStream.  This would avoid expensive array copy during large
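
A minimal sketch of the idea in the description above, not the attached patch. The class name
DirectAccessByteArrayOutputStream is hypothetical; only getInputStream() comes from the
description. It relies on the protected buf and count fields of java.io.ByteArrayOutputStream,
and it assumes the caller stops treating the old contents as writable once the stream is handed off.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

/**
 * Hypothetical subclass (name is an assumption) that exposes the
 * internal buffer as an input stream without copying it.
 */
class DirectAccessByteArrayOutputStream extends ByteArrayOutputStream {

    DirectAccessByteArrayOutputStream(int initialSize) {
        super(initialSize);
    }

    /**
     * Wrap the bytes written so far in a ByteArrayInputStream.
     * Unlike toByteArray(), this does not allocate and copy a new array.
     */
    synchronized ByteArrayInputStream getInputStream() {
        // buf and count are protected fields of ByteArrayOutputStream.
        ByteArrayInputStream in = new ByteArrayInputStream(this.buf, 0, this.count);
        // Detach the buffer so later writes to this stream cannot
        // modify the bytes that the returned input stream is reading.
        this.buf = new byte[0];
        this.count = 0;
        return in;
    }
}
```

After getInputStream() returns, the output stream is empty and the returned stream owns the
original array, so the two no longer share state.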
