hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] (HADOOP-14028) S3A block output streams don't clear temporary files
Date Tue, 31 Jan 2017 16:40:52 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847095#comment-15847095
] 

Steve Loughran commented on HADOOP-14028:
-----------------------------------------

log of the multple disk block put, no stream closed events, 
{code}
2017-01-31 16:36:50,459 [JUnit-test_010_CreateHugeFile] INFO  scale.AbstractSTestS3AHugeFiles
(AbstractSTestS3AHugeFiles.java:test_010_CreateHugeFile(188)) - [100%] Buffered 10.00 MB out
of 10 MB; PUT 0 bytes (8388608 pending) in 2 operations (1 active); elapsedTime=0.21s; write
to buffer bandwidth=48.40 MB/s
2017-01-31 16:36:50,459 [JUnit-test_010_CreateHugeFile] INFO  scale.AbstractSTestS3AHugeFiles
(AbstractSTestS3AHugeFiles.java:test_010_CreateHugeFile(203)) - Closing stream org.apache.hadoop.fs.FSDataOutputStream@7067fa8a
2017-01-31 16:36:50,460 [JUnit-test_010_CreateHugeFile] INFO  scale.AbstractSTestS3AHugeFiles
(AbstractSTestS3AHugeFiles.java:test_010_CreateHugeFile(204)) - Statistics : OutputStreamStatistics{blocksSubmitted=1,
blocksInQueue=0, blocksActive=1, blockUploadsCompleted=0, blockUploadsFailed=0, bytesPendingUpload=8388608,
bytesUploaded=0, blocksAllocated=2, blocksReleased=0, blocksActivelyAllocated=2, exceptionsInMultipartFinalize=0,
transferDuration=0 ms, queueDuration=3 ms, averageQueueTime=3 ms, totalUploadDuration=3 ms,
effectiveBandwidth=0.0 bytes/s}
2017-01-31 16:36:50,460 [JUnit-test_010_CreateHugeFile] DEBUG s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:close(329))
- S3ABlockOutputStream{{bucket=hwdev-steve-ireland-new, key='tests3ascale/scale/hugefile'},
blockSize=8388608, activeBlock=FileBlock{index=2, destFile=target/build/test/s3a/s3ablock-0002-2667023026802684004.tmp,
state=Writing, dataSize=2097152, limit=8388608}}: Closing block #2: current block= FileBlock{index=2,
destFile=target/build/test/s3a/s3ablock-0002-2667023026802684004.tmp, state=Writing, dataSize=2097152,
limit=8388608}
2017-01-31 16:36:50,460 [JUnit-test_010_CreateHugeFile] DEBUG s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:uploadCurrentBlock(298))
- Writing block # 2
2017-01-31 16:36:50,460 [JUnit-test_010_CreateHugeFile] DEBUG s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:uploadBlockAsync(469))
- Queueing upload of FileBlock{index=2, destFile=target/build/test/s3a/s3ablock-0002-2667023026802684004.tmp,
state=Writing, dataSize=2097152, limit=8388608}
2017-01-31 16:36:50,460 [JUnit-test_010_CreateHugeFile] DEBUG s3a.S3ADataBlocks (S3ADataBlocks.java:startUpload(260))
- Start datablock[2] upload
2017-01-31 16:36:50,460 [JUnit-test_010_CreateHugeFile] DEBUG s3a.S3ADataBlocks (S3ADataBlocks.java:enterState(167))
- FileBlock{index=2, destFile=target/build/test/s3a/s3ablock-0002-2667023026802684004.tmp,
state=Writing, dataSize=2097152, limit=8388608}: entering state Upload
2017-01-31 16:36:50,461 [JUnit-test_010_CreateHugeFile] DEBUG s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:clearActiveBlock(209))
- Clearing active block
2017-01-31 16:36:50,461 [s3a-transfer-shared-pool1-t3] DEBUG s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:call(490))
- Uploading part 2 for id 'i8AV2cnBFI5JymBdObYinUZh9Fl.JF.DN9w91B9UtOq08Y02Jj1W3uNGyBYLUg5VmNscz3xzMr5naAh6R.O.Ew--'
2017-01-31 16:36:50,461 [JUnit-test_010_CreateHugeFile] DEBUG s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:waitForAllPartUploads(511))
- Waiting for 2 uploads to complete
2017-01-31 16:36:50,462 [s3a-transfer-shared-pool1-t3] DEBUG amazonaws.request (AmazonHttpClient.java:executeOneRequest(1046))
- Sending Request: PUT https://hwdev-steve-ireland-new.s3-eu-west-1.amazonaws.com /tests3ascale/scale/hugefile
Parameters: ({"uploadId":["i8AV2cnBFI5JymBdObYinUZh9Fl.JF.DN9w91B9UtOq08Y02Jj1W3uNGyBYLUg5VmNscz3xzMr5naAh6R.O.Ew--"],"partNumber":["2"]}Headers:
(User-Agent: STestS3AHugeFileCreate, Hadoop 2.9.0-SNAPSHOT, aws-sdk-java/1.11.45 Mac_OS_X/10.12.2
Java_HotSpot(TM)_64-Bit_Server_VM/25.102-b14/1.8.0_102, amz-sdk-invocation-id: 94996caa-80e5-a609-d1e8-edf313bc06a2,
Content-Length: 2097152, Content-Type: application/octet-stream, ) 
2017-01-31 16:36:50,462 [s3a-transfer-shared-pool1-t3] DEBUG auth.AWS4Signer (CommonsLog.java:debug(33))
- AWS4 Canonical Request: '"PUT
/tests3ascale/scale/hugefile
partNumber=2&uploadId=i8AV2cnBFI5JymBdObYinUZh9Fl.JF.DN9w91B9UtOq08Y02Jj1W3uNGyBYLUg5VmNscz3xzMr5naAh6R.O.Ew--
amz-sdk-invocation-id:94996caa-80e5-a609-d1e8-edf313bc06a2
amz-sdk-retry:0/0/500
content-length:2097152
content-type:application/octet-stream
host:hwdev-steve-ireland-new.s3-eu-west-1.amazonaws.com
user-agent:STestS3AHugeFileCreate, Hadoop 2.9.0-SNAPSHOT, aws-sdk-java/1.11.45 Mac_OS_X/10.12.2
Java_HotSpot(TM)_64-Bit_Server_VM/25.102-b14/1.8.0_102
x-amz-content-sha256:UNSIGNED-PAYLOAD
x-amz-date:20170131T163650Z

amz-sdk-invocation-id;amz-sdk-retry;content-length;content-type;host;user-agent;x-amz-content-sha256;x-amz-date
UNSIGNED-PAYLOAD"
2017-01-31 16:36:50,463 [s3a-transfer-shared-pool1-t3] DEBUG auth.AWS4Signer (CommonsLog.java:debug(33))
- AWS4 String to Sign: '"AWS4-HMAC-SHA256
20170131T163650Z
20170131/eu-west-1/s3/aws4_request
df85cf8cefd073de4a259d0eb4a5d22242ede33b3b402ccf73e7a43c81ac9f31"
2017-01-31 16:36:53,370 [s3a-transfer-shared-pool1-t3] DEBUG amazonaws.request (AwsResponseHandlerAdapter.java:handle(87))
- Received successful response: 200, AWS Request ID: 1B4A1673E51E1FE9
2017-01-31 16:36:53,370 [s3a-transfer-shared-pool1-t3] DEBUG amazonaws.requestId (AwsResponseHandlerAdapter.java:logHeaderRequestId(136))
- x-amzn-RequestId: not available
2017-01-31 16:36:53,370 [s3a-transfer-shared-pool1-t3] DEBUG amazonaws.requestId (AwsResponseHandlerAdapter.java:logResponseRequestId(156))
- AWS Request ID: 1B4A1673E51E1FE9
2017-01-31 16:36:53,372 [s3a-transfer-shared-pool1-t3] DEBUG s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:call(494))
- Completed upload of FileBlock{index=2, destFile=target/build/test/s3a/s3ablock-0002-2667023026802684004.tmp,
state=Upload, dataSize=2097152, limit=8388608}
2017-01-31 16:36:53,372 [s3a-transfer-shared-pool1-t3] DEBUG s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:call(495))
- Stream statistics of OutputStreamStatistics{blocksSubmitted=2, blocksInQueue=0, blocksActive=1,
blockUploadsCompleted=1, blockUploadsFailed=0, bytesPendingUpload=5242880, bytesUploaded=5242880,
blocksAllocated=2, blocksReleased=0, blocksActivelyAllocated=2, exceptionsInMultipartFinalize=0,
transferDuration=2909 ms, queueDuration=4 ms, averageQueueTime=2 ms, totalUploadDuration=2913
ms, effectiveBandwidth=1799821.4898729832 bytes/s}
2017-01-31 16:36:53,372 [s3a-transfer-shared-pool1-t3] DEBUG s3a.S3ADataBlocks (S3ADataBlocks.java:enterState(167))
- FileBlock{index=2, destFile=target/build/test/s3a/s3ablock-0002-2667023026802684004.tmp,
state=Upload, dataSize=2097152, limit=8388608}: entering state Closed
2017-01-31 16:36:53,372 [s3a-transfer-shared-pool1-t3] DEBUG s3a.S3ADataBlocks (S3ADataBlocks.java:close(282))
- Closed FileBlock{index=2, destFile=target/build/test/s3a/s3ablock-0002-2667023026802684004.tmp,
state=Closed, dataSize=2097152, limit=8388608}
2017-01-31 16:36:53,372 [s3a-transfer-shared-pool1-t3] DEBUG s3a.S3ADataBlocks (S3ADataBlocks.java:innerClose(811))
- Closing FileBlock{index=2, destFile=target/build/test/s3a/s3ablock-0002-2667023026802684004.tmp,
state=Closed, dataSize=2097152, limit=8388608}
2017-01-31 16:36:57,131 [s3a-transfer-shared-pool1-t2] DEBUG amazonaws.request (AwsResponseHandlerAdapter.java:handle(87))
- Received successful response: 200, AWS Request ID: 9D1707B157420F81
2017-01-31 16:36:57,132 [s3a-transfer-shared-pool1-t2] DEBUG amazonaws.requestId (AwsResponseHandlerAdapter.java:logHeaderRequestId(136))
- x-amzn-RequestId: not available
2017-01-31 16:36:57,132 [s3a-transfer-shared-pool1-t2] DEBUG amazonaws.requestId (AwsResponseHandlerAdapter.java:logResponseRequestId(156))
- AWS Request ID: 9D1707B157420F81
2017-01-31 16:36:57,132 [s3a-transfer-shared-pool1-t2] DEBUG s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:call(494))
- Completed upload of FileBlock{index=1, destFile=target/build/test/s3a/s3ablock-0001-8866456036483591474.tmp,
state=Upload, dataSize=8388608, limit=8388608}
2017-01-31 16:36:57,132 [s3a-transfer-shared-pool1-t2] DEBUG s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:call(495))
- Stream statistics of OutputStreamStatistics{blocksSubmitted=2, blocksInQueue=0, blocksActive=0,
blockUploadsCompleted=2, blockUploadsFailed=0, bytesPendingUpload=0, bytesUploaded=10485760,
blocksAllocated=2, blocksReleased=0, blocksActivelyAllocated=2, exceptionsInMultipartFinalize=0,
transferDuration=9587 ms, queueDuration=4 ms, averageQueueTime=2 ms, totalUploadDuration=9591
ms, effectiveBandwidth=1093291.6275675113 bytes/s}
2017-01-31 16:36:57,132 [s3a-transfer-shared-pool1-t2] DEBUG s3a.S3ADataBlocks (S3ADataBlocks.java:enterState(167))
- FileBlock{index=1, destFile=target/build/test/s3a/s3ablock-0001-8866456036483591474.tmp,
state=Upload, dataSize=8388608, limit=8388608}: entering state Closed
2017-01-31 16:36:57,132 [s3a-transfer-shared-pool1-t2] DEBUG s3a.S3ADataBlocks (S3ADataBlocks.java:close(282))
- Closed FileBlock{index=1, destFile=target/build/test/s3a/s3ablock-0001-8866456036483591474.tmp,
state=Closed, dataSize=8388608, limit=8388608}

// ** This is where the owner block is being closed **

2017-01-31 16:36:57,132 [s3a-transfer-shared-pool1-t2] DEBUG s3a.S3ADataBlocks (S3ADataBlocks.java:innerClose(811))
- Closing FileBlock{index=1, destFile=target/build/test/s3a/s3ablock-0001-8866456036483591474.tmp,
state=Closed, dataSize=8388608, limit=8388608}
2017-01-31 16:36:57,133 [JUnit-test_010_CreateHugeFile] DEBUG s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:complete(550))
- Completing multi-part upload for key 'tests3ascale/scale/hugefile', id 'i8AV2cnBFI5JymBdObYinUZh9Fl.JF.DN9w91B9UtOq08Y02Jj1W3uNGyBYLUg5VmNscz3xzMr5naAh6R.O.Ew--'
with 2 partitions 
2017-01-31 16:36:57,135 [JUnit-test_010_CreateHugeFile] DEBUG amazonaws.request (AmazonHttpClient.java:executeOneRequest(1046))
- Sending Request: POST https://hwdev-steve-ireland-new.s3-eu-west-1.amazonaws.com /tests3ascale/scale/hugefile
Parameters: ({"uploadId":["i8AV2cnBFI5JymBdObYinUZh9Fl.JF.DN9w91B9UtOq08Y02Jj1W3uNGyBYLUg5VmNscz3xzMr5naAh6R.O.Ew--"]}Headers:
(User-Agent: STestS3AHugeFileCreate, Hadoop 2.9.0-SNAPSHOT, aws-sdk-java/1.11.45 Mac_OS_X/10.12.2
Java_HotSpot(TM)_64-Bit_Server_VM/25.102-b14/1.8.0_102, amz-sdk-invocation-id: aa9cda11-857e-fbe5-86b4-05640be1e7b0,
Content-Length: 219, Content-Type: application/xml, ) 
2017-01-31 16:36:57,136 [JUnit-test_010_CreateHugeFile] DEBUG auth.AWS4Signer (CommonsLog.java:debug(33))
- AWS4 Canonical Request: '"POST
/tests3ascale/scale/hugefile
{code}

> S3A block output streams don't clear temporary files
> ----------------------------------------------------
>
>                 Key: HADOOP-14028
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14028
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.0.0-alpha2
>         Environment: JDK 8 + ORC 1.3.0 + hadoop-aws 3.0.0-alpha2
>            Reporter: Seth Fitzsimmons
>            Assignee: Steve Loughran
>
> I have `fs.s3a.fast.upload` enabled with 3.0.0-alpha2 (it's exactly what I was looking
for after running into the same OOM problems) and don't see it cleaning up the disk-cached
blocks.
> I'm generating a ~50GB file on an instance with ~6GB free when the process starts. My
expectation is that local copies of the blocks would be deleted after those parts finish uploading,
but I'm seeing more than 15 blocks in /tmp (and none of them have been deleted thus far).
> I see that DiskBlock deletes temporary files when closed, but is it closed after individual
blocks have finished uploading or when the entire file has been fully written to the FS (full
upload completed, including all parts)?
> As a temporary workaround to avoid running out of space, I'm listing files, sorting by
atime, and deleting anything older than the first 20: `ls -ut | tail -n +21 | xargs rm`
> Steve Loughran says:
> > They should be deleted as soon as the upload completes; the close() call that the
AWS httpclient makes on the input stream triggers the deletion. Though there aren't tests
for it, as I recall.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


Mime
View raw message