hadoop-common-issues mailing list archives

From "wujinhu (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HADOOP-15607) AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream
Date Mon, 23 Jul 2018 08:06:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16552455#comment-16552455
] 

wujinhu edited comment on HADOOP-15607 at 7/23/18 8:05 AM:
-----------------------------------------------------------

Thanks [~Sammi] [~uncleGen] for your comments.

As the code shows, uploadCurrentPart() increments blockId after submitting the upload task to the thread pool. If the task runs later than the main thread, blockId may already have been incremented by the time the task reads it, so two tasks can end up using the same part number.

Changing the type of blockFiles fixes another problem: we cannot delete files in write(), because they may still be in use by upload threads, so we need to track the upload files by blockId.

I have tried to add a unit test to reproduce this issue on my Mac, but failed. It seems difficult to reproduce locally; I have only hit it occasionally while debugging the old code.
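The race described above can be sketched in isolation (a minimal, hypothetical model, not the Hadoop code itself). A latch holds the submitted tasks back until every increment of the shared counter has happened, which deterministically forces the behaviour that is only intermittent in practice: every task reads the final value of blockId, so all "parts" collide on one number.

```java
import java.util.*;
import java.util.concurrent.*;

public class BlockIdRace {
    static int blockId = 0;   // stand-in for the stream's shared counter

    // Submits `parts` tasks the way uploadCurrentPart() does: each lambda reads
    // the shared blockId when it RUNS, while the main thread keeps incrementing.
    static Set<Integer> racyPartNumbers(int parts) throws Exception {
        blockId = 0;
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        CountDownLatch gate = new CountDownLatch(1);  // delays tasks until all submits finish
        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 0; i < parts; i++) {
            futures.add(pool.submit(() -> { gate.await(); return blockId + 1; }));
            blockId++;   // main thread moves on before any task has run
        }
        gate.countDown();  // now let the tasks read blockId
        Set<Integer> seen = new HashSet<>();
        for (Future<Integer> f : futures) seen.add(f.get());
        pool.shutdown();
        return seen;   // distinct part numbers observed by the tasks
    }

    public static void main(String[] args) throws Exception {
        // All three tasks observed blockId == 3, so they all compute part number 4:
        System.out.println(racyPartNumbers(3));   // prints [4]
    }
}
```

With the latch removed, whether part numbers collide depends on thread scheduling, which is why the issue is hard to reproduce in a unit test.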



> AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream 
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-15607
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15607
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.10.0, 2.9.1, 3.2.0, 3.1.1, 3.0.3
>            Reporter: wujinhu
>            Assignee: wujinhu
>            Priority: Major
>         Attachments: HADOOP-15607.001.patch, HADOOP-15607.002.patch
>
>
> When I generated data with the hive-tpcds tool, I got the exception below:
> 2018-07-16 14:50:43,680 INFO mapreduce.Job: Task Id : attempt_1531723399698_0001_m_000052_0,
Status : FAILED
>  Error: com.aliyun.oss.OSSException: The list of parts was not in ascending order. Parts
list must specified in order by part number.
>  [ErrorCode]: InvalidPartOrder
>  [RequestId]: 5B4C40425FCC208D79D1EAF5
>  [HostId]: 100.103.0.137
>  [ResponseError]:
>  <?xml version="1.0" encoding="UTF-8"?>
>  <Error>
>  <Code>InvalidPartOrder</Code>
>  <Message>The list of parts was not in ascending order. Parts list must specified
in order by part number.</Message>
>  <RequestId>5B4C40425FCC208D79D1EAF5</RequestId>
>  <HostId>xx.xx.xx.xx</HostId>
>  <ErrorDetail>current PartNumber 3, you given part number 3is not in ascending
order</ErrorDetail>
>  </Error>
> at com.aliyun.oss.common.utils.ExceptionFactory.createOSSException(ExceptionFactory.java:99)
>  at com.aliyun.oss.internal.OSSErrorResponseHandler.handle(OSSErrorResponseHandler.java:69)
>  at com.aliyun.oss.common.comm.ServiceClient.handleResponse(ServiceClient.java:248)
>  at com.aliyun.oss.common.comm.ServiceClient.sendRequestImpl(ServiceClient.java:130)
>  at com.aliyun.oss.common.comm.ServiceClient.sendRequest(ServiceClient.java:68)
>  at com.aliyun.oss.internal.OSSOperation.send(OSSOperation.java:94)
>  at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:149)
>  at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:113)
>  at com.aliyun.oss.internal.OSSMultipartOperation.completeMultipartUpload(OSSMultipartOperation.java:185)
>  at com.aliyun.oss.OSSClient.completeMultipartUpload(OSSClient.java:790)
>  at org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystemStore.completeMultipartUpload(AliyunOSSFileSystemStore.java:643)
>  at org.apache.hadoop.fs.aliyun.oss.AliyunOSSBlockOutputStream.close(AliyunOSSBlockOutputStream.java:120)
>  at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
>  at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:106)
>  at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.close(MultipleOutputs.java:574)
>  at org.notmysock.tpcds.GenTable$DSDGen.cleanup(GenTable.java:169)
>  at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:149)
>  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686)
>  
> I reviewed the code below; {{blockId}} has a thread-synchronization problem:
> {code:java}
> private void uploadCurrentPart() throws IOException {
>   blockFiles.add(blockFile);
>   blockStream.flush();
>   blockStream.close();
>   if (blockId == 0) {
>     uploadId = store.getUploadId(key);
>   }
>   ListenableFuture<PartETag> partETagFuture =
>       executorService.submit(() -> {
>         PartETag partETag = store.uploadPart(blockFile, key, uploadId,
>             blockId + 1);
>         return partETag;
>       });
>   partETagsFutures.add(partETagFuture);
>   blockFile = newBlockFile();
>   blockId++;
>   blockStream = new BufferedOutputStream(new FileOutputStream(blockFile));
> }
> {code}
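One way the increment can be made race-free is to snapshot the part number into an effectively-final local before submitting the task, so the later blockId++ on the main thread cannot affect what the task uploads. The sketch below stubs out the store and PartETag types; all names are hypothetical and this is not necessarily what the attached patch does:

```java
import java.util.*;
import java.util.concurrent.*;

public class UploadSketch {
    // Minimal stand-in for com.aliyun.oss PartETag: just records the part number.
    static class PartETag {
        final int partNumber;
        PartETag(int partNumber) { this.partNumber = partNumber; }
    }

    private final ExecutorService executorService = Executors.newFixedThreadPool(2);
    private final List<Future<PartETag>> partETagsFutures = new ArrayList<>();
    private int blockId = 0;

    // Race-free variant of uploadCurrentPart(): the part number is copied into an
    // effectively-final local BEFORE the task is submitted.
    void uploadCurrentPart() {
        final int partNumber = blockId + 1;          // snapshot taken on the main thread
        partETagsFutures.add(executorService.submit(
            () -> new PartETag(partNumber)));        // stand-in for store.uploadPart(...)
        blockId++;                                   // no longer visible to the task
    }

    // Collects the part numbers the tasks actually used, in submission order.
    List<Integer> partNumbers() throws Exception {
        List<Integer> out = new ArrayList<>();
        for (Future<PartETag> f : partETagsFutures) out.add(f.get().partNumber);
        executorService.shutdown();
        return out;
    }

    public static void main(String[] args) throws Exception {
        UploadSketch s = new UploadSketch();
        for (int i = 0; i < 3; i++) s.uploadCurrentPart();
        System.out.println(s.partNumbers());         // prints [1, 2, 3]
    }
}
```

Because each lambda captures its own immutable copy of the part number, the parts stay in strictly ascending order regardless of when the thread pool runs the tasks.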



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

