hadoop-hdfs-issues mailing list archives

From "Mukul Kumar Singh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDDS-1485) Ozone writes fail when single threaded client writes 100MB files repeatedly.
Date Thu, 09 May 2019 09:57:00 GMT

    [ https://issues.apache.org/jira/browse/HDDS-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836252#comment-16836252 ]

Mukul Kumar Singh commented on HDDS-1485:
-----------------------------------------

The problem here is that the open-file limit is being reached on the datanode. Container
creation (a directory create) fails once the maximum number of open files is hit, and that in
turn fails the writeChunk with ContainerNotFoundException. We should improve the error logging
for this failure, though.
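The logging improvement could look something like the sketch below: surface the real failure
("Too many open files" / EMFILE) at container-create time instead of letting a later writeChunk
fail with the misleading ContainerNotFoundException. Names here (ContainerDirCreator,
createContainerDir) are illustrative, not actual Ozone datanode code.

{code}
// Illustrative sketch only -- not the actual Ozone datanode code.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ContainerDirCreator {

  /**
   * Creates the on-disk directory for a container, preserving the root
   * cause (e.g. "Too many open files") if creation fails, so the error
   * is reported here rather than surfacing later as ContainerNotFoundException.
   */
  public static Path createContainerDir(Path baseDir, long containerId)
      throws IOException {
    Path containerDir = baseDir.resolve(Long.toString(containerId));
    try {
      return Files.createDirectories(containerDir);
    } catch (IOException e) {
      // Wrap with context instead of swallowing; the caller can log this.
      throw new IOException("Failed to create directory " + containerDir
          + " for container " + containerId
          + " (check the datanode open-file limit): " + e.getMessage(), e);
    }
  }
}
{code}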



> Ozone writes fail when single threaded client writes 100MB files repeatedly. 
> -----------------------------------------------------------------------------
>
>                 Key: HDDS-1485
>                 URL: https://issues.apache.org/jira/browse/HDDS-1485
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Aravindan Vijayan
>            Assignee: Shashikant Banerjee
>            Priority: Blocker
>
> *Environment*
> 26 node physical cluster.
> All Datanodes are up and running.
> A client attempting to write 1600 x 100MB files using the FsStress utility 
> (https://github.com/arp7/FsPerfTest) fails with the following error. 
> {code}
> 19/05/02 09:58:49 ERROR storage.BlockOutputStream: Unexpected Storage Container Exception:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: ContainerID 424 does not exist
>         at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:573)
>         at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:539)
>         at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$2(BlockOutputStream.java:616)
>         at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>         at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>         at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> It looks like a corruption in the container metadata. 
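Given the diagnosis above (file-descriptor exhaustion rather than metadata corruption), a quick
way to confirm on the datanode is to sample the JVM's descriptor usage while the FsStress run is
in progress. A minimal sketch using only JDK APIs; the 90% warning threshold is an arbitrary
illustrative choice, and the cast only works on Unix JVMs:

{code}
// Illustrative sketch: check how close the datanode JVM is to its
// open-file limit. Unix-only (uses com.sun.management).
import java.lang.management.ManagementFactory;
import com.sun.management.UnixOperatingSystemMXBean;

public class FdUsageCheck {
  public static void main(String[] args) {
    UnixOperatingSystemMXBean os =
        (UnixOperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
    long open = os.getOpenFileDescriptorCount();
    long max  = os.getMaxFileDescriptorCount();
    System.out.printf("open fds: %d / %d (%.1f%%)%n",
        open, max, 100.0 * open / max);
    if (open > 0.9 * max) {
      // Near the limit: container creation (mkdir) may start failing
      // with EMFILE, which then surfaces as ContainerNotFoundException.
      System.err.println("WARNING: datanode is close to its open-file limit");
    }
  }
}
{code}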



