hadoop-hdfs-issues mailing list archives

From "Mukul Kumar Singh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDDS-1067) freon run on client gets hung when two of the datanodes are down in 3 datanode cluster
Date Mon, 11 Feb 2019 11:00:00 GMT

    [ https://issues.apache.org/jira/browse/HDDS-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16764860#comment-16764860 ]

Mukul Kumar Singh commented on HDDS-1067:
-----------------------------------------

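The block write is waiting for the put block future to finish. On a Ratis client timeout, the
entry should be removed from the future list. The thread dump below shows the writer thread
parked in BlockOutputStream.waitOnFlushFutures on an untimed CompletableFuture.get().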

{code}
"pool-2-thread-1" #15 prio=5 os_prio=0 tid=0x00007f24a4dea000 nid=0xfb waiting on condition [0x00007f2483dfe000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000e96cc8b0> (a java.util.concurrent.CompletableFuture$Signaller)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
	at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
	at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
	at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.waitOnFlushFutures(BlockOutputStream.java:517)
	at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.close(BlockOutputStream.java:484)
	at org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.close(BlockOutputStreamEntry.java:137)
	at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:489)
	at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:321)
	at org.apache.hadoop.ozone.client.io.KeyOutputStream.write(KeyOutputStream.java:258)
	at org.apache.hadoop.ozone.client.io.OzoneOutputStream.write(OzoneOutputStream.java:49)
	at java.io.OutputStream.write(OutputStream.java:75)
	at org.apache.hadoop.ozone.freon.RandomKeyGenerator$OfflineProcessor.run(RandomKeyGenerator.java:603)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{code}
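
As a rough illustration of the suggestion above (not the actual BlockOutputStream code), the sketch below waits on each pending future with a bounded timeout and drops the entry from the list whether it completes, fails, or times out. The class name FlushFutureSketch, the method waitOnFutures, and the timeout parameter are all hypothetical; how the real client should surface the timeout to the caller is left open here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: names below are illustrative, not taken from Ozone.
final class FlushFutureSketch {

  /**
   * Waits up to timeoutMillis for each pending future, then removes it
   * from the list. Futures that time out or complete exceptionally are
   * cancelled and counted as failed, so the caller never blocks forever.
   */
  static int waitOnFutures(List<CompletableFuture<Void>> pending,
                           long timeoutMillis) throws InterruptedException {
    int failed = 0;
    Iterator<CompletableFuture<Void>> it = pending.iterator();
    while (it.hasNext()) {
      CompletableFuture<Void> future = it.next();
      try {
        // Timed get() instead of the untimed get() seen in the stack trace.
        future.get(timeoutMillis, TimeUnit.MILLISECONDS);
      } catch (TimeoutException | ExecutionException e) {
        future.cancel(true);
        failed++;
      }
      // Remove the entry in every case so it is not waited on again.
      it.remove();
    }
    return failed;
  }

  public static void main(String[] args) throws Exception {
    List<CompletableFuture<Void>> pending = new ArrayList<>();
    pending.add(CompletableFuture.<Void>completedFuture(null));
    // Never completed: stands in for a put block that gets no reply.
    pending.add(new CompletableFuture<Void>());

    int failed = waitOnFutures(pending, 100);
    System.out.println("failed=" + failed + " remaining=" + pending.size());
  }
}
```

With the two futures in main, one already complete and one that never completes, the call returns after roughly the timeout instead of parking indefinitely.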

> freon run on client gets hung when two of the datanodes are down in 3 datanode cluster
> --------------------------------------------------------------------------------------
>
>                 Key: HDDS-1067
>                 URL: https://issues.apache.org/jira/browse/HDDS-1067
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Client
>            Reporter: Nilotpal Nandi
>            Priority: Major
>         Attachments: stack_file.txt
>
>
> steps taken :
> --------------------
>  # created 3 node docker cluster.
>  # wrote a key
>  # created partition such that 2 out of 3 datanodes cannot communicate with any other node.
>  # Third datanode can communicate with scm, om and the client.
>  # ran freon to write key
> Observation :
> -----------------
> freon run is hung. There is no timeout.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

