hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mukul Kumar Singh (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HDDS-1057) get key operation fails when client cannot communicate with 2 of the datanodes in 3 node cluster
Date Mon, 11 Feb 2019 05:49:00 GMT

     [ https://issues.apache.org/jira/browse/HDDS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Mukul Kumar Singh resolved HDDS-1057.
-------------------------------------
    Resolution: Not A Problem

Resolving this as not a problem as the key was assumed to be created using 3 node ratis, however
here it was created using 1 node ratis.

> get key operation fails when client cannot communicate with 2 of the datanodes in 3 node
cluster
> ------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-1057
>                 URL: https://issues.apache.org/jira/browse/HDDS-1057
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Client
>            Reporter: Nilotpal Nandi
>            Assignee: Shashikant Banerjee
>            Priority: Major
>         Attachments: test_client_failure_isolate_two_datanodes_all_docker.log
>
>
> steps taken :
> ------------------
>  # created 3 node docker cluster.
>  # wrote a key
>  # created partition such that 2 out of 3 datanodes cannot communicate with any other
node.
>  # Third datanode can communicate with scm, om and the client.
>  # Tried to read the key
> Exception seen :
> ------------------------
>  
> {noformat}
> Failed to execute command cmdType: GetBlock
> E traceID: "9b3ebd93-e598-4ca2-a6f4-2389f2d35f63"
> E containerID: 22
> E datanodeUuid: "15345663-15c9-4fe3-9b8f-a46123ba8a6e"
> E getBlock {
> E blockID {
> E containerID: 22
> E localID: 101545011736215553
> E blockCommitSequenceId: 5
> E }
> E }
> E on datanode 15345663-15c9-4fe3-9b8f-a46123ba8a6e
> E java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException:
UNAVAILABLE: io exception
> E at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> E at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
> E at org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithRetry(XceiverClientGrpc.java:220)
> E at org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommand(XceiverClientGrpc.java:201)
> E at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.getBlock(ContainerProtocolCalls.java:118)
> E at org.apache.hadoop.ozone.client.io.KeyInputStream.getFromOmKeyInfo(KeyInputStream.java:305)
> E at org.apache.hadoop.ozone.client.rpc.RpcClient.getKey(RpcClient.java:608)
> E at org.apache.hadoop.ozone.client.OzoneBucket.readKey(OzoneBucket.java:284)
> E at org.apache.hadoop.ozone.web.ozShell.keys.GetKeyHandler.call(GetKeyHandler.java:95)
> E at org.apache.hadoop.ozone.web.ozShell.keys.GetKeyHandler.call(GetKeyHandler.java:48)
> E at picocli.CommandLine.execute(CommandLine.java:919)
> E at picocli.CommandLine.access$700(CommandLine.java:104)
> E at picocli.CommandLine$RunLast.handle(CommandLine.java:1083)
> E at picocli.CommandLine$RunLast.handle(CommandLine.java:1051)
> E at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:959)
> E at picocli.CommandLine.parseWithHandlers(CommandLine.java:1242)
> E at picocli.CommandLine.parseWithHandler(CommandLine.java:1181)
> E at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:61)
> E at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:52)
> E at org.apache.hadoop.ozone.web.ozShell.Shell.main(Shell.java:83)
> E Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE:
io exception
> E at org.apache.ratis.thirdparty.io.grpc.Status.asRuntimeException(Status.java:526)
> E at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:434)
> E at org.apache.ratis.thirdparty.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
> E at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
> E at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
> E at org.apache.ratis.thirdparty.io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:678)
> E at org.apache.ratis.thirdparty.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
> E at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
> E at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
> E at org.apache.ratis.thirdparty.io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:397)
> E at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:459)
> E at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:63)
> E at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:546)
> E at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:467)
> E at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:584)
> E at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
> E at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
> E at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> E at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> E at java.lang.Thread.run(Thread.java:748)
> E Caused by: org.apache.ratis.thirdparty.io.netty.channel.ConnectTimeoutException: connection
timed out: /172.20.0.7:9859
> E at org.apache.ratis.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:267)
> E at org.apache.ratis.thirdparty.io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
> E at org.apache.ratis.thirdparty.io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:127)
> E at org.apache.ratis.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
> E at org.apache.ratis.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
> E at org.apache.ratis.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462)
> E at org.apache.ratis.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
> E at org.apache.ratis.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> E ... 1 more
> E Failed to execute command cmdType: GetBlock]
>  
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message