hadoop-hdfs-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HDDS-2107) Datanodes should retry forever to connect to SCM in an unsecure environment
Date Mon, 16 Sep 2019 19:59:00 GMT

     [ https://issues.apache.org/jira/browse/HDDS-2107?focusedWorklogId=313266&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-313266 ]

ASF GitHub Bot logged work on HDDS-2107:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Sep/19 19:58
            Start Date: 16/Sep/19 19:58
    Worklog Time Spent: 10m 
      Work Description: hanishakoneru commented on pull request #1424: HDDS-2107. Datanodes should retry forever to connect to SCM in an…
URL: https://github.com/apache/hadoop/pull/1424
 
 
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 313266)
    Time Spent: 1.5h  (was: 1h 20m)

> Datanodes should retry forever to connect to SCM in an unsecure environment
> ---------------------------------------------------------------------------
>
>                 Key: HDDS-2107
>                 URL: https://issues.apache.org/jira/browse/HDDS-2107
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 0.4.1
>            Reporter: Vivek Ratnavel Subramanian
>            Assignee: Vivek Ratnavel Subramanian
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In an unsecure environment, the datanodes retry up to 10 times, waiting 1000 milliseconds between attempts, and then throw this error:
> {code:java}
> Unable to communicate to SCM server at scm:9861 for past 0 seconds.
> java.net.ConnectException: Call From scm/10.65.36.118 to scm:9861 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> 	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
> 	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:755)
> 	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1515)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1457)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1367)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> 	at com.sun.proxy.$Proxy33.getVersion(Unknown Source)
> 	at org.apache.hadoop.ozone.protocolPB.StorageContainerDatanodeProtocolClientSideTranslatorPB.getVersion(StorageContainerDatanodeProtocolClientSideTranslatorPB.java:112)
> 	at org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:70)
> 	at org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:42)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> 	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> 	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
> 	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:690)
> 	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:794)
> 	at org.apache.hadoop.ipc.Client$Connection.access$3700(Client.java:411)
> 	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1572)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1403)
> 	... 13 more
> {code}
> The datanodes should try forever to connect with SCM and not throw any errors.
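
The requested behavior can be illustrated with a minimal sketch, assuming a fixed delay between attempts. The names RetryForever and callWithRetry are hypothetical; they are not Ozone or Hadoop APIs, and this is not the change made in pull request #1424. The sketch retries a call until it succeeds, logging each failure instead of rethrowing it:

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;

// Hypothetical helper for illustration only; not part of Ozone or Hadoop.
public class RetryForever {

    /**
     * Calls the task until it returns normally. After each failed attempt
     * the failure is logged and the thread sleeps for delayMillis before
     * trying again. Only an interrupt stops the loop.
     */
    public static <T> T callWithRetry(Callable<T> task, long delayMillis)
            throws InterruptedException {
        long attempt = 0;
        while (true) {
            attempt++;
            try {
                return task.call();
            } catch (InterruptedException e) {
                // Let shutdown requests through instead of retrying.
                throw e;
            } catch (Exception e) {
                System.err.println("Attempt " + attempt + " failed: " + e
                        + "; retrying in " + delayMillis + " ms");
                Thread.sleep(delayMillis);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated endpoint that refuses the first two connections.
        int[] calls = {0};
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new ConnectException("Connection refused");
            }
            return "connected";
        }, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In a real datanode the task would be the SCM version call and the delay would come from configuration; both are left abstract here.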



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

