hadoop-hdfs-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDDS-1700) RPC Payload too large on datanode startup in kubernetes
Date Thu, 20 Jun 2019 21:38:00 GMT

    [ https://issues.apache.org/jira/browse/HDDS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868948#comment-16868948 ]

Anu Engineer commented on HDDS-1700:
------------------------------------

Thank you for root-causing this issue; I appreciate the effort. It would be nice if you could
vote when the 0.4.1 release comes up, since you will already have tested the yet-to-be-released
k8s packages.

> RPC Payload too large on datanode startup in kubernetes
> -------------------------------------------------------
>
>                 Key: HDDS-1700
>                 URL: https://issues.apache.org/jira/browse/HDDS-1700
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: docker, Ozone Datanode, SCM
>    Affects Versions: 0.4.0
>         Environment: datanode pod's ozone-site.xml
> {code:java}
> <configuration>
> <property><name>ozone.scm.block.client.address</name><value>ozone-managers-service:9876</value></property>
> <property><name>ozone.enabled</name><value>True</value></property>
> <property><name>ozone.scm.datanode.id</name><value>/tmp/datanode.id</value></property>
> <property><name>ozone.scm.client.address</name><value>ozone-managers-service:9876</value></property>
> <property><name>ozone.metadata.dirs</name><value>/tmp/metadata</value></property>
> <property><name>ozone.scm.names</name><value>ozone-managers-service:9876</value></property>
> <property><name>ozone.om.address</name><value>ozone-managers-service:9874</value></property>
> <property><name>ozone.handler.type</name><value>distributed</value></property>
> <property><name>ozone.scm.datanode.address</name><value>ozone-managers-service:9876</value></property>
> </configuration>
> {code}
> OM/SCM pod's ozone-site.xml
> {code:java}
> <configuration>
> <property><name>ozone.scm.block.client.address</name><value>localhost</value></property>
> <property><name>ozone.enabled</name><value>True</value></property>
> <property><name>ozone.scm.datanode.id</name><value>/tmp/datanode.id</value></property>
> <property><name>ozone.scm.client.address</name><value>localhost</value></property>
> <property><name>ozone.metadata.dirs</name><value>/tmp/metadata</value></property>
> <property><name>ozone.scm.names</name><value>localhost</value></property>
> <property><name>ozone.om.address</name><value>localhost</value></property>
> <property><name>ozone.handler.type</name><value>distributed</value></property>
> <property><name>ozone.scm.datanode.address</name><value>localhost</value></property>
> </configuration>
> {code}
>  
>  
>            Reporter: Josh Siegel
>            Priority: Minor
>
> When starting the datanode in a different Kubernetes pod from the SCM and OM, the error below
appears in the datanode's {{ozone.log}}. We verified basic connectivity between the
datanode pod and the OM/SCM pod.
> {code:java}
> 2019-06-17 17:14:16,449 [Datanode State Machine Thread - 0] ERROR (EndpointStateMachine.java:207)
- Unable to communicate to SCM server at ozone-managers-service:9876 for past 31800 seconds.
> java.io.IOException: Failed on local exception: org.apache.hadoop.ipc.RpcException: RPC
response exceeds maximum data length; Host Details : local host is: "ozone-datanode/10.244.84.187";
destination host is: "ozone-managers-service":9876;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:816)
> at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1515)
> at org.apache.hadoop.ipc.Client.call(Client.java:1457)
> at org.apache.hadoop.ipc.Client.call(Client.java:1367)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> at com.sun.proxy.$Proxy88.getVersion(Unknown Source)
> at org.apache.hadoop.ozone.protocolPB.StorageContainerDatanodeProtocolClientSideTranslatorPB.getVersion(StorageContainerDatanodeProtocolClientSideTranslatorPB.java:112)
> at org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:70)
> at org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:42)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length
> at org.apache.hadoop.ipc.Client$IpcStreams.readResponse(Client.java:1830)
> at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1173)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1069){code}
>  
> cc [~slamendola2_bloomberg]
> [~anu]
> [~elek]
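
For readers hitting the same exception: the Hadoop IPC client reads a 4-byte big-endian length prefix from the first response and rejects it if it exceeds {{ipc.maximum.response.length}}. One common way to trigger "RPC response exceeds maximum data length" is to point an RPC client at a port that speaks another protocol, such as HTTP. This thread does not state the root cause, so treat the sketch below as an illustration of that mechanism only. The class and method names are hypothetical, and the 128 MB limit is the assumed default. It may be worth checking whether port 9876 is an HTTP port rather than the SCM datanode RPC port in your deployment.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Hadoop or Ozone.
public class LengthPrefixDemo {
    // Assumed default of ipc.maximum.response.length (128 MB).
    static final int MAX_RESPONSE_LENGTH = 128 * 1024 * 1024;

    // Interpret the first four bytes as a big-endian int, the same byte
    // order DataInputStream.readInt() uses for a length prefix.
    static int decodeLengthPrefix(byte[] firstBytes) {
        return ByteBuffer.wrap(firstBytes, 0, 4).getInt();
    }

    // True when the decoded length is larger than the assumed limit.
    static boolean exceedsLimit(int length) {
        return length > MAX_RESPONSE_LENGTH;
    }

    public static void main(String[] args) {
        // What an HTTP server might send back to a non-HTTP client.
        byte[] reply = "HTTP/1.1 400 Bad Request\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        int length = decodeLengthPrefix(reply);
        System.out.println("decoded length = " + length);
        System.out.println("exceeds limit  = " + exceedsLimit(length));
    }
}
```

The bytes "HTTP" decode to 0x48545450, far above the assumed limit, which is why a reachable but wrong port can pass a basic connectivity check and still fail at the first RPC call.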



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

