hadoop-hdfs-dev mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HDDS-1700) RPC Payload too large on datanode startup in kubernetes
Date Thu, 20 Jun 2019 21:38:00 GMT

     [ https://issues.apache.org/jira/browse/HDDS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anu Engineer resolved HDDS-1700.
--------------------------------
    Resolution: Cannot Reproduce

> RPC Payload too large on datanode startup in kubernetes
> -------------------------------------------------------
>
>                 Key: HDDS-1700
>                 URL: https://issues.apache.org/jira/browse/HDDS-1700
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: docker, Ozone Datanode, SCM
>    Affects Versions: 0.4.0
>         Environment: datanode pod's ozone-site.xml
> {code:java}
> <configuration>
> <property><name>ozone.scm.block.client.address</name><value>ozone-managers-service:9876</value></property>
> <property><name>ozone.enabled</name><value>True</value></property>
> <property><name>ozone.scm.datanode.id</name><value>/tmp/datanode.id</value></property>
> <property><name>ozone.scm.client.address</name><value>ozone-managers-service:9876</value></property>
> <property><name>ozone.metadata.dirs</name><value>/tmp/metadata</value></property>
> <property><name>ozone.scm.names</name><value>ozone-managers-service:9876</value></property>
> <property><name>ozone.om.address</name><value>ozone-managers-service:9874</value></property>
> <property><name>ozone.handler.type</name><value>distributed</value></property>
> <property><name>ozone.scm.datanode.address</name><value>ozone-managers-service:9876</value></property>
> </configuration>
> {code}
> OM/SCM pod's ozone-site.xml
> {code:java}
> <configuration>
> <property><name>ozone.scm.block.client.address</name><value>localhost</value></property>
> <property><name>ozone.enabled</name><value>True</value></property>
> <property><name>ozone.scm.datanode.id</name><value>/tmp/datanode.id</value></property>
> <property><name>ozone.scm.client.address</name><value>localhost</value></property>
> <property><name>ozone.metadata.dirs</name><value>/tmp/metadata</value></property>
> <property><name>ozone.scm.names</name><value>localhost</value></property>
> <property><name>ozone.om.address</name><value>localhost</value></property>
> <property><name>ozone.handler.type</name><value>distributed</value></property>
> <property><name>ozone.scm.datanode.address</name><value>localhost</value></property>
> </configuration>
> {code}
>  
>  
>            Reporter: Josh Siegel
>            Priority: Minor
>
> When starting the datanode in a separate Kubernetes pod from the SCM and OM, the below error appears in the datanode's {{ozone.log}}. We verified basic connectivity between the datanode pod and the OM/SCM pod.
> {code:java}
> 2019-06-17 17:14:16,449 [Datanode State Machine Thread - 0] ERROR (EndpointStateMachine.java:207) - Unable to communicate to SCM server at ozone-managers-service:9876 for past 31800 seconds.
> java.io.IOException: Failed on local exception: org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length; Host Details : local host is: "ozone-datanode/10.244.84.187"; destination host is: "ozone-managers-service":9876;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:816)
> at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1515)
> at org.apache.hadoop.ipc.Client.call(Client.java:1457)
> at org.apache.hadoop.ipc.Client.call(Client.java:1367)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> at com.sun.proxy.$Proxy88.getVersion(Unknown Source)
> at org.apache.hadoop.ozone.protocolPB.StorageContainerDatanodeProtocolClientSideTranslatorPB.getVersion(StorageContainerDatanodeProtocolClientSideTranslatorPB.java:112)
> at org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:70)
> at org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:42)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length
> at org.apache.hadoop.ipc.Client$IpcStreams.readResponse(Client.java:1830)
> at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1173)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1069){code}
>  
> cc [~slamendola2_bloomberg]
> [~anu]
> [~elek]
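
One hedged observation on the configuration quoted above (not verified against this cluster): in Ozone 0.4.0 the defaults put the SCM web UI on port 9876 and the OM web UI on port 9874, while the RPC endpoints default to 9860 (ozone.scm.client.address), 9861 (ozone.scm.datanode.address), 9863 (ozone.scm.block.client.address), and 9862 (ozone.om.address). "RPC response exceeds maximum data length" is the classic symptom of a Hadoop RPC client reading an HTTP response, so the datanode may simply be dialing the managers' HTTP ports. A sketch of the datanode-side addresses with the default RPC ports (hostname kept from the issue; the port values are assumptions taken from the 0.4.0 defaults):

{code:xml}
<configuration>
  <!-- Assumed default RPC ports; verify against ozone-default.xml for your release -->
  <property><name>ozone.scm.client.address</name><value>ozone-managers-service:9860</value></property>
  <property><name>ozone.scm.datanode.address</name><value>ozone-managers-service:9861</value></property>
  <property><name>ozone.scm.block.client.address</name><value>ozone-managers-service:9863</value></property>
  <property><name>ozone.om.address</name><value>ozone-managers-service:9862</value></property>
  <property><name>ozone.scm.names</name><value>ozone-managers-service</value></property>
</configuration>
{code}

Raising ipc.maximum.data.length in core-site.xml (default 67108864 bytes) would also suppress the exception, but if the addresses really do point at the HTTP ports, that would only mask the mismatch rather than fix it.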



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org

