hadoop-hdfs-issues mailing list archives

From "Biju Nair (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-7810) Datanode registration process fails in hadoop 2.6
Date Wed, 18 Feb 2015 19:54:11 GMT

     [ https://issues.apache.org/jira/browse/HDFS-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Biju Nair updated HDFS-7810:
----------------------------
    Description: 
When a new DN is added to the cluster, the registration process fails. The following are the
steps followed.

- Install and start a new DN
- Add entry for the DN in the NN {{/etc/hosts}} file

DN log shows that the registration process failed

- Tried to restart DN with the same result

Since all the DNs have multiple network interfaces, we use the following {{hdfs-site.xml}}
property instead of listing all the {{dfs.datanode.xx.address}} properties.

{code:xml}
  <property>
    <name>dfs.datanode.dns.interface</name>
    <value>eth2</value>
  </property>
{code}

- Restarting the NN resolves the registration issue, but an NN restart is not desirable.
- Adding the following {{dfs.datanode.xx.address}} properties seems to resolve DN registration
without an NN restart, but this behavior differs from *hadoop 2.2*. Is there a reason for
the change?

{code:xml}
  <property>
    <name>dfs.datanode.address</name>
    <value>192.168.0.12:50010</value>
  </property>

  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>192.168.0.12:50020</value>
  </property>

  <property>
    <name>dfs.datanode.http.address</name>
    <value>192.168.0.12:50075</value>
  </property>
{code}

*NN Log Error Entry*
{{
2015-02-17 12:21:53,583 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 8020, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 192.168.100.13:37516 Call#1027 Retry#0
org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=192.168.100.13, hostname=192.168.100.13): DatanodeRegistration(0.0.0.0, datanodeUuid=bd23eb3c-a5b9-43e4-ad23-1683346564ac, infoPort=50075, ipcPort=50020, storageInfo=lv=-56;cid=CID-02099252-fbca-4bf2-b466-9a0ed67e53a3;nsid=2048643132;c=0)
    at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:887)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:5002)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1065)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:92)
    at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26378)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
2015-02-17 12:21:58,607 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=192.168.100.13, hostname=192.168.100.13)
}}
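
One setting that may be relevant to the rejection logged above, offered as a pointer rather than a diagnosis: the {{hostname cannot be resolved}} message appears when the NN cannot map the DN's IP address back to a hostname (note {{hostname=192.168.100.13}} equals the IP), and {{hdfs-default.xml}} documents {{dfs.namenode.datanode.registration.ip-hostname-check}} (default {{true}}) as the switch for that check. A minimal NN-side sketch, assuming the cluster can tolerate registrations from unresolved addresses. Whether this matches the *hadoop 2.2* behavior is not confirmed here.

{code:xml}
  <!-- Assumption: placed in hdfs-site.xml on the NN; the value is likely read at NN startup -->
  <property>
    <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
    <value>false</value>
  </property>
{code}

The property description recommends leaving the check enabled wherever reverse lookup is available, because it guards against accidentally registering DNs that are listed by hostname in the excludes file during a DNS outage.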

*DN Log Error Entry*
{{
2015-02-17 12:21:02,994 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1782713777-10.0.100.11-1424188575377 (Datanode Uuid null) service to f-bcpc-vm1/192.168.100.11:8020 beginning handshake with NN
2015-02-17 12:21:03,006 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool BP-1782713777-10.0.100.11-1424188575377 (Datanode Uuid null) service to f-bcpc-vm1/192.168.100.11:8020 Datanode denied communication with namenode because hostname cannot be resolved (ip=192.168.100.13, hostname=192.168.100.13): DatanodeRegistration(0.0.0.0, datanodeUuid=bd23eb3c-a5b9-43e4-ad23-1683346564ac, infoPort=50075, ipcPort=50020, storageInfo=lv=-56;cid=CID-02099252-fbca-4bf2-b466-9a0ed67e53a3;nsid=2048643132;c=0)
    at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:887)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:5002)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1065)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:92)
    at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26378)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
}}

  was:
When a new DN is added to the cluster, the registration process fails. The following are the
steps followed.

- Install and start a new DN
- Add entry for the DN in the NN {{/etc/hosts}} file

DN log shows that the registration process failed

- Tried to restart DN with the same result

Since all the DNs have multiple network interfaces, we use the following {{hdfs-site.xml}}
property instead of listing all the {{dfs.datanode.xx.address}} properties.

{code:xml}
  <property>
    <name>dfs.datanode.dns.interface</name>
    <value>eth2</value>
  </property>
{code}

- Restarting the NN resolves the registration issue, but an NN restart is not desirable.
- Adding the {{dfs.datanode.xx.address}} properties seems to resolve DN registration without
an NN restart, but this behavior differs from *hadoop 2.2*. Is there a reason for the
change?


> Datanode registration process fails in hadoop 2.6 
> --------------------------------------------------
>
>                 Key: HDFS-7810
>                 URL: https://issues.apache.org/jira/browse/HDFS-7810
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, namenode
>    Affects Versions: 2.6.0
>         Environment: ubuntu 12
>            Reporter: Biju Nair
>              Labels: hadoop

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
