hadoop-mapreduce-user mailing list archives

From Gagan Brahmi <gaganbra...@gmail.com>
Subject Re: Zookeeper failing to start and i don't know why....
Date Tue, 06 Dec 2016 17:11:28 GMT
It looks like the /etc/hosts file on tc1 isn't configured correctly. Make sure
the hosts file on tc1 also has an entry for tc1 itself, not just for the other nodes.
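
One quick way to check is to resolve every host in the connect string from tc1
itself. Below is a minimal sketch in Python, not anything shipped with Hadoop or
ZooKeeper; the function names (split_connect_string, stray_whitespace_hosts,
check_hosts) are made up for illustration. The example string in the note after
the block is copied from the "connectString=" log line in your message,
including the space that the log prints before tc1. If that space really is part
of the configured value (I'm assuming it would come from ha.zookeeper.quorum in
core-site.xml), the name being looked up would be " tc1" rather than "tc1", so
the sketch flags stray whitespace before attempting a lookup.

```python
import socket


def split_connect_string(connect_string):
    """Split 'host:port,host:port' into (host, port) pairs without trimming."""
    pairs = []
    for entry in connect_string.split(","):
        # rpartition keeps everything before the last ':' as the host,
        # including any whitespace that was typed into the config value.
        host, _, port = entry.rpartition(":")
        pairs.append((host, int(port)))
    return pairs


def stray_whitespace_hosts(connect_string):
    """Return the hosts that carry leading or trailing whitespace."""
    return [host for host, _ in split_connect_string(connect_string)
            if host != host.strip()]


def check_hosts(connect_string):
    """Print, for each host, whether it resolves on this machine."""
    padded = set(stray_whitespace_hosts(connect_string))
    for host, port in split_connect_string(connect_string):
        if host in padded:
            print(f"{host!r}: has leading or trailing whitespace")
            continue
        try:
            infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
            addresses = sorted({info[4][0] for info in infos})
            print(f"{host!r}: resolves to {addresses}")
        except socket.gaierror as exc:
            print(f"{host!r}: does not resolve ({exc})")
```

Running check_hosts(" tc1:2181,tc2:2181,tc3:2181") on tc1 would show which of the
three names the local resolver can actually look up, and whether any of them
carries extra whitespace.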


Regards,
Gagan Brahmi

On Tue, Dec 6, 2016 at 9:53 AM, Michael Garcia <Michael.Garcia@alcatelonetouch.com> wrote:

> All,
> I'm having a problem with ZooKeeper starting up. It looks like a hostname
> resolution problem.
> I have /etc/hosts configured correctly, and passwordless ssh is working.
>
> I have Hadoop 2.7.3, Java 1.8 u111, and ZooKeeper 3.4.6.
>
> I have 6 systems set up: tc1, tc2, tc3, tc4, tc5, tc6.
> tc1 is the active namenode
> tc2 is the passive namenode
> tc3 through tc6 are data nodes.
> tc1, tc2 and tc3 are the Journal Nodes.
>
> Here is the tail end of the log when I try to start ZooKeeper on tc1:
>
> 2016-12-06 00:59:27,214 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/data/tools/repository/hadoop-2.7.3/lib/native
> 2016-12-06 00:59:27,214 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
> 2016-12-06 00:59:27,214 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
> 2016-12-06 00:59:27,215 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
> 2016-12-06 00:59:27,215 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
> 2016-12-06 00:59:27,215 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=4.8.11-1.el7.elrepo.x86_64
> 2016-12-06 00:59:27,215 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=iot-user
> 2016-12-06 00:59:27,215 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/iot-user
> 2016-12-06 00:59:27,215 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/data/tools/repository/hadoop-2.7.3
> 2016-12-06 00:59:27,216 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString= tc1:2181,tc2:2181,tc3:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@223d2c72
> 2016-12-06 00:59:27,230 FATAL org.apache.hadoop.hdfs.tools.DFSZKFailoverController: Got a fatal error, exiting now
> java.net.UnknownHostException:  tc1: Name or service not known
> 	at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
> 	at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
> 	at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
> 	at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
> 	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
> 	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
> 	at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61)
> 	at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
> 	at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380)
> 	at org.apache.hadoop.ha.ActiveStandbyElector.getNewZooKeeper(ActiveStandbyElector.java:631)
> 	at org.apache.hadoop.ha.ActiveStandbyElector.createConnection(ActiveStandbyElector.java:775)
> 	at org.apache.hadoop.ha.ActiveStandbyElector.<init>(ActiveStandbyElector.java:229)
> 	at org.apache.hadoop.ha.ZKFailoverController.initZK(ZKFailoverController.java:351)
> 	at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:191)
>
>
> It looks like hostname resolution is working:
>
> [iot-user@tc1 ~]$ hostname
> tc1
> [iot-user@tc1 ~]$ ssh tc2
> Last login: Mon Dec  5 22:31:21 2016 from 172.31.61.165
>
> [iot-user@tc2 ~]$ exit
> logout
> Connection to tc2 closed.
> [iot-user@tc1 ~]$ ssh tc3
> Last login: Mon Dec  5 23:06:06 2016 from 172.31.61.165
>
> [iot-user@tc3 ~]$ exit
> logout
> Connection to tc3 closed.
> [iot-user@tc1 ~]$ arp tc1
> tc1 (172.31.61.165) -- no entry
>
> [iot-user@tc1 repository]$ jps
> 2096 NameNode
> 10548 Jps
> 2342 JournalNode
> [iot-user@tc1 repository]$ hdfs haadmin -getServiceState tc1
> active
> [iot-user@tc1 repository]$ hdfs haadmin -getServiceState tc2
> standby
> [iot-user@tc1 repository]$
>
> Has anyone seen this error?  What am I doing wrong?
>
> *Michael Garcia*
> *Cloud Operations Engineer | North America*
> *Mobile +1 949 664 1431*
> *7310 Miramar Road, Suite 440, San Diego, CA 92126*
