hama-dev mailing list archives

From Minho Kim <mi...@apache.org>
Subject Re: Need to increase the default number of connections to zookeeper
Date Fri, 10 Jul 2015 00:41:51 GMT
Okay, Thanks Edward.

Best Regards,
Minho Kim

2015-07-08 14:21 GMT+09:00 Edward J. Yoon <edwardyoon@apache.org>:

> Just FYI,
>
> I just committed the change below:
>
> Index: conf/hama-default.xml
> ===================================================================
> --- conf/hama-default.xml (revision 1689791)
> +++ conf/hama-default.xml (working copy)
> @@ -262,7 +262,7 @@
>    </property>
>    <property>
>      <name>hama.zookeeper.property.maxClientCnxns</name>
> -    <value>30</value>
> +    <value>100</value>
>      <description>Property from ZooKeeper's config zoo.cfg.
>      Limit on number of concurrent connections (at the socket level) that a
>
>
> On Tue, Jul 7, 2015 at 9:17 PM, Minho Kim <minho@apache.org> wrote:
> > Oops,
> > I made a mistake. Edward is right. Each node has 192G RAM.
> >
> > Thanks,
> > Minho Kim
> >
> > 2015-07-07 19:50 GMT+09:00 Edward J. Yoon <edwardyoon@apache.org>:
> >
> >> > - 8 GB RAM
> >>
> >> I guess it looks like a typo, Minho. :-) AFAIK, each node has 192GB memory.
> >>
> >> +1 we need to increase the default maxClientCnxns since modern
> >> machines have enough RAM.
> >>
> >> On Tue, Jul 7, 2015 at 7:13 PM, 김민호 <minwise.kim@samsung.com> wrote:
> >> > Hi all,
> >> >
> >> >
> >> >
> >> > Recently, I set up a Hama cluster on 2 machines.
> >> >
> >> > Each machine's specification is as follows:
> >> >
> >> > - 8 GB RAM
> >> >
> >> > - 12 TB HDD
> >> >
> >> > - (I don’t remember CPU spec.)
> >> >
> >> >
> >> >
> >> > To run Hama jobs, I set bsp.tasks.maximum=40 and
> >> > bsp.child.java.opts=-Xmx4096m in hama-site.xml (other settings omitted).
> >> >
> >> > I then ran the Pi Estimator and FastGraphGen examples, but got the
> >> > errors below.
> >> >
> >> >
> >> >
> >> > attempt_201507071627_0001_000023_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp/job_201507071627_0001/peers/cluster-0:61029
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZKSyncClient.isExists(ZKSyncClient.java:108)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:261)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.registerTask(ZooKeeperSyncClientImpl.java:279)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.register(ZooKeeperSyncClientImpl.java:261)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.BSPPeerImpl.initializeSyncService(BSPPeerImpl.java:305)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:185)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1251)
> >> >
> >> > attempt_201507071627_0001_000023_0: 15/07/07 16:27:40 ERROR sync.ZKSyncClient: Error creating zk path /bsp/job_201507071627_0001/peers/cluster-0:61029
> >> > attempt_201507071627_0001_000023_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZKSyncClient.createZnode(ZKSyncClient.java:135)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:281)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.registerTask(ZooKeeperSyncClientImpl.java:279)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.register(ZooKeeperSyncClientImpl.java:261)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.BSPPeerImpl.initializeSyncService(BSPPeerImpl.java:305)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:185)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1251)
> >> >
> >> > attempt_201507071627_0001_000023_0: 15/07/07 16:27:42 ERROR sync.ZKSyncClient: Error checking zk path /bsp/job_201507071627_0001/sync/-1
> >> > attempt_201507071627_0001_000023_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp/job_201507071627_0001/sync/-1
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZKSyncClient.isExists(ZKSyncClient.java:108)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:261)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.enterBarrier(ZooKeeperSyncClientImpl.java:100)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.BSPPeerImpl.doFirstSync(BSPPeerImpl.java:312)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:238)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1251)
> >> >
> >> > attempt_201507071627_0001_000023_0: 15/07/07 16:27:44 ERROR sync.ZKSyncClient: Error creating zk path /bsp/job_201507071627_0001/sync/-1
> >> > attempt_201507071627_0001_000023_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZKSyncClient.createZnode(ZKSyncClient.java:135)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:281)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.enterBarrier(ZooKeeperSyncClientImpl.java:100)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.BSPPeerImpl.doFirstSync(BSPPeerImpl.java:312)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:238)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1251)
> >> >
> >> > attempt_201507071627_0001_000023_0: 15/07/07 16:27:46 FATAL bsp.GroomServer: SyncError from child
> >> > attempt_201507071627_0001_000023_0: org.apache.hama.bsp.sync.SyncException
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.enterBarrier(ZooKeeperSyncClientImpl.java:138)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.BSPPeerImpl.doFirstSync(BSPPeerImpl.java:312)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:238)
> >> > attempt_201507071627_0001_000023_0:      at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1251)
> >> >
> >> > 15/07/07 16:27:48 INFO bsp.BSPJobClient: Job failed.
> >> >
> >> >
> >> >
> >> > This is a ZooKeeper error: the Hama tasks try to get the /bsp node from
> >> > ZooKeeper and fail.
> >> >
> >> > The cause is that hama.zookeeper.property.maxClientCnxns is set to 30 in
> >> > hama-default.xml.
> >> >
> >> > The problem occurs when the maximum number of tasks is larger than that
> >> > value.
> >> >
> >> > To solve it, Hama provides a setting to increase the number of
> >> > connections to ZooKeeper:
> >> >
> >> >
> >> >
> >> > <property>
> >> >
> >> >     <name>hama.zookeeper.property.maxClientCnxns</name>
> >> >
> >> >     <value>100</value>
> >> >
> >> > </property>
> >> >
> >> >
> >> >
> >> > So we should raise the default number of connections to more than 100,
> >> > since servers have become much more powerful than before.
> >> >
> >> > If you agree, I will change the default value to 300.
> >> >
> >> >
> >> >
> >> > Best regards,
> >> >
> >> > Minho Kim
> >> >
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Best Regards, Edward J. Yoon
> >>
>
>
>
> --
> Best Regards, Edward J. Yoon
>
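
The failure discussed in this thread comes down to a simple comparison: ZooKeeper's maxClientCnxns caps concurrent connections from a single client IP, and bsp.tasks.maximum=40 puts more tasks on one node than the old default of 30 allows. The following is a minimal sketch of that check, not part of Hama. The class ZkConnectionCheck and its methods are hypothetical names, and it assumes each task opens one ZooKeeper connection from its node's IP (the groom server may hold additional connections). It uses only the JDK's XML parser on a Hadoop-style configuration string.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; this class does not exist in Hama.
public class ZkConnectionCheck {

  /** Reads name/value pairs from a Hadoop-style <configuration> XML string. */
  static Map<String, String> parseProperties(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      Map<String, String> props = new HashMap<>();
      NodeList nodes = doc.getElementsByTagName("property");
      for (int i = 0; i < nodes.getLength(); i++) {
        Element p = (Element) nodes.item(i);
        String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
        String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
        props.put(name, value);
      }
      return props;
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse configuration XML", e);
    }
  }

  /**
   * Assumption: every task on a node opens one ZooKeeper connection from the
   * same IP. In ZooKeeper, a maxClientCnxns of 0 disables the limit.
   */
  static boolean exceedsLimit(int tasksPerNode, int maxClientCnxns) {
    return maxClientCnxns > 0 && tasksPerNode > maxClientCnxns;
  }

  public static void main(String[] args) {
    // Values taken from the thread: 40 tasks per node, old default limit of 30.
    String xml =
        "<configuration>"
            + "<property><name>bsp.tasks.maximum</name><value>40</value></property>"
            + "<property><name>hama.zookeeper.property.maxClientCnxns</name>"
            + "<value>30</value></property>"
            + "</configuration>";
    Map<String, String> props = parseProperties(xml);
    int tasks = Integer.parseInt(props.get("bsp.tasks.maximum"));
    int limit = Integer.parseInt(props.get("hama.zookeeper.property.maxClientCnxns"));
    System.out.println(
        "tasks=" + tasks + ", limit=" + limit + ", exceeds=" + exceedsLimit(tasks, limit));
  }
}
```

With the values from the original report the check flags the configuration. With the committed default of 100 it would not, although a node running more than 100 tasks would presumably hit the same limit again.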
