incubator-flume-user mailing list archives

From Jonathan Hsieh <...@cloudera.com>
Subject Re: Multiple Flume Masters on EC2
Date Fri, 19 Aug 2011 07:12:29 GMT
Steve,

Please note that multi-master support is still at an early stage, so we
haven't ironed out all the problems here yet.

Have you tried restarting the masters that don't connect?  Does it
eventually succeed?
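
Since the quoted logs show a SocketTimeoutException and "Cannot open channel
to 2 at election address /10.2.31.65:3183", it may also be worth checking
plain TCP reachability between the masters. Below is a minimal sketch, not
part of Flume: the addresses and ports are copied from the ZooKeeper
configuration line in the logs, and the helper names (can_connect, check_all)
are made up for illustration. A port only answers while the master on that
host is running, and on EC2 the security group also has to allow the traffic.

```python
import socket

# Addresses and ports copied from the ZooKeeper configuration line in the quoted logs.
MASTERS = ["10.192.122.191", "10.254.23.16", "10.2.31.65"]
PORTS = [3181, 3182, 3183]  # client port, quorum port, election port


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_all(hosts=MASTERS, ports=PORTS, timeout=3.0):
    """Map each (host, port) pair to whether it accepted a connection."""
    return {(h, p): can_connect(h, p, timeout) for h in hosts for p in ports}
```

Calling check_all() on each master in turn and printing the result shows
which host/port pairs are unreachable from that machine. If 3182 or 3183 is
blocked in either direction, the ensemble cannot elect a leader.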

Jon.

On Thu, Aug 4, 2011 at 3:00 PM, Jonathan Hsieh <jon@cloudera.com> wrote:

> Forwarding direct response to flume-user@incubator.apache.org and
> cdh-user@cloudera.org.
>
> ---------- Forwarded message ----------
> From: flume collector <flume@collector.org>
> Date: Wed, Aug 3, 2011 at 5:56 AM
> Subject: Multiple Flume Masters on EC2
> To: jon@cloudera.com
>
>
> Jon,
>
> Thanks for replying to
> https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/6a8c44a782518a88#.
> I wasn't able to reply in the Google group since this email is not a Google
> account.
> Just to follow up though, I added the internal IP addresses and there
> are/were no extra spaces in the flume.master.servers:
>
>    <property>
>     <name>flume.master.servers</name>
>     <value>10.192.122.191,10.254.23.16,10.2.31.65</value>
>     <description>This is the address for the config servers status
>     server (http) </description>
>   </property>
>
> Also, each master has its own master server ID, e.g.:
>
>   <property>
>     <name>flume.master.serverid</name>
>     <value>2</value>
>     <description>The unique identifier for a machine in a
>       Flume Master ensemble. Must be different on every
>       master instance.</description>
>   </property>
>
> Not sure how EC2 sets up their machines; there may be multiple NICs on
> them.
>
> Below are the errors I get with the IP addresses as the values within
> flume.master.servers:
>
> Thanks,
> -Steve
>
>
> Master 1:
>
> java.net.SocketTimeoutException
>     at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:109)
>     at
> org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)
>     at
> org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)
>     at
> org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)
>     at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)
> 2011-08-03 08:33:07,060 INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection: Notification time
> out: 400
> 2011-08-03 08:33:08,893 INFO org.apache.zookeeper.server.NIOServerCnxn:
> Accepted socket connection from /10.192.122.191:49107
> 2011-08-03 08:33:08,893 INFO org.apache.zookeeper.server.NIOServerCnxn:
> Processing stat command from /10.192.122.191:49107
> 2011-08-03 08:33:08,894 INFO org.apache.zookeeper.server.NIOServerCnxn:
> Closed socket connection for client /10.192.122.191:49107 (no session
> established for client)
> 2011-08-03 08:33:10,896 INFO org.apache.zookeeper.server.NIOServerCnxn:
> Accepted socket connection from /10.192.122.191:49108
> 2011-08-03 08:33:10,897 INFO org.apache.zookeeper.server.NIOServerCnxn:
> Processing stat command from /10.192.122.191:49108
> 2011-08-03 08:33:10,897 INFO org.apache.zookeeper.server.NIOServerCnxn:
> Closed socket connection for client /10.192.122.191:49108 (no session
> established for client)
> 2011-08-03 08:33:12,060 WARN
> org.apache.zookeeper.server.quorum.QuorumCnxManager: Cannot open channel to
> 2 at election address /10.2.31.65:3183
>
>
>
> Master 2:
>
> 557 [main] INFO com.cloudera.flume.master.FlumeMaster - Inferred master
> server index 1
> 607 [main] INFO com.cloudera.flume.master.FlumeMaster - Distributed master,
> disabling all config translations
> 618 [main] INFO com.cloudera.flume.master.FlumeMaster - Starting flume
> master on: domU-12-31-39-00-16-E2.compute-1.internal
> 618 [main] INFO com.cloudera.flume.master.FlumeMaster -  Working Directory
> is: /usr/lib/flume/.
> 621 [main] INFO com.cloudera.flume.master.ZooKeeperService - Starting
> ZooKeeper server as part of ensemble
> 631 [main] INFO com.cloudera.flume.master.ZKInProcessServer - Creating
> /var/flumedata/zk/server-1/myid
> 632 [main] INFO com.cloudera.flume.master.ZKInProcessServer -
> configuration: {server.2=10.2.31.65:3182:3183, server.1=10.254.23.16:3182:3183,
> server.0=10.192.122.191:3182:3183, initLimit=10, syncLimit=10,
> maxClientCnxns=0, clientPort=3181, tickTime=2000, electionAlg=3,
> dataDir=/var/flumedata/zk/server-1}
> 649 [main] INFO com.cloudera.flume.master.ZKInProcessServer - server
> 0.0.0.0:3181 not up yet
> 649 [ZooKeeper thread] INFO com.cloudera.flume.master.ZKInProcessServer -
> Starting ZooKeeper server
> 16689 [main] ERROR com.cloudera.flume.master.FlumeMaster - IO problem:
> ZooKeeper server did not come up within 15 seconds
>
>
>
> Master 3:
>
> 425 [main] INFO com.cloudera.flume.master.FlumeMaster - Inferred master
> server index 2
> 463 [main] INFO com.cloudera.flume.master.FlumeMaster - Distributed master,
> disabling all config translations
> 473 [main] INFO com.cloudera.flume.master.FlumeMaster - Starting flume
> master on: ip-10-2-31-65.ec2.internal
> 475 [main] INFO com.cloudera.flume.master.FlumeMaster -  Working Directory
> is: /usr/lib/flume/.
> 478 [main] INFO com.cloudera.flume.master.ZooKeeperService - Starting
> ZooKeeper server as part of ensemble
> 491 [main] INFO com.cloudera.flume.master.ZKInProcessServer - Creating
> /var/flumedata/zk/server-2/myid
> 493 [main] INFO com.cloudera.flume.master.ZKInProcessServer -
> configuration: {server.2=10.2.31.65:3182:3183, server.1=10.254.23.16:3182:3183,
> server.0=10.192.122.191:3182:3183, initLimit=10, syncLimit=10,
> maxClientCnxns=0, clientPort=3181, tickTime=2000, electionAlg=3,
> dataDir=/var/flumedata/zk/server-2}
> 501 [ZooKeeper thread] INFO com.cloudera.flume.master.ZKInProcessServer -
> Starting ZooKeeper server
> 505 [main] INFO com.cloudera.flume.master.ZKInProcessServer - server
> 0.0.0.0:3181 not up yet
> 16532 [main] ERROR com.cloudera.flume.master.FlumeMaster - IO problem:
> ZooKeeper server did not come up within 15 seconds
>
>
>
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com
>
>
>


-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com
