aurora-dev mailing list archives

From Bill Farner <wfar...@apache.org>
Subject Re: Aurora stuck in LEADER_AWAITING_REGISTRATION
Date Wed, 15 Oct 2014 22:42:08 GMT
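Have you initialized the replicated log on this host?  The repeated
"Replica in EMPTY status" lines in your log suggest it has not been.
The scheduler waits until the log is initialized; that can come as a
surprise, but the gate is there for a good reason.  Instructions here: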

https://github.com/apache/incubator-aurora/blob/master/docs/deploying-aurora-scheduler.md#initializing-the-replicated-log
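As a rough sketch of the check and the fix (not a definitive procedure):
the snippet below looks for the EMPTY-status message from the log excerpt
quoted below and, if it is present, prints the initialization command
rather than running it.  Assumptions: SCHEDULER_LOG and replica_is_empty
are hypothetical names for illustration, the log path is taken from the
-native_log_file_path flag in the quoted command, and the mesos-log
invocation should be confirmed against the linked instructions before use.

```shell
#!/bin/sh
# Succeeds if the given scheduler log contains the EMPTY-status message
# that repeats in the quoted log excerpt.
replica_is_empty() {
  grep -q 'Replica in EMPTY status' "$1"
}

# Assumption: same location as -native_log_file_path in the quoted command.
LOG_DB="${AURORA_HOME:-/usr/local/aurora-scheduler}/scheduler/db"

# SCHEDULER_LOG is a hypothetical variable pointing at captured scheduler output.
if [ -n "${SCHEDULER_LOG:-}" ] && replica_is_empty "$SCHEDULER_LOG"; then
  echo "Replica reports EMPTY status; the replicated log looks uninitialized."
  echo "With the scheduler stopped, the linked instructions suggest running:"
  echo "  mesos-log initialize --path=$LOG_DB"
fi
```

To try it, save the scheduler's stderr to a file and run the script with
SCHEDULER_LOG set to that file.  It only prints a suggestion, so it is safe
to run against a live host.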

-=Bill

On Wed, Oct 15, 2014 at 2:22 PM, Dobromir Montauk <dobromir@tellapart.com>
wrote:

> Hi,
>
> I've brought up Aurora on my Mesos master node with the following command:
>
> ubuntu@ec2-54-82-17-37:~/$
>
> GLOG_v=2
> LIBPROCESS_PORT=5050
> LIBPROCESS_IP=127.0.0.1
> AURORA_HOME=/usr/local/aurora-scheduler
> DIST_DIR=/home/ubuntu/aurora-scheduler/dist
> AURORA_HOME=/usr/local/aurora-scheduler
>
> sudo /usr/local/aurora-scheduler/bin/aurora-scheduler \
>   -cluster_name=tellapart \
>   -http_port=8081 \
>   -native_log_quorum_size=1 \
>   -zk_endpoints=localhost:2181 \
>   -mesos_master_address=54.166.50.69:5050,54.160.61.169:5050,localhost:5050 \
>   -serverset_path=/aurora/scheduler \
>   -native_log_zk_group_path=/aurora/replicated-log \
>   -native_log_file_path=$AURORA_HOME/scheduler/db \
>   -backup_dir=$AURORA_HOME/scheduler/backups \
>   -thermos_executor_path=/dev/null \
>   -gc_executor_path=$DIST_DIR/gc_executor.pex \
>   -enable_beta_updater=true \
>   -vlog=INFO \
>   -logtostderr
>
> Attached is the entire log, but basically I'm seeing this:
>
> I1015 21:18:05.315263 27634 group.cpp:313] Group process (group(1)@
> 10.88.26.227:40393) connected to ZooKeeper
> I1015 21:18:05.315322 27634 group.cpp:787] Syncing group operations: queue
> size (joins, cancels, datas) = (0, 0, 0)
> I1015 21:18:05.315348 27634 group.cpp:385] Trying to create path
> '/aurora/replicated-log' in ZooKeeper
> I1015 21:18:05.316 THREAD1
> com.twitter.common.zookeeper.CandidateImpl$4.onGroupChange: Candidate
> /aurora/scheduler/singleton_candidate_0000000008 is now leader of group:
> [singleton_candidate_0000000008]
> I1015 21:18:05.317 THREAD1
> com.twitter.common.util.StateMachine$Builder$1.execute: SchedulerLifecycle
> state machine transition STORAGE_PREPARED -> LEADER_AWAITING_REGISTRATION
> I1015 21:18:05.317 THREAD1
> org.apache.aurora.scheduler.SchedulerLifecycle$6.execute: Elected as
> leading scheduler!
> I1015 21:18:05.330394 27634 network.hpp:423] ZooKeeper group memberships
> changed
> I1015 21:18:05.330660 27639 group.cpp:658] Trying to get
> '/aurora/replicated-log/0000000008' in ZooKeeper
> I1015 21:18:05.331550 27635 network.hpp:461] ZooKeeper group PIDs: {
> log-replica(1)@10.88.26.227:40393 }
> I1015 21:18:06.027016 27634 replica.cpp:638] Replica in EMPTY status
> received a broadcasted recover request
> I1015 21:18:06.027216 27634 recover.cpp:188] Received a recover response
> from a replica in EMPTY status
> <the last 2 messages repeat ad nauseam>
>
> How can I debug what's going on?
>
> Thanks,
> Dobromir
>
