aurora-dev mailing list archives

From Steve Niemitz <st...@tellapart.com>
Subject Re: Aurora.pex client can't find scheduler
Date Tue, 17 Feb 2015 21:18:37 GMT
Is there a reason you set zk_in_proc=true?  Setting it tells the scheduler
to ignore the "real" ZK server and use an in-proc one instead.

-zk_in_proc=false
Launches an embedded zookeeper server for local testing causing
-zk_endpoints to be ignored if specified.
(com.twitter.common.zookeeper.guice.client.flagged.FlaggedClientConfig.zk_in_proc)
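
To illustrate the conflict described by the flag help above, here is a minimal Python sketch that scans a scheduler command line for -zk_in_proc=true combined with -zk_endpoints. The helper names parse_flags and zk_conflict are hypothetical and are not part of Aurora; the sample flags are taken from the command line quoted below. Since the help text shows false as the default, dropping -zk_in_proc=true (or setting it to false) should let the scheduler use the ZooKeeper at -zk_endpoints.

```python
import shlex


def parse_flags(command_line):
    """Parse '-name=value' tokens into a dict; bare flags map to 'true'.

    Hypothetical helper for illustration only, not an Aurora API.
    """
    flags = {}
    for token in shlex.split(command_line):
        if not token.startswith("-"):
            continue  # skip non-flag tokens such as the main class name
        name, sep, value = token.lstrip("-").partition("=")
        flags[name] = value if sep else "true"
    return flags


def zk_conflict(flags):
    """Return True when zk_in_proc is enabled while zk_endpoints is also set."""
    in_proc = flags.get("zk_in_proc", "false").lower() == "true"
    return in_proc and "zk_endpoints" in flags


# Subset of the flags from the quoted scheduler command line.
cmd = (
    "org.apache.aurora.scheduler.app.SchedulerMain "
    "-http_port=8091 -zk_in_proc=true -zk_endpoints=localhost:2181 "
    "-serverset_path=/aurora/scheduler -logtostderr"
)

flags = parse_flags(cmd)
if zk_conflict(flags):
    print("zk_in_proc=true: zk_endpoints=%s will be ignored" % flags["zk_endpoints"])
```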

On Tue, Feb 17, 2015 at 4:09 PM, Xasima <xasima@gmail.com> wrote:

> Hello. I'm running into the following problems when trying to run my very
> first 'aurora.pex job create' command:
> 1) 'Could not connect to scheduler: No schedulers detected in devcluster'
> and
> 2) 'Failed to connect to Zookeeper within 10 seconds.'
>
> I have tried to check everything in the configuration, but I can't find the
> root of the problem so far. I have zookeeper, mesos-master, mesos-slave,
> and aurora-scheduler running on the same server. The only difference from
> the default vagrant/example configuration is the use of a non-default
> http_port for the aurora scheduler.
>
> Namely, the aurora scheduler is running with the following /vars property:
>
> *jvm_prop_sun_java_command *org.apache.aurora.scheduler.app.SchedulerMain
>
> -thermos_executor_path=/opt/apache-aurora-0.7.0-incubating/dist/thermos_executor.pex
> -gc_executor_path=/opt/apache-aurora-0.7.0-incubating/dist/gc_executor.pex
> -http_port=8091 -zk_in_proc=true -zk_endpoints=localhost:2181
> -zk_session_timeout=2secs -serverset_path=/aurora/scheduler
> -mesos_master_address=zk://localhost:2181/mesos -cluster_name=devcluster
> -native_log_quorum_size=1
> -native_log_file_path=/usr/local/aurora-scheduler/db
> -native_log_zk_group_path=/local/service/mesos-native-log
> -backup_dir=/usr/local/aurora-scheduler/backups -logtostderr -vlog=INFO
>
> and here is the tail of the aurora-scheduler log, which shows a successful start:
>
> W0217 20:42:25.952 THREAD140
> com.twitter.common.zookeeper.ServerSetImpl.join: Joining a ServerSet
> without a shard ID is deprecated and will soon break.
>  com.twitter.common.zookeeper.Group$ActiveMembership.join: Set group member
> ID to member_0000000001
>
> I0217 20:42:26.026 THREAD132
> com.twitter.common.zookeeper.ServerSetImpl$ServerSetWatcher.logChange:
> server set /aurora/scheduler change: from 0 members to 1
>         joined:
>
> ServiceInstance(serviceEndpoint:Endpoint(host:bymsq-bsu-hmetrics002,
> port:8091), additionalEndpoints:{http=Endpoint(host:bymsq-bsu-hmetrics002,
> port:8091)}, status:ALIVE)
>
> I0217 20:42:26.026 THREAD132
> org.apache.aurora.scheduler.http.LeaderRedirect$SchedulerMonitor.onChange:
> Found leader scheduler at
> [ServiceInstance(serviceEndpoint:Endpoint(host:bymsq-bsu-hmetrics002,
> port:8091), additionalEndpoints:{http=Endpoint(host:bymsq-bsu-hmetrics002,
> port:8091)}, status:ALIVE)]
>
> I'm not sure whether this is suspicious, but in zookeeper I see the
> /local/service/mesos-native-log/0000000010 and /mesos/info_000000003
> nodes, but there is no /aurora/scheduler node.
>
> The configuration file /etc/aurora/clusters.json points to zk with the proper
> scheduler_zk_path. All *.pex files are built with pants against the
> appropriately built or downloaded AURORA_DIST/third_party/mesos_*.egg. This
> gist contains all the details of my configuration:
> https://gist.github.com/xasima/12de906475d70523316a
>
> Nevertheless, the trivial hello_world service fails to run, with these
> errors:
>  WARN] Could not connect to scheduler: No schedulers detected in
> devcluster!
> WARN] Could not connect to scheduler: Failed to connect to Zookeeper within
> 10 seconds.
>
> Could someone please help and examine the configuration above?
>
> --
> Best regards,
>      ~ Xasima ~
>
