mesos-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chun-Hung Hsiao (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (MESOS-4062) SlaveTest.MetricsSlaveLaunchErrors is flaky
Date Fri, 08 Jun 2018 23:48:00 GMT

    [ https://issues.apache.org/jira/browse/MESOS-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506708#comment-16506708
] 

Chun-Hung Hsiao edited comment on MESOS-4062 at 6/8/18 11:47 PM:
-----------------------------------------------------------------

Observed another type of flakiness on this test: an 1-hr hang: [^consoleText.txt]. Also see
MESOS-6231.


was (Author: chhsia0):
Observed another type of flakiness on this test. See MESOS-6231.

> SlaveTest.MetricsSlaveLaunchErrors is flaky
> -------------------------------------------
>
>                 Key: MESOS-4062
>                 URL: https://issues.apache.org/jira/browse/MESOS-4062
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>            Reporter: Vinod Kone
>            Priority: Major
>              Labels: flaky-test, test
>         Attachments: consoleText.txt
>
>
> Observed this in internal CI
> {code}
> DEBUG: [ RUN      ] SlaveTest.MetricsSlaveLaunchErrors
> DEBUG: I1126 10:07:30.408905 33823 leveldb.cpp:174] Opened db in 412.443316ms
> DEBUG: I1126 10:07:30.559861 33823 leveldb.cpp:181] Compacted db in 150.905148ms
> DEBUG: I1126 10:07:30.559921 33823 leveldb.cpp:196] Created db iterator in 6415ns
> DEBUG: I1126 10:07:30.559936 33823 leveldb.cpp:202] Seeked to beginning of db in 1135ns
> DEBUG: I1126 10:07:30.559944 33823 leveldb.cpp:271] Iterated through 0 keys in the db
in 320ns
> DEBUG: I1126 10:07:30.559967 33823 replica.cpp:778] Replica recovered with log positions
0 -> 0 with 1 holes and 0 unlearned
> DEBUG: I1126 10:07:30.560356 33841 recover.cpp:447] Starting replica recovery
> DEBUG: I1126 10:07:30.561079 33845 recover.cpp:473] Replica is in EMPTY status
> DEBUG: I1126 10:07:30.561751 33842 replica.cpp:674] Replica in EMPTY status received
a broadcasted recover request from (8555)@a.b.c.d:33492
> DEBUG: I1126 10:07:30.561903 33847 recover.cpp:193] Received a recover response from
a replica in EMPTY status
> DEBUG: I1126 10:07:30.562270 33857 recover.cpp:564] Updating replica status to STARTING
> DEBUG: I1126 10:07:30.563334 33844 master.cpp:365] Master dab6e888-2e40-45e4-9595-051febe7dc16
(x.com) started on a.b.c.d:33492
> DEBUG: I1126 10:07:30.563357 33844 master.cpp:367] Flags at startup: --acls="" --allocation_interval="1secs"
--allocator="HierarchicalDRF" --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5"
--authorizers="local" --credentials="/tmp/2xHAZi/credentials" --framework_sorter="drf" --help="false"
--hostname_lookup="true" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0"
--logging_level="INFO" --max_slave_ping_timeouts="5" --quiet="false" --recovery_slave_removal_limit="100%"
--registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="25secs"
--registry_strict="true" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins"
--user_sorter="drf" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/2xHAZi/master"
--zk_session_timeout="10secs"
> DEBUG: I1126 10:07:30.563581 33844 master.cpp:412] Master only allowing authenticated
frameworks to register
> DEBUG: I1126 10:07:30.563596 33844 master.cpp:417] Master only allowing authenticated
slaves to register
> DEBUG: I1126 10:07:30.563604 33844 credentials.hpp:35] Loading credentials for authentication
from '/tmp/2xHAZi/credentials'
> DEBUG: I1126 10:07:30.564956 33844 master.cpp:456] Using default 'crammd5' authenticator
> DEBUG: I1126 10:07:30.565018 33844 master.cpp:493] Authorization enabled
> DEBUG: I1126 10:07:30.565817 33849 master.cpp:1625] The newly elected leader is master@a.b.c.d:33492
with id dab6e888-2e40-45e4-9595-051febe7dc16
> DEBUG: I1126 10:07:30.565834 33849 master.cpp:1638] Elected as the leading master!
> DEBUG: I1126 10:07:30.565841 33849 master.cpp:1383] Recovering from registrar
> DEBUG: I1126 10:07:30.565906 33844 registrar.cpp:307] Recovering registrar
> DEBUG: I1126 10:07:30.676245 33859 leveldb.cpp:304] Persisting metadata (8 bytes) to
leveldb took 113.597908ms
> DEBUG: I1126 10:07:30.676311 33859 replica.cpp:321] Persisted replica status to STARTING
> DEBUG: I1126 10:07:30.676458 33851 recover.cpp:473] Replica is in STARTING status
> DEBUG: I1126 10:07:30.677732 33854 replica.cpp:674] Replica in STARTING status received
a broadcasted recover request from (8556)@a.b.c.d:33492
> DEBUG: I1126 10:07:30.677914 33846 recover.cpp:193] Received a recover response from
a replica in STARTING status
> DEBUG: I1126 10:07:30.678324 33862 recover.cpp:564] Updating replica status to VOTING
> DEBUG: I1126 10:07:30.777760 33851 leveldb.cpp:304] Persisting metadata (8 bytes) to
leveldb took 98.972607ms
> DEBUG: I1126 10:07:30.777823 33851 replica.cpp:321] Persisted replica status to VOTING
> DEBUG: I1126 10:07:30.777905 33851 recover.cpp:578] Successfully joined the Paxos group
> DEBUG: I1126 10:07:30.777976 33851 recover.cpp:462] Recover process terminated
> DEBUG: I1126 10:07:30.778120 33846 log.cpp:659] Attempting to start the writer
> DEBUG: I1126 10:07:30.779547 33856 replica.cpp:494] Replica received implicit promise
request from (8557)@a.b.c.d:33492 with proposal 1
> DEBUG: I1126 10:07:30.879788 33856 leveldb.cpp:304] Persisting metadata (8 bytes) to
leveldb took 100.197892ms
> DEBUG: I1126 10:07:30.879844 33856 replica.cpp:343] Persisted promised to 1
> DEBUG: I1126 10:07:30.880520 33851 coordinator.cpp:238] Coordinator attempting to fill
missing positions
> DEBUG: I1126 10:07:30.883019 33864 replica.cpp:389] Replica received explicit promise
request from (8558)@a.b.c.d:33492 for position 0 with proposal 2
> DEBUG: I1126 10:07:31.017720 33864 leveldb.cpp:341] Persisting action (8 bytes) to leveldb
took 134.647958ms
> DEBUG: I1126 10:07:31.017788 33864 replica.cpp:713] Persisted action at 0
> DEBUG: I1126 10:07:31.019820 33854 replica.cpp:538] Replica received write request for
position 0 from (8559)@a.b.c.d:33492
> DEBUG: I1126 10:07:31.019881 33854 leveldb.cpp:436] Reading position from leveldb took
21951ns
> DEBUG: I1126 10:07:31.125607 33854 leveldb.cpp:341] Persisting action (14 bytes) to leveldb
took 105.695296ms
> DEBUG: I1126 10:07:31.125684 33854 replica.cpp:713] Persisted action at 0
> DEBUG: I1126 10:07:31.126045 33855 replica.cpp:692] Replica received learned notice for
position 0 from @0.0.0.0:0
> DEBUG: I1126 10:07:31.378890 33855 leveldb.cpp:341] Persisting action (16 bytes) to leveldb
took 252.798894ms
> DEBUG: I1126 10:07:31.378969 33855 replica.cpp:713] Persisted action at 0
> DEBUG: I1126 10:07:31.378989 33855 replica.cpp:698] Replica learned NOP action at position
0
> DEBUG: I1126 10:07:31.379464 33856 log.cpp:675] Writer started with ending position 0
> DEBUG: I1126 10:07:31.380209 33843 leveldb.cpp:436] Reading position from leveldb took
29715ns
> DEBUG: I1126 10:07:31.380609 33842 registrar.cpp:340] Successfully fetched the registry
(0B) in 814.676992ms
> DEBUG: I1126 10:07:31.380666 33842 registrar.cpp:439] Applied 1 operations in 4717ns;
attempting to update the 'registry'
> DEBUG: I1126 10:07:31.380982 33842 log.cpp:683] Attempting to append 228 bytes to the
log
> DEBUG: I1126 10:07:31.381053 33842 coordinator.cpp:348] Coordinator attempting to write
APPEND action at position 1
> DEBUG: I1126 10:07:31.381356 33842 replica.cpp:538] Replica received write request for
position 1 from (8560)@a.b.c.d:33492
> DEBUG: 2015-11-26 10:07:31,923:33823(0x7f5e2a7fc700):ZOO_ERROR@handle_socket_error_msg@1697:
Socket [127.0.0.1:57854] zk retcode=-4, errno=111(Connection refused): server refused to accept
the client
> DEBUG: I1126 10:07:32.385574 33842 leveldb.cpp:341] Persisting action (247 bytes) to
leveldb took 1.004179569secs
> DEBUG: I1126 10:07:32.385628 33842 replica.cpp:713] Persisted action at 1
> DEBUG: I1126 10:07:32.385962 33842 replica.cpp:692] Replica received learned notice for
position 1 from @0.0.0.0:0
> DEBUG: I1126 10:07:32.886503 33842 leveldb.cpp:341] Persisting action (249 bytes) to
leveldb took 500.504268ms
> DEBUG: I1126 10:07:32.886557 33842 replica.cpp:713] Persisted action at 1
> DEBUG: I1126 10:07:32.886567 33842 replica.cpp:698] Replica learned APPEND action at
position 1
> DEBUG: I1126 10:07:32.887076 33844 registrar.cpp:484] Successfully updated the 'registry'
in 1.506383104secs
> DEBUG: I1126 10:07:32.887090 33842 log.cpp:702] Attempting to truncate the log to 1
> DEBUG: I1126 10:07:32.887162 33844 registrar.cpp:370] Successfully recovered registrar
> DEBUG: I1126 10:07:32.887202 33848 coordinator.cpp:348] Coordinator attempting to write
TRUNCATE action at position 2
> DEBUG: I1126 10:07:32.888511 33861 master.cpp:1435] Recovered 0 slaves from the Registry
(189B) ; allowing 10mins for slaves to re-register
> DEBUG: I1126 10:07:32.888545 33845 replica.cpp:538] Replica received write request for
position 2 from (8561)@a.b.c.d:33492
> DEBUG: I1126 10:07:32.888587 33850 hierarchical.cpp:174] Allocator recovery is not supported
yet
> DEBUG: I1126 10:07:34.422766 33845 leveldb.cpp:341] Persisting action (16 bytes) to leveldb
took 1.534182617secs
> DEBUG: I1126 10:07:34.422826 33845 replica.cpp:713] Persisted action at 2
> DEBUG: I1126 10:07:34.423280 33848 replica.cpp:692] Replica received learned notice for
position 2 from @0.0.0.0:0
> DEBUG: 2015-11-26 10:07:35,260:33823(0x7f5e2a7fc700):ZOO_ERROR@handle_socket_error_msg@1697:
Socket [127.0.0.1:57854] zk retcode=-4, errno=111(Connection refused): server refused to accept
the client
> DEBUG: I1126 10:07:36.023536 33848 leveldb.cpp:341] Persisting action (18 bytes) to leveldb
took 1.600200148secs
> DEBUG: I1126 10:07:36.023653 33848 leveldb.cpp:399] Deleting ~1 keys from leveldb took
49874ns
> DEBUG: I1126 10:07:36.023669 33848 replica.cpp:713] Persisted action at 2
> DEBUG: I1126 10:07:36.023680 33848 replica.cpp:698] Replica learned TRUNCATE action at
position 2
> DEBUG: I1126 10:07:36.028640 33862 slave.cpp:189] Slave started on 313)@a.b.c.d:33492
> DEBUG: I1126 10:07:36.028666 33862 slave.cpp:190] Flags at startup: --appc_store_dir="/tmp/mesos/store/appc"
--authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false"
--cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs"
--containerizers="mesos" --credential="/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/credential"
--default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_auth_server="auth.docker.io"
--docker_auth_server_port="443" --docker_kill_orphans="true" --docker_local_archives_dir="/tmp/mesos/images/docker"
--docker_puller="local" --docker_puller_timeout="60" --docker_registry="registry-1.docker.io"
--docker_registry_port="443" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock"
--docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --egress_unique_flow_per_container="false"
--enforce_container_disk_quota="false" --ephemeral_ports_per_container="1024" --executor_registration_timeout="1mins"
--executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/fetch"
--fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1"
--hadoop_home="" --help="false" --hostname_lookup="true" --image_provisioner_backend="copy"
--initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher_dir="/builddir/build/BUILD/mesos-0.27.0/src"
--logbufsecs="0" --logging_level="INFO" --network_enable_socket_statistics_details="false"
--network_enable_socket_statistics_summary="false" --oversubscribed_resources_interval="15secs"
--perf_duration="10secs" --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false"
--recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="10ms" --resources="cpus:2;mem:1024;disk:1024;ports:[31000-32000]"
--revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --strict="true"
--switch_user="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9"
> DEBUG: I1126 10:07:36.029013 33862 credentials.hpp:83] Loading credential for authentication
from '/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/credential'
> DEBUG: I1126 10:07:36.029131 33862 slave.cpp:320] Slave using credential for: test-principal
> DEBUG: I1126 10:07:36.029939 33862 slave.cpp:390] Slave resources: cpus(*):2; mem(*):1024;
disk(*):1024; ports(*):[31000-32000]
> DEBUG: I1126 10:07:36.029979 33862 slave.cpp:398] Slave attributes: [  ]
> DEBUG: I1126 10:07:36.029989 33862 slave.cpp:403] Slave hostname: x.com
> DEBUG: I1126 10:07:36.029999 33862 slave.cpp:408] Slave checkpoint: true
> DEBUG: I1126 10:07:36.030452 33852 state.cpp:52] Recovering state from '/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/meta'
> DEBUG: I1126 10:07:36.030635 33849 status_update_manager.cpp:200] Recovering status update
manager
> DEBUG: I1126 10:07:36.030804 33861 slave.cpp:4258] Finished recovery
> DEBUG: I1126 10:07:36.030957 33823 sched.cpp:164] Version: 0.27.0-rc0
> DEBUG: I1126 10:07:36.031323 33860 sched.cpp:262] New master detected at master@a.b.c.d:33492
> DEBUG: I1126 10:07:36.031355 33860 sched.cpp:318] Authenticating with master master@a.b.c.d:33492
> DEBUG: I1126 10:07:36.031363 33860 sched.cpp:325] Using default CRAM-MD5 authenticatee
> DEBUG: I1126 10:07:36.031898 33846 authenticatee.cpp:121] Creating new client SASL connection
> DEBUG: I1126 10:07:36.032002 33846 master.cpp:5169] Authenticating scheduler-86480915-9ac5-470d-8949-c826734478a8@a.b.c.d:33492
> DEBUG: I1126 10:07:36.032591 33859 authenticator.cpp:98] Creating new server SASL connection
> DEBUG: I1126 10:07:36.032668 33859 authenticatee.cpp:212] Received SASL authentication
mechanisms: CRAM-MD5
> DEBUG: I1126 10:07:36.032685 33859 authenticatee.cpp:238] Attempting to authenticate
with mechanism 'CRAM-MD5'
> DEBUG: I1126 10:07:36.032744 33859 authenticator.cpp:203] Received SASL authentication
start
> DEBUG: I1126 10:07:36.032814 33859 authenticator.cpp:325] Authentication requires more
steps
> DEBUG: I1126 10:07:36.032850 33859 authenticatee.cpp:258] Received SASL authentication
step
> DEBUG: I1126 10:07:36.033195 33848 authenticator.cpp:231] Received SASL authentication
step
> DEBUG: I1126 10:07:36.033231 33848 authenticator.cpp:317] Authentication success
> DEBUG: I1126 10:07:36.033293 33849 authenticatee.cpp:298] Authentication success
> DEBUG: I1126 10:07:36.033668 33851 master.cpp:5199] Successfully authenticated principal
'test-principal' at scheduler-86480915-9ac5-470d-8949-c826734478a8@a.b.c.d:33492
> DEBUG: I1126 10:07:36.033880 33856 sched.cpp:407] Successfully authenticated with master
master@a.b.c.d:33492
> DEBUG: I1126 10:07:36.034004 33842 master.cpp:2195] Received SUBSCRIBE call for framework
'default' at scheduler-86480915-9ac5-470d-8949-c826734478a8@a.b.c.d:33492
> DEBUG: I1126 10:07:36.034039 33842 master.cpp:1664] Authorizing framework principal 'test-principal'
to receive offers for role '*'
> DEBUG: I1126 10:07:36.034314 33845 master.cpp:2266] Subscribing framework default with
checkpointing disabled and capabilities [  ]
> DEBUG: I1126 10:07:36.034454 33845 sched.cpp:641] Framework registered with dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:36.034539 33850 hierarchical.cpp:220] Added framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:36.264459 33844 slave.cpp:727] New master detected at master@a.b.c.d:33492
> DEBUG: I1126 10:07:36.264487 33844 slave.cpp:790] Authenticating with master master@a.b.c.d:33492
> DEBUG: I1126 10:07:36.264494 33844 slave.cpp:795] Using default CRAM-MD5 authenticatee
> DEBUG: I1126 10:07:36.264529 33844 slave.cpp:763] Detecting new master
> DEBUG: I1126 10:07:36.264536 33856 status_update_manager.cpp:174] Pausing sending status
updates
> DEBUG: I1126 10:07:36.264717 33842 authenticatee.cpp:121] Creating new client SASL connection
> DEBUG: I1126 10:07:36.265007 33844 master.cpp:5169] Authenticating slave(313)@a.b.c.d:33492
> DEBUG: I1126 10:07:36.265116 33855 authenticator.cpp:98] Creating new server SASL connection
> DEBUG: I1126 10:07:36.265190 33855 authenticatee.cpp:212] Received SASL authentication
mechanisms: CRAM-MD5
> DEBUG: I1126 10:07:36.265207 33855 authenticatee.cpp:238] Attempting to authenticate
with mechanism 'CRAM-MD5'
> DEBUG: I1126 10:07:36.265398 33850 authenticator.cpp:203] Received SASL authentication
start
> DEBUG: I1126 10:07:36.265527 33850 authenticator.cpp:325] Authentication requires more
steps
> DEBUG: I1126 10:07:36.265702 33860 authenticatee.cpp:258] Received SASL authentication
step
> DEBUG: I1126 10:07:36.265918 33860 authenticator.cpp:231] Received SASL authentication
step
> DEBUG: I1126 10:07:36.266005 33860 authenticator.cpp:317] Authentication success
> DEBUG: I1126 10:07:36.266124 33847 authenticatee.cpp:298] Authentication success
> DEBUG: I1126 10:07:36.266181 33853 master.cpp:5199] Successfully authenticated principal
'test-principal' at slave(313)@a.b.c.d:33492
> DEBUG: I1126 10:07:36.266787 33853 slave.cpp:858] Successfully authenticated with master
master@a.b.c.d:33492
> DEBUG: I1126 10:07:36.266958 33863 master.cpp:3878] Registering slave at slave(313)@a.b.c.d:33492
(x.com) with id dab6e888-2e40-45e4-9595-051febe7dc16-S0
> DEBUG: I1126 10:07:36.267105 33858 registrar.cpp:439] Applied 1 operations in 22us; attempting
to update the 'registry'
> DEBUG: I1126 10:07:36.267534 33848 log.cpp:683] Attempting to append 419 bytes to the
log
> DEBUG: I1126 10:07:36.267640 33860 coordinator.cpp:348] Coordinator attempting to write
APPEND action at position 3
> DEBUG: I1126 10:07:36.268745 33857 replica.cpp:538] Replica received write request for
position 3 from (8566)@a.b.c.d:33492
> DEBUG: I1126 10:07:36.282333 33852 master.cpp:3866] Ignoring register slave message from
slave(313)@a.b.c.d:33492 (x.com) as admission is already in progress
> DEBUG: I1126 10:07:36.290721 33851 master.cpp:3866] Ignoring register slave message from
slave(313)@a.b.c.d:33492 (x.com) as admission is already in progress
> DEBUG: I1126 10:07:36.315064 33851 master.cpp:3866] Ignoring register slave message from
slave(313)@a.b.c.d:33492 (x.com) as admission is already in progress
> DEBUG: I1126 10:07:36.361470 33852 master.cpp:3866] Ignoring register slave message from
slave(313)@a.b.c.d:33492 (x.com) as admission is already in progress
> DEBUG: I1126 10:07:36.678863 33856 master.cpp:3866] Ignoring register slave message from
slave(313)@a.b.c.d:33492 (x.com) as admission is already in progress
> DEBUG: I1126 10:07:37.008486 33851 master.cpp:3866] Ignoring register slave message from
slave(313)@a.b.c.d:33492 (x.com) as admission is already in progress
> DEBUG: I1126 10:07:37.397876 33852 master.cpp:3866] Ignoring register slave message from
slave(313)@a.b.c.d:33492 (x.com) as admission is already in progress
> DEBUG: 2015-11-26 10:07:38,597:33823(0x7f5e2a7fc700):ZOO_ERROR@handle_socket_error_msg@1697:
Socket [127.0.0.1:57854] zk retcode=-4, errno=111(Connection refused): server refused to accept
the client
> DEBUG: I1126 10:07:38.727941 33857 leveldb.cpp:341] Persisting action (438 bytes) to
leveldb took 2.459156189secs
> DEBUG: I1126 10:07:38.728011 33857 replica.cpp:713] Persisted action at 3
> DEBUG: I1126 10:07:38.728584 33860 replica.cpp:692] Replica received learned notice for
position 3 from @0.0.0.0:0
> DEBUG: I1126 10:07:39.243325 33847 master.cpp:3866] Ignoring register slave message from
slave(313)@a.b.c.d:33492 (x.com) as admission is already in progress
> DEBUG: I1126 10:07:40.169939 33860 leveldb.cpp:341] Persisting action (440 bytes) to
leveldb took 1.441303994secs
> DEBUG: I1126 10:07:40.169996 33860 replica.cpp:713] Persisted action at 3
> DEBUG: I1126 10:07:40.170007 33860 replica.cpp:698] Replica learned APPEND action at
position 3
> DEBUG: I1126 10:07:40.170461 33860 registrar.cpp:484] Successfully updated the 'registry'
in 3.903270144secs
> DEBUG: I1126 10:07:40.170742 33855 log.cpp:702] Attempting to truncate the log to 3
> DEBUG: I1126 10:07:40.170810 33841 coordinator.cpp:348] Coordinator attempting to write
TRUNCATE action at position 4
> DEBUG: I1126 10:07:40.170925 33844 master.cpp:3946] Registered slave dab6e888-2e40-45e4-9595-051febe7dc16-S0
at slave(313)@a.b.c.d:33492 (x.com) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> DEBUG: I1126 10:07:40.170999 33863 slave.cpp:902] Registered with master master@a.b.c.d:33492;
given slave ID dab6e888-2e40-45e4-9595-051febe7dc16-S0
> DEBUG: I1126 10:07:40.171005 33842 hierarchical.cpp:380] Added slave dab6e888-2e40-45e4-9595-051febe7dc16-S0
(x.com) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (allocated: )
> DEBUG: I1126 10:07:40.171119 33845 status_update_manager.cpp:181] Resuming sending status
updates
> DEBUG: I1126 10:07:40.171352 33844 replica.cpp:538] Replica received write request for
position 4 from (8567)@a.b.c.d:33492
> DEBUG: I1126 10:07:40.171455 33851 master.cpp:4998] Sending 1 offers to framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
(default) at scheduler-86480915-9ac5-470d-8949-c826734478a8@a.b.c.d:33492
> DEBUG: I1126 10:07:40.310534 33863 slave.cpp:961] Forwarding total oversubscribed resources

> DEBUG: I1126 10:07:40.310897 33852 master.cpp:4288] Received update of slave dab6e888-2e40-45e4-9595-051febe7dc16-S0
at slave(313)@a.b.c.d:33492 (x.com) with total oversubscribed resources 
> DEBUG: I1126 10:07:40.311044 33852 hierarchical.cpp:436] Slave dab6e888-2e40-45e4-9595-051febe7dc16-S0
(x.com) updated with oversubscribed resources  (total: cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000])
> DEBUG: I1126 10:07:40.314579 33862 master.cpp:2934] Processing ACCEPT call for offers:
[ dab6e888-2e40-45e4-9595-051febe7dc16-O0 ] on slave dab6e888-2e40-45e4-9595-051febe7dc16-S0
at slave(313)@a.b.c.d:33492 (x.com) for framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
(default) at scheduler-86480915-9ac5-470d-8949-c826734478a8@a.b.c.d:33492
> DEBUG: I1126 10:07:40.314610 33862 master.cpp:2730] Authorizing framework principal 'test-principal'
to launch task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 as user 'mockbuild'
> DEBUG: W1126 10:07:40.315390 33859 validation.cpp:420] Executor default for task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6
uses less CPUs (None) than the minimum required (0.01). Please update your executor, as this
will be mandatory in future releases.
> DEBUG: W1126 10:07:40.315417 33859 validation.cpp:432] Executor default for task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6
uses less memory (None) than the minimum required (32MB). Please update your executor, as
this will be mandatory in future releases.
> DEBUG: I1126 10:07:40.315511 33859 master.hpp:176] Adding task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6
with resources cpus(*):1; mem(*):32 on slave dab6e888-2e40-45e4-9595-051febe7dc16-S0 (x.com)
> DEBUG: I1126 10:07:40.315538 33859 master.cpp:3264] Launching task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6
of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000 (default) at scheduler-86480915-9ac5-470d-8949-c826734478a8@a.b.c.d:33492
with resources cpus(*):1; mem(*):32 on slave dab6e888-2e40-45e4-9595-051febe7dc16-S0 at slave(313)@a.b.c.d:33492
(x.com)
> DEBUG: I1126 10:07:40.315716 33842 slave.cpp:1292] Got assigned task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6
for framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:40.315937 33842 slave.cpp:1411] Launching task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6
for framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:40.315933 33850 hierarchical.cpp:793] Recovered cpus(*):1; mem(*):992;
disk(*):1024; ports(*):[31000-32000] (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000],
allocated: cpus(*):1; mem(*):32) on slave dab6e888-2e40-45e4-9595-051febe7dc16-S0 from framework
dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:40.994328 33842 paths.cpp:434] Trying to chown '/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/slaves/dab6e888-2e40-45e4-9595-051febe7dc16-S0/frameworks/dab6e888-2e40-45e4-9595-051febe7dc16-0000/executors/default/runs/27d66117-c4df-4da2-b1fc-e1f473c96b67'
to user 'mockbuild'
> DEBUG: I1126 10:07:41.002754 33842 slave.cpp:5043] Launching executor default of framework
dab6e888-2e40-45e4-9595-051febe7dc16-0000 with resources  in work directory '/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/slaves/dab6e888-2e40-45e4-9595-051febe7dc16-S0/frameworks/dab6e888-2e40-45e4-9595-051febe7dc16-0000/executors/default/runs/27d66117-c4df-4da2-b1fc-e1f473c96b67'
> DEBUG: I1126 10:07:41.002931 33842 slave.cpp:1629] Queuing task '6669df6d-7a2c-415f-a2c1-f2615ec6d7f6'
for executor 'default' of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: E1126 10:07:41.004542 33842 slave.cpp:3468] Container '27d66117-c4df-4da2-b1fc-e1f473c96b67'
for executor 'default' of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000 failed to start:
Injected failure
> DEBUG: E1126 10:07:41.004598 33842 slave.cpp:3565] Termination of executor 'default'
of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000 failed: Unknown container: 27d66117-c4df-4da2-b1fc-e1f473c96b67
> DEBUG: I1126 10:07:41.004650 33842 slave.cpp:2763] Handling status update TASK_FAILED
(UUID: 3873b597-6af7-4b97-9ad2-4e0f285ae847) for task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6
of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000 from @0.0.0.0:0
> DEBUG: I1126 10:07:41.004907 33842 master.cpp:4535] Executor 'default' of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
on slave dab6e888-2e40-45e4-9595-051febe7dc16-S0 at slave(313)@a.b.c.d:33492 (x.com): terminated
with signal Unknown signal 127
> DEBUG: I1126 10:07:41.004919 33850 status_update_manager.cpp:320] Received status update
TASK_FAILED (UUID: 3873b597-6af7-4b97-9ad2-4e0f285ae847) for task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6
of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:41.004937 33842 master.cpp:6180] Removing executor 'default' with
resources  of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000 on slave dab6e888-2e40-45e4-9595-051febe7dc16-S0
at slave(313)@a.b.c.d:33492 (x.com)
> DEBUG: I1126 10:07:41.005262 33861 slave.cpp:3115] Forwarding the update TASK_FAILED
(UUID: 3873b597-6af7-4b97-9ad2-4e0f285ae847) for task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6
of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000 to master@a.b.c.d:33492
> DEBUG: I1126 10:07:41.005549 33860 master.cpp:4433] Status update TASK_FAILED (UUID:
3873b597-6af7-4b97-9ad2-4e0f285ae847) for task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 of framework
dab6e888-2e40-45e4-9595-051febe7dc16-0000 from slave dab6e888-2e40-45e4-9595-051febe7dc16-S0
at slave(313)@a.b.c.d:33492 (x.com)
> DEBUG: I1126 10:07:41.005584 33860 master.cpp:4481] Forwarding status update TASK_FAILED
(UUID: 3873b597-6af7-4b97-9ad2-4e0f285ae847) for task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6
of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:41.005736 33860 master.cpp:6085] Updating the state of task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6
of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000 (latest state: TASK_FAILED, status
update state: TASK_FAILED)
> DEBUG: I1126 10:07:41.005898 33846 hierarchical.cpp:793] Recovered cpus(*):1; mem(*):32
(total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: ) on slave
dab6e888-2e40-45e4-9595-051febe7dc16-S0 from framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:41.005977 33860 master.cpp:3590] Processing ACKNOWLEDGE call 3873b597-6af7-4b97-9ad2-4e0f285ae847
for task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
(default) at scheduler-86480915-9ac5-470d-8949-c826734478a8@a.b.c.d:33492 on slave dab6e888-2e40-45e4-9595-051febe7dc16-S0
> DEBUG: I1126 10:07:41.006007 33860 master.cpp:6151] Removing task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6
with resources cpus(*):1; mem(*):32 of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
on slave dab6e888-2e40-45e4-9595-051febe7dc16-S0 at slave(313)@a.b.c.d:33492 (x.com)
> DEBUG: I1126 10:07:41.006250 33855 status_update_manager.cpp:392] Received status update
acknowledgement (UUID: 3873b597-6af7-4b97-9ad2-4e0f285ae847) for task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6
of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:41.006495 33846 slave.cpp:3685] Cleaning up executor 'default' of
framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: tests/slave_tests.cpp:1150: Failure
> DEBUG: Value of: snapshot.values["slave/container_launch_errors"]
> DEBUG:   Actual: 16-byte object <02-00 00-00 20-74 6F-20 00-49 27-03 00-00 00-00>
> DEBUG: Expected: 1
> DEBUG: I1126 10:07:41.006705 33846 slave.cpp:3773] Cleaning up framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:41.006731 33863 gc.cpp:54] Scheduling '/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/slaves/dab6e888-2e40-45e4-9595-051febe7dc16-S0/frameworks/dab6e888-2e40-45e4-9595-051febe7dc16-0000/executors/default/runs/27d66117-c4df-4da2-b1fc-e1f473c96b67'
for gc 6.99999992410963days in the future
> DEBUG: I1126 10:07:41.006810 33863 gc.cpp:54] Scheduling '/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/slaves/dab6e888-2e40-45e4-9595-051febe7dc16-S0/frameworks/dab6e888-2e40-45e4-9595-051febe7dc16-0000/executors/default'
for gc 6.99999992252148days in the future
> DEBUG: I1126 10:07:41.006824 33843 status_update_manager.cpp:282] Closing status update
streams for framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:41.006850 33863 gc.cpp:54] Scheduling '/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/slaves/dab6e888-2e40-45e4-9595-051febe7dc16-S0/frameworks/dab6e888-2e40-45e4-9595-051febe7dc16-0000'
for gc 6.99999992122667days in the future
> DEBUG: I1126 10:07:41.007093 33823 sched.cpp:1803] Asked to stop the driver
> DEBUG: I1126 10:07:41.007164 33847 sched.cpp:1041] Stopping framework 'dab6e888-2e40-45e4-9595-051febe7dc16-0000'
> DEBUG: I1126 10:07:41.007185 33861 master.cpp:926] Master terminating
> DEBUG: I1126 10:07:41.007392 33845 hierarchical.cpp:410] Removed slave dab6e888-2e40-45e4-9595-051febe7dc16-S0
> DEBUG: I1126 10:07:41.007519 33854 hierarchical.cpp:260] Removed framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:41.007750 33863 slave.cpp:3243] master@a.b.c.d:33492 exited
> DEBUG: W1126 10:07:41.007767 33863 slave.cpp:3246] Master disconnected! Waiting for a
new master to be elected
> DEBUG: 2015-11-26 10:07:41,933:33823(0x7f5e2a7fc700):ZOO_ERROR@handle_socket_error_msg@1697:
Socket [127.0.0.1:57854] zk retcode=-4, errno=111(Connection refused): server refused to accept
the client
> DEBUG: I1126 10:07:41.959815 33844 leveldb.cpp:341] Persisting action (16 bytes) to leveldb
took 1.788383711secs
> DEBUG: I1126 10:07:41.959870 33844 replica.cpp:713] Persisted action at 4
> DEBUG: I1126 10:07:41.960736 33823 slave.cpp:599] Slave terminating
> DEBUG: [  FAILED  ] SlaveTest.MetricsSlaveLaunchErrors (11967 ms)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message