mesos-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Till Toenshoff (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-4025) SlaveRecoveryTest/0.GCExecutor is flaky.
Date Tue, 01 Dec 2015 01:09:10 GMT

    [ https://issues.apache.org/jira/browse/MESOS-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032840#comment-15032840 ] 

Till Toenshoff commented on MESOS-4025:
---------------------------------------

{noformat}
[ RUN      ] SlaveRecoveryTest/0.GCExecutor
I1201 01:07:05.936290 25504 leveldb.cpp:176] Opened db in 3.468541ms
I1201 01:07:05.937160 25504 leveldb.cpp:183] Compacted db in 805344ns
I1201 01:07:05.937222 25504 leveldb.cpp:198] Created db iterator in 15726ns
I1201 01:07:05.937234 25504 leveldb.cpp:204] Seeked to beginning of db in 1106ns
I1201 01:07:05.937242 25504 leveldb.cpp:273] Iterated through 0 keys in the db in 118ns
I1201 01:07:05.937283 25504 replica.cpp:780] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
I1201 01:07:05.937808 25522 recover.cpp:449] Starting replica recovery
I1201 01:07:05.938863 25524 master.cpp:367] Master 3a496a61-8184-4416-b323-27ac6436661d (ubuntu14) started on 127.0.1.1:58897
I1201 01:07:05.938920 25519 recover.cpp:475] Replica is in EMPTY status
I1201 01:07:05.938886 25524 master.cpp:369] Flags at startup: --acls="" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/uDNEII/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" --quiet="false" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="25secs" --registry_strict="true" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/uDNEII/master" --zk_session_timeout="10secs"
I1201 01:07:05.939036 25524 master.cpp:414] Master only allowing authenticated frameworks to register
I1201 01:07:05.939045 25524 master.cpp:419] Master only allowing authenticated slaves to register
I1201 01:07:05.939062 25524 credentials.hpp:37] Loading credentials for authentication from '/tmp/uDNEII/credentials'
I1201 01:07:05.939237 25524 master.cpp:458] Using default 'crammd5' authenticator
I1201 01:07:05.939326 25524 master.cpp:495] Authorization enabled
I1201 01:07:05.939713 25521 replica.cpp:676] Replica in EMPTY status received a broadcasted recover request from (8245)@127.0.1.1:58897
I1201 01:07:05.940044 25522 recover.cpp:195] Received a recover response from a replica in EMPTY status
I1201 01:07:05.940450 25524 recover.cpp:566] Updating replica status to STARTING
I1201 01:07:05.940498 25521 master.cpp:1606] The newly elected leader is master@127.0.1.1:58897 with id 3a496a61-8184-4416-b323-27ac6436661d
I1201 01:07:05.940523 25521 master.cpp:1619] Elected as the leading master!
I1201 01:07:05.940537 25521 master.cpp:1379] Recovering from registrar
I1201 01:07:05.940670 25524 registrar.cpp:309] Recovering registrar
I1201 01:07:05.941493 25522 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 943836ns
I1201 01:07:05.941589 25522 replica.cpp:323] Persisted replica status to STARTING
I1201 01:07:05.941745 25522 recover.cpp:475] Replica is in STARTING status
I1201 01:07:05.942773 25522 replica.cpp:676] Replica in STARTING status received a broadcasted recover request from (8246)@127.0.1.1:58897
I1201 01:07:05.943061 25522 recover.cpp:195] Received a recover response from a replica in STARTING status
I1201 01:07:05.943423 25522 recover.cpp:566] Updating replica status to VOTING
I1201 01:07:05.943894 25522 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 333689ns
I1201 01:07:05.943923 25522 replica.cpp:323] Persisted replica status to VOTING
I1201 01:07:05.943987 25522 recover.cpp:580] Successfully joined the Paxos group
I1201 01:07:05.944084 25522 recover.cpp:464] Recover process terminated
I1201 01:07:05.944319 25519 log.cpp:661] Attempting to start the writer
I1201 01:07:05.945137 25519 replica.cpp:496] Replica received implicit promise request from (8247)@127.0.1.1:58897 with proposal 1
I1201 01:07:05.945484 25519 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 321958ns
I1201 01:07:05.945513 25519 replica.cpp:345] Persisted promised to 1
I1201 01:07:05.945940 25524 coordinator.cpp:240] Coordinator attempting to fill missing positions
I1201 01:07:05.947418 25522 replica.cpp:391] Replica received explicit promise request from (8248)@127.0.1.1:58897 for position 0 with proposal 2
I1201 01:07:05.947824 25522 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 307454ns
I1201 01:07:05.947880 25522 replica.cpp:715] Persisted action at 0
I1201 01:07:05.948781 25519 replica.cpp:540] Replica received write request for position 0 from (8249)@127.0.1.1:58897
I1201 01:07:05.948858 25519 leveldb.cpp:438] Reading position from leveldb took 22081ns
I1201 01:07:05.949193 25519 leveldb.cpp:343] Persisting action (14 bytes) to leveldb took 295134ns
I1201 01:07:05.949213 25519 replica.cpp:715] Persisted action at 0
I1201 01:07:05.949600 25519 replica.cpp:694] Replica received learned notice for position 0 from @0.0.0.0:0
I1201 01:07:05.949792 25519 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 158911ns
I1201 01:07:05.949810 25519 replica.cpp:715] Persisted action at 0
I1201 01:07:05.949833 25519 replica.cpp:700] Replica learned NOP action at position 0
I1201 01:07:05.950191 25521 log.cpp:677] Writer started with ending position 0
I1201 01:07:05.951280 25525 leveldb.cpp:438] Reading position from leveldb took 31887ns
I1201 01:07:05.952110 25525 registrar.cpp:342] Successfully fetched the registry (0B) in 11.401984ms
I1201 01:07:05.952211 25525 registrar.cpp:441] Applied 1 operations in 21359ns; attempting to update the 'registry'
I1201 01:07:05.952771 25521 log.cpp:685] Attempting to append 158 bytes to the log
I1201 01:07:05.952924 25525 coordinator.cpp:350] Coordinator attempting to write APPEND action at position 1
I1201 01:07:05.953634 25521 replica.cpp:540] Replica received write request for position 1 from (8250)@127.0.1.1:58897
I1201 01:07:05.954026 25521 leveldb.cpp:343] Persisting action (177 bytes) to leveldb took 329490ns
I1201 01:07:05.954068 25521 replica.cpp:715] Persisted action at 1
I1201 01:07:05.954428 25525 replica.cpp:694] Replica received learned notice for position 1 from @0.0.0.0:0
I1201 01:07:05.954756 25525 leveldb.cpp:343] Persisting action (179 bytes) to leveldb took 299525ns
I1201 01:07:05.954779 25525 replica.cpp:715] Persisted action at 1
I1201 01:07:05.954795 25525 replica.cpp:700] Replica learned APPEND action at position 1
I1201 01:07:05.955914 25518 registrar.cpp:486] Successfully updated the 'registry' in 3648us
I1201 01:07:05.956018 25518 registrar.cpp:372] Successfully recovered registrar
I1201 01:07:05.956092 25525 log.cpp:704] Attempting to truncate the log to 1
I1201 01:07:05.956226 25518 master.cpp:1416] Recovered 0 slaves from the Registry (120B) ; allowing 10mins for slaves to re-register
I1201 01:07:05.956228 25525 coordinator.cpp:350] Coordinator attempting to write TRUNCATE action at position 2
I1201 01:07:05.956943 25525 replica.cpp:540] Replica received write request for position 2 from (8251)@127.0.1.1:58897
I1201 01:07:05.957334 25525 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 356836ns
I1201 01:07:05.957366 25525 replica.cpp:715] Persisted action at 2
I1201 01:07:05.958392 25523 replica.cpp:694] Replica received learned notice for position 2 from @0.0.0.0:0
I1201 01:07:05.958724 25523 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 288626ns
I1201 01:07:05.958772 25523 leveldb.cpp:401] Deleting ~1 keys from leveldb took 17063ns
I1201 01:07:05.958786 25523 replica.cpp:715] Persisted action at 2
I1201 01:07:05.958822 25523 replica.cpp:700] Replica learned TRUNCATE action at position 2
I1201 01:07:05.970078 25504 containerizer.cpp:142] Using isolation: cgroups/cpu,cgroups/mem,filesystem/posix
I1201 01:07:05.992375 25504 linux_launcher.cpp:103] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
I1201 01:07:05.994971 25524 slave.cpp:191] Slave started on 267)@127.0.1.1:58897
I1201 01:07:05.995038 25524 slave.cpp:192] Flags at startup: --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos_test_9d99eac1-69d0-402a-a34c-e35697741a00" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/credential" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_auth_server="auth.docker.io" --docker_auth_server_port="443" --docker_kill_orphans="true" --docker_local_archives_dir="/tmp/mesos/images/docker" --docker_puller="local" --docker_puller_timeout="60" --docker_registry="registry-1.docker.io" --docker_registry_port="443" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="cgroups/cpu,cgroups/mem" --launcher_dir="/home/vagrant/mesos/build/src" --logbufsecs="0" --logging_level="INFO" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="10ms" --resources="cpus:2;mem:1024;disk:1024;ports:[31000-32000]" --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --slave_subsystems="memory,cpuacct" --strict="false" --switch_user="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ"
I1201 01:07:05.995272 25524 slave.cpp:212] Moving slave process into its own cgroup for subsystem: memory
I1201 01:07:05.995841 25504 sched.cpp:166] Version: 0.26.0
I1201 01:07:05.998517 25520 sched.cpp:264] New master detected at master@127.0.1.1:58897
I1201 01:07:05.998635 25520 sched.cpp:320] Authenticating with master master@127.0.1.1:58897
I1201 01:07:05.998652 25520 sched.cpp:327] Using default CRAM-MD5 authenticatee
I1201 01:07:05.998881 25518 authenticatee.cpp:123] Creating new client SASL connection
I1201 01:07:05.999177 25518 master.cpp:5150] Authenticating scheduler-75222ccd-d540-41e0-983d-8678316e31a0@127.0.1.1:58897
I1201 01:07:05.999387 25518 authenticator.cpp:100] Creating new server SASL connection
I1201 01:07:05.999598 25518 authenticatee.cpp:214] Received SASL authentication mechanisms: CRAM-MD5
I1201 01:07:05.999665 25518 authenticatee.cpp:240] Attempting to authenticate with mechanism 'CRAM-MD5'
I1201 01:07:05.999770 25518 authenticator.cpp:205] Received SASL authentication start
I1201 01:07:05.999845 25518 authenticator.cpp:327] Authentication requires more steps
I1201 01:07:06.000018 25518 authenticatee.cpp:260] Received SASL authentication step
I1201 01:07:06.000162 25518 authenticator.cpp:233] Received SASL authentication step
I1201 01:07:06.000222 25518 authenticator.cpp:319] Authentication success
I1201 01:07:06.000339 25520 authenticatee.cpp:300] Authentication success
I1201 01:07:06.000429 25518 master.cpp:5180] Successfully authenticated principal 'test-principal' at scheduler-75222ccd-d540-41e0-983d-8678316e31a0@127.0.1.1:58897
I1201 01:07:06.000737 25523 sched.cpp:409] Successfully authenticated with master master@127.0.1.1:58897
I1201 01:07:06.000964 25518 master.cpp:2176] Received SUBSCRIBE call for framework 'default' at scheduler-75222ccd-d540-41e0-983d-8678316e31a0@127.0.1.1:58897
I1201 01:07:06.001067 25518 master.cpp:1645] Authorizing framework principal 'test-principal' to receive offers for role '*'
I1201 01:07:06.001293 25518 master.cpp:2247] Subscribing framework default with checkpointing enabled and capabilities [  ]
I1201 01:07:06.001698 25523 hierarchical.cpp:195] Added framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.001839 25518 sched.cpp:643] Framework registered with 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.002789 25524 slave.cpp:212] Moving slave process into its own cgroup for subsystem: cpuacct
I1201 01:07:06.007205 25524 credentials.hpp:85] Loading credential for authentication from '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/credential'
I1201 01:07:06.007364 25524 slave.cpp:322] Slave using credential for: test-principal
I1201 01:07:06.007902 25524 slave.cpp:392] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I1201 01:07:06.007968 25524 slave.cpp:400] Slave attributes: [  ]
I1201 01:07:06.007980 25524 slave.cpp:405] Slave hostname: ubuntu14
I1201 01:07:06.007987 25524 slave.cpp:410] Slave checkpoint: true
I1201 01:07:06.008828 25519 state.cpp:54] Recovering state from '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/meta'
I1201 01:07:06.009129 25519 status_update_manager.cpp:202] Recovering status update manager
I1201 01:07:06.009328 25519 containerizer.cpp:384] Recovering containerizer
I1201 01:07:06.011610 25525 slave.cpp:4230] Finished recovery
I1201 01:07:06.012310 25522 status_update_manager.cpp:176] Pausing sending status updates
I1201 01:07:06.012315 25519 slave.cpp:729] New master detected at master@127.0.1.1:58897
I1201 01:07:06.012414 25519 slave.cpp:792] Authenticating with master master@127.0.1.1:58897
I1201 01:07:06.012430 25519 slave.cpp:797] Using default CRAM-MD5 authenticatee
I1201 01:07:06.012531 25519 slave.cpp:765] Detecting new master
I1201 01:07:06.012629 25525 authenticatee.cpp:123] Creating new client SASL connection
I1201 01:07:06.012897 25521 master.cpp:5150] Authenticating slave(267)@127.0.1.1:58897
I1201 01:07:06.013226 25521 authenticator.cpp:100] Creating new server SASL connection
I1201 01:07:06.013406 25525 authenticatee.cpp:214] Received SASL authentication mechanisms: CRAM-MD5
I1201 01:07:06.013429 25525 authenticatee.cpp:240] Attempting to authenticate with mechanism 'CRAM-MD5'
I1201 01:07:06.013515 25525 authenticator.cpp:205] Received SASL authentication start
I1201 01:07:06.013550 25525 authenticator.cpp:327] Authentication requires more steps
I1201 01:07:06.013619 25525 authenticatee.cpp:260] Received SASL authentication step
I1201 01:07:06.013697 25525 authenticator.cpp:233] Received SASL authentication step
I1201 01:07:06.013751 25525 authenticator.cpp:319] Authentication success
I1201 01:07:06.013844 25520 authenticatee.cpp:300] Authentication success
I1201 01:07:06.013866 25525 master.cpp:5180] Successfully authenticated principal 'test-principal' at slave(267)@127.0.1.1:58897
I1201 01:07:06.014108 25520 slave.cpp:860] Successfully authenticated with master master@127.0.1.1:58897
I1201 01:07:06.014377 25520 master.cpp:3859] Registering slave at slave(267)@127.0.1.1:58897 (ubuntu14) with id 3a496a61-8184-4416-b323-27ac6436661d-S0
I1201 01:07:06.014771 25519 registrar.cpp:441] Applied 1 operations in 49016ns; attempting to update the 'registry'
I1201 01:07:06.015358 25525 log.cpp:685] Attempting to append 324 bytes to the log
I1201 01:07:06.015501 25521 coordinator.cpp:350] Coordinator attempting to write APPEND action at position 3
I1201 01:07:06.016134 25521 replica.cpp:540] Replica received write request for position 3 from (8270)@127.0.1.1:58897
I1201 01:07:06.016546 25521 leveldb.cpp:343] Persisting action (343 bytes) to leveldb took 345151ns
I1201 01:07:06.016590 25521 replica.cpp:715] Persisted action at 3
I1201 01:07:06.017155 25521 replica.cpp:694] Replica received learned notice for position 3 from @0.0.0.0:0
I1201 01:07:06.017601 25521 leveldb.cpp:343] Persisting action (345 bytes) to leveldb took 336912ns
I1201 01:07:06.017657 25521 replica.cpp:715] Persisted action at 3
I1201 01:07:06.017674 25521 replica.cpp:700] Replica learned APPEND action at position 3
I1201 01:07:06.018782 25521 registrar.cpp:486] Successfully updated the 'registry' in 3.917824ms
I1201 01:07:06.019404 25521 master.cpp:3927] Registered slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(267)@127.0.1.1:58897 (ubuntu14) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I1201 01:07:06.019717 25521 slave.cpp:904] Registered with master master@127.0.1.1:58897; given slave ID 3a496a61-8184-4416-b323-27ac6436661d-S0
I1201 01:07:06.020117 25521 slave.cpp:963] Forwarding total oversubscribed resources
I1201 01:07:06.018857 25520 log.cpp:704] Attempting to truncate the log to 3
I1201 01:07:06.020340 25520 status_update_manager.cpp:183] Resuming sending status updates
I1201 01:07:06.020406 25520 coordinator.cpp:350] Coordinator attempting to write TRUNCATE action at position 4
I1201 01:07:06.021005 25520 replica.cpp:540] Replica received write request for position 4 from (8271)@127.0.1.1:58897
I1201 01:07:06.019610 25525 hierarchical.cpp:344] Added slave 3a496a61-8184-4416-b323-27ac6436661d-S0 (ubuntu14) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (allocated: )
I1201 01:07:06.021602 25520 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 541557ns
I1201 01:07:06.021647 25520 replica.cpp:715] Persisted action at 4
I1201 01:07:06.022265 25520 replica.cpp:694] Replica received learned notice for position 4 from @0.0.0.0:0
I1201 01:07:06.022704 25525 master.cpp:4979] Sending 1 offers to framework 3a496a61-8184-4416-b323-27ac6436661d-0000 (default) at scheduler-75222ccd-d540-41e0-983d-8678316e31a0@127.0.1.1:58897
I1201 01:07:06.023301 25520 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 539060ns
I1201 01:07:06.023371 25520 leveldb.cpp:401] Deleting ~2 keys from leveldb took 26365ns
I1201 01:07:06.023387 25520 replica.cpp:715] Persisted action at 4
I1201 01:07:06.023404 25520 replica.cpp:700] Replica learned TRUNCATE action at position 4
I1201 01:07:06.023967 25521 master.cpp:4269] Received update of slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(267)@127.0.1.1:58897 (ubuntu14) with total oversubscribed resources
I1201 01:07:06.024179 25521 hierarchical.cpp:400] Slave 3a496a61-8184-4416-b323-27ac6436661d-S0 (ubuntu14) updated with oversubscribed resources  (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000])
I1201 01:07:06.025017 25520 master.cpp:2915] Processing ACCEPT call for offers: [ 3a496a61-8184-4416-b323-27ac6436661d-O0 ] on slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(267)@127.0.1.1:58897 (ubuntu14) for framework 3a496a61-8184-4416-b323-27ac6436661d-0000 (default) at scheduler-75222ccd-d540-41e0-983d-8678316e31a0@127.0.1.1:58897
I1201 01:07:06.025090 25520 master.cpp:2711] Authorizing framework principal 'test-principal' to launch task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 as user 'root'
I1201 01:07:06.025892 25520 master.hpp:176] Adding task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 3a496a61-8184-4416-b323-27ac6436661d-S0 (ubuntu14)
I1201 01:07:06.026010 25520 master.cpp:3245] Launching task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 (default) at scheduler-75222ccd-d540-41e0-983d-8678316e31a0@127.0.1.1:58897 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(267)@127.0.1.1:58897 (ubuntu14)
I1201 01:07:06.026288 25518 slave.cpp:1294] Got assigned task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 for framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.027089 25518 slave.cpp:1410] Launching task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 for framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.027439 25518 paths.cpp:436] Trying to chown '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000/executors/c4bb2548-f50c-457f-bd11-2bc6f25e3bb1/runs/c321210a-f09a-4a8f-a629-cc573ebb37ea' to user 'root'
I1201 01:07:06.031638 25518 slave.cpp:4999] Launching executor c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000/executors/c4bb2548-f50c-457f-bd11-2bc6f25e3bb1/runs/c321210a-f09a-4a8f-a629-cc573ebb37ea'
I1201 01:07:06.032013 25521 containerizer.cpp:618] Starting container 'c321210a-f09a-4a8f-a629-cc573ebb37ea' for executor 'c4bb2548-f50c-457f-bd11-2bc6f25e3bb1' of framework '3a496a61-8184-4416-b323-27ac6436661d-0000'
I1201 01:07:06.032254 25518 slave.cpp:1628] Queuing task 'c4bb2548-f50c-457f-bd11-2bc6f25e3bb1' for executor 'c4bb2548-f50c-457f-bd11-2bc6f25e3bb1' of framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.035073 25523 cpushare.cpp:392] Updated 'cpu.shares' to 2150 (cpus 2.1) for container c321210a-f09a-4a8f-a629-cc573ebb37ea
I1201 01:07:06.036918 25519 mem.cpp:605] Started listening for OOM events for container c321210a-f09a-4a8f-a629-cc573ebb37ea
I1201 01:07:06.037574 25519 mem.cpp:725] Started listening on low memory pressure events for container c321210a-f09a-4a8f-a629-cc573ebb37ea
I1201 01:07:06.038344 25519 mem.cpp:725] Started listening on medium memory pressure events for container c321210a-f09a-4a8f-a629-cc573ebb37ea
I1201 01:07:06.038887 25519 mem.cpp:725] Started listening on critical memory pressure events for container c321210a-f09a-4a8f-a629-cc573ebb37ea
I1201 01:07:06.039404 25519 mem.cpp:356] Updated 'memory.soft_limit_in_bytes' to 1056MB for container c321210a-f09a-4a8f-a629-cc573ebb37ea
I1201 01:07:06.040118 25519 mem.cpp:391] Updated 'memory.limit_in_bytes' to 1056MB for container c321210a-f09a-4a8f-a629-cc573ebb37ea
I1201 01:07:06.041671 25523 linux_launcher.cpp:365] Cloning child process with flags =
I1201 01:07:06.043869 25523 containerizer.cpp:851] Checkpointing executor's forked pid 31661 to '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/meta/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000/executors/c4bb2548-f50c-457f-bd11-2bc6f25e3bb1/runs/c321210a-f09a-4a8f-a629-cc573ebb37ea/pids/forked.pid'
I1201 01:07:06.122028 31675 exec.cpp:136] Version: 0.26.0
I1201 01:07:06.126552 25518 slave.cpp:2405] Got registration for executor 'c4bb2548-f50c-457f-bd11-2bc6f25e3bb1' of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 from executor(1)@127.0.1.1:44259
I1201 01:07:06.129003 25518 cpushare.cpp:392] Updated 'cpu.shares' to 2150 (cpus 2.1) for container c321210a-f09a-4a8f-a629-cc573ebb37ea
I1201 01:07:06.129073 25520 mem.cpp:356] Updated 'memory.soft_limit_in_bytes' to 1056MB for container c321210a-f09a-4a8f-a629-cc573ebb37ea
I1201 01:07:06.129940 25518 slave.cpp:1793] Sending queued task 'c4bb2548-f50c-457f-bd11-2bc6f25e3bb1' to executor 'c4bb2548-f50c-457f-bd11-2bc6f25e3bb1' of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 at executor(1)@127.0.1.1:44259
I1201 01:07:06.130918 31694 exec.cpp:210] Executor registered on slave 3a496a61-8184-4416-b323-27ac6436661d-S0
Registered executor on ubuntu14
Starting task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1
Forked command at 31698
sh -c 'sleep 1000'
I1201 01:07:06.137362 25520 slave.cpp:2762] Handling status update TASK_RUNNING (UUID: b7c553d7-8243-4f8f-8589-6b86621d02ac) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 from executor(1)@127.0.1.1:44259
I1201 01:07:06.137683 25524 status_update_manager.cpp:322] Received status update TASK_RUNNING (UUID: b7c553d7-8243-4f8f-8589-6b86621d02ac) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.138051 25524 status_update_manager.cpp:826] Checkpointing UPDATE for status update TASK_RUNNING (UUID: b7c553d7-8243-4f8f-8589-6b86621d02ac) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.139837 25520 slave.cpp:3087] Forwarding the update TASK_RUNNING (UUID: b7c553d7-8243-4f8f-8589-6b86621d02ac) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 to master@127.0.1.1:58897
I1201 01:07:06.139974 25520 slave.cpp:3011] Sending acknowledgement for status update TASK_RUNNING (UUID: b7c553d7-8243-4f8f-8589-6b86621d02ac) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 to executor(1)@127.0.1.1:44259
I1201 01:07:06.140108 25524 master.cpp:4414] Status update TASK_RUNNING (UUID: b7c553d7-8243-4f8f-8589-6b86621d02ac) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 from slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(267)@127.0.1.1:58897 (ubuntu14)
I1201 01:07:06.140136 25524 master.cpp:4462] Forwarding status update TASK_RUNNING (UUID: b7c553d7-8243-4f8f-8589-6b86621d02ac) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.140243 25524 master.cpp:6066] Updating the state of task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 (latest state: TASK_RUNNING, status update state: TASK_RUNNING)
I1201 01:07:06.140568 25524 master.cpp:3571] Processing ACKNOWLEDGE call b7c553d7-8243-4f8f-8589-6b86621d02ac for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 (default) at scheduler-75222ccd-d540-41e0-983d-8678316e31a0@127.0.1.1:58897 on slave 3a496a61-8184-4416-b323-27ac6436661d-S0
I1201 01:07:06.140811 25524 status_update_manager.cpp:394] Received status update acknowledgement (UUID: b7c553d7-8243-4f8f-8589-6b86621d02ac) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.140889 25524 status_update_manager.cpp:826] Checkpointing ACK for status update TASK_RUNNING (UUID: b7c553d7-8243-4f8f-8589-6b86621d02ac) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.141285 25504 slave.cpp:601] Slave terminating
I1201 01:07:06.141489 25520 master.cpp:1083] Slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(267)@127.0.1.1:58897 (ubuntu14) disconnected
I1201 01:07:06.141517 25520 master.cpp:2531] Disconnecting slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(267)@127.0.1.1:58897 (ubuntu14)
I1201 01:07:06.141561 25520 master.cpp:2550] Deactivating slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(267)@127.0.1.1:58897 (ubuntu14)
I1201 01:07:06.141660 25525 hierarchical.cpp:429] Slave 3a496a61-8184-4416-b323-27ac6436661d-S0 deactivated
I1201 01:07:06.145009 25522 containerizer.cpp:1074] Destroying container 'c321210a-f09a-4a8f-a629-cc573ebb37ea'
I1201 01:07:06.146512 25520 cgroups.cpp:2429] Freezing cgroup /sys/fs/cgroup/freezer/mesos_test_9d99eac1-69d0-402a-a34c-e35697741a00/c321210a-f09a-4a8f-a629-cc573ebb37ea
I1201 01:07:06.148052 25522 cgroups.cpp:1411] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos_test_9d99eac1-69d0-402a-a34c-e35697741a00/c321210a-f09a-4a8f-a629-cc573ebb37ea after 1.395712ms
I1201 01:07:06.149593 25520 cgroups.cpp:2447] Thawing cgroup /sys/fs/cgroup/freezer/mesos_test_9d99eac1-69d0-402a-a34c-e35697741a00/c321210a-f09a-4a8f-a629-cc573ebb37ea
I1201 01:07:06.151010 25518 cgroups.cpp:1440] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos_test_9d99eac1-69d0-402a-a34c-e35697741a00/c321210a-f09a-4a8f-a629-cc573ebb37ea after 1.36576ms
I1201 01:07:06.231267 25525 containerizer.cpp:1257] Executor for container 'c321210a-f09a-4a8f-a629-cc573ebb37ea' has exited
I1201 01:07:06.243788 25504 containerizer.cpp:142] Using isolation: cgroups/cpu,cgroups/mem,filesystem/posix
I1201 01:07:06.261788 25504 linux_launcher.cpp:103] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
I1201 01:07:06.264866 25523 slave.cpp:191] Slave started on 268)@127.0.1.1:58897
I1201 01:07:06.265123 25523 slave.cpp:192] Flags at startup: --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos_test_9d99eac1-69d0-402a-a34c-e35697741a00" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/credential" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_auth_server="auth.docker.io" --docker_auth_server_port="443" --docker_kill_orphans="true" --docker_local_archives_dir="/tmp/mesos/images/docker" --docker_puller="local" --docker_puller_timeout="60" --docker_registry="registry-1.docker.io" --docker_registry_port="443" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="cgroups/cpu,cgroups/mem" --launcher_dir="/home/vagrant/mesos/build/src" --logbufsecs="0" --logging_level="INFO" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="10ms" --resources="cpus:2;mem:1024;disk:1024;ports:[31000-32000]" --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --slave_subsystems="memory,cpuacct" --strict="false" --switch_user="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ"
I1201 01:07:06.265451 25523 slave.cpp:212] Moving slave process into its own cgroup for subsystem: memory
I1201 01:07:06.272876 25523 slave.cpp:212] Moving slave process into its own cgroup for subsystem: cpuacct
I1201 01:07:06.277359 25523 credentials.hpp:85] Loading credential for authentication from '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/credential'
I1201 01:07:06.277580 25523 slave.cpp:322] Slave using credential for: test-principal
I1201 01:07:06.278264 25523 slave.cpp:392] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I1201 01:07:06.278367 25523 slave.cpp:400] Slave attributes: [  ]
I1201 01:07:06.278379 25523 slave.cpp:405] Slave hostname: ubuntu14
I1201 01:07:06.278386 25523 slave.cpp:410] Slave checkpoint: true
I1201 01:07:06.279284 25522 state.cpp:54] Recovering state from '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/meta'
I1201 01:07:06.279335 25522 state.cpp:681] No checkpointed resources found at '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/meta/resources/resources.info'
W1201 01:07:06.280480 25522 state.cpp:365] Failed to find the latest run of executor 'c4bb2548-f50c-457f-bd11-2bc6f25e3bb1' of framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.281354 25523 slave.cpp:4318] Recovering framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.281460 25523 slave.cpp:5108] Recovering executor 'c4bb2548-f50c-457f-bd11-2bc6f25e3bb1' of framework 3a496a61-8184-4416-b323-27ac6436661d-0000
W1201 01:07:06.281484 25523 slave.cpp:5114] Skipping recovery of executor 'c4bb2548-f50c-457f-bd11-2bc6f25e3bb1' of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 because its latest run or executor info cannot be recovered
I1201 01:07:06.281853 25522 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000/executors/c4bb2548-f50c-457f-bd11-2bc6f25e3bb1' for gc 6.99999693017185days in the future
I1201 01:07:06.281862 25523 slave.cpp:3745] Cleaning up framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.281986 25522 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/meta/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000/executors/c4bb2548-f50c-457f-bd11-2bc6f25e3bb1' for gc 6.99999693017185days in the future
I1201 01:07:06.282161 25522 status_update_manager.cpp:284] Closing status update streams for framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.282258 25522 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000' for gc 6.99999693017185days in the future
I1201 01:07:06.282352 25522 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/meta/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000' for gc 6.99999693017185days in the future
I1201 01:07:06.282706 25522 status_update_manager.cpp:202] Recovering status update manager
I1201 01:07:06.282783 25522 status_update_manager.cpp:210] Recovering executor 'c4bb2548-f50c-457f-bd11-2bc6f25e3bb1' of framework 3a496a61-8184-4416-b323-27ac6436661d-0000
W1201 01:07:06.282804 25522 status_update_manager.cpp:214] Skipping recovering updates of executor 'c4bb2548-f50c-457f-bd11-2bc6f25e3bb1' of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 because its info cannot be recovered
I1201 01:07:06.283439 25525 containerizer.cpp:384] Recovering containerizer
W1201 01:07:06.283484 25525 containerizer.cpp:392] Skipping recovery of executor 'c4bb2548-f50c-457f-bd11-2bc6f25e3bb1' of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 because its info could not be recovered
I1201 01:07:06.285965 25521 slave.cpp:4230] Finished recovery
I1201 01:07:06.286639 25520 slave.cpp:729] New master detected at master@127.0.1.1:58897
I1201 01:07:06.286674 25524 status_update_manager.cpp:176] Pausing sending status updates
I1201 01:07:06.286712 25520 slave.cpp:792] Authenticating with master master@127.0.1.1:58897
I1201 01:07:06.286727 25520 slave.cpp:797] Using default CRAM-MD5 authenticatee
I1201 01:07:06.286902 25520 slave.cpp:765] Detecting new master
I1201 01:07:06.287261 25521 authenticatee.cpp:123] Creating new client SASL connection
I1201 01:07:06.287636 25523 master.cpp:5150] Authenticating slave(268)@127.0.1.1:58897
I1201 01:07:06.288003 25525 authenticator.cpp:100] Creating new server SASL connection
I1201 01:07:06.288234 25521 authenticatee.cpp:214] Received SASL authentication mechanisms: CRAM-MD5
I1201 01:07:06.288270 25521 authenticatee.cpp:240] Attempting to authenticate with mechanism 'CRAM-MD5'
I1201 01:07:06.288367 25521 authenticator.cpp:205] Received SASL authentication start
I1201 01:07:06.288430 25521 authenticator.cpp:327] Authentication requires more steps
I1201 01:07:06.288535 25518 authenticatee.cpp:260] Received SASL authentication step
I1201 01:07:06.288660 25518 authenticator.cpp:233] Received SASL authentication step
I1201 01:07:06.288763 25518 authenticator.cpp:319] Authentication success
I1201 01:07:06.288915 25520 authenticatee.cpp:300] Authentication success
I1201 01:07:06.289007 25518 master.cpp:5180] Successfully authenticated principal 'test-principal' at slave(268)@127.0.1.1:58897
I1201 01:07:06.289391 25519 slave.cpp:860] Successfully authenticated with master master@127.0.1.1:58897
I1201 01:07:06.290217 25522 master.cpp:4019] Re-registering slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(267)@127.0.1.1:58897 (ubuntu14)
W1201 01:07:06.290510 25522 master.cpp:5247] Task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 unknown to the slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(268)@127.0.1.1:58897 (ubuntu14) during re-registration : reconciling with the slave
I1201 01:07:06.290843 25522 master.cpp:4207] Sending updated checkpointed resources  to slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(268)@127.0.1.1:58897 (ubuntu14)
I1201 01:07:06.291048 25524 hierarchical.cpp:417] Slave 3a496a61-8184-4416-b323-27ac6436661d-S0 reactivated
I1201 01:07:06.291103 25519 slave.cpp:1004] Re-registered with master master@127.0.1.1:58897
I1201 01:07:06.291216 25519 slave.cpp:1040] Forwarding total oversubscribed resources
I1201 01:07:06.291226 25525 status_update_manager.cpp:183] Resuming sending status updates
W1201 01:07:06.291402 25519 slave.cpp:1080] Slave reconciling task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 in state TASK_LOST: task unknown to the slave
I1201 01:07:06.291514 25521 master.cpp:4269] Received update of slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(268)@127.0.1.1:58897 (ubuntu14) with total oversubscribed resources
I1201 01:07:06.291617 25521 status_update_manager.cpp:322] Received status update TASK_LOST (UUID: 3bba3cb9-9541-4d63-bfc4-9eabc7574d1d) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.291826 25520 hierarchical.cpp:400] Slave 3a496a61-8184-4416-b323-27ac6436661d-S0 (ubuntu14) updated with oversubscribed resources  (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000])
I1201 01:07:06.292337 25519 slave.cpp:2274] Updated checkpointed resources from  to
I1201 01:07:06.292418 25519 slave.cpp:3087] Forwarding the update TASK_LOST (UUID: 3bba3cb9-9541-4d63-bfc4-9eabc7574d1d) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 to master@127.0.1.1:58897
I1201 01:07:06.292635 25525 master.cpp:4414] Status update TASK_LOST (UUID: 3bba3cb9-9541-4d63-bfc4-9eabc7574d1d) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 from slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(268)@127.0.1.1:58897 (ubuntu14)
I1201 01:07:06.292670 25525 master.cpp:4462] Forwarding status update TASK_LOST (UUID: 3bba3cb9-9541-4d63-bfc4-9eabc7574d1d) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.292783 25525 master.cpp:6066] Updating the state of task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 (latest state: TASK_LOST, status update state: TASK_LOST)
I1201 01:07:06.293066 25521 hierarchical.cpp:744] Recovered cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: ) on slave 3a496a61-8184-4416-b323-27ac6436661d-S0 from framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.293133 25525 master.cpp:3571] Processing ACKNOWLEDGE call 3bba3cb9-9541-4d63-bfc4-9eabc7574d1d for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 (default) at scheduler-75222ccd-d540-41e0-983d-8678316e31a0@127.0.1.1:58897 on slave 3a496a61-8184-4416-b323-27ac6436661d-S0
I1201 01:07:06.293171 25525 master.cpp:6132] Removing task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] of framework 3a496a61-8184-4416-b323-27ac6436661d-0000 on slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(268)@127.0.1.1:58897 (ubuntu14)
I1201 01:07:06.293540 25525 status_update_manager.cpp:394] Received status update acknowledgement (UUID: 3bba3cb9-9541-4d63-bfc4-9eabc7574d1d) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of framework 3a496a61-8184-4416-b323-27ac6436661d-0000
E1201 01:07:06.293838 25525 slave.cpp:2356] Status update acknowledgement (UUID: 3bba3cb9-9541-4d63-bfc4-9eabc7574d1d) for task c4bb2548-f50c-457f-bd11-2bc6f25e3bb1 of unknown framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.312912 25522 master.cpp:4979] Sending 1 offers to framework 3a496a61-8184-4416-b323-27ac6436661d-0000 (default) at scheduler-75222ccd-d540-41e0-983d-8678316e31a0@127.0.1.1:58897
I1201 01:07:06.324736 25519 slave.cpp:3205] No pings from master received within 75secs
I1201 01:07:06.328652 25519 slave.cpp:4039] Current disk usage 14.37%. Max allowed age: 5.294415027731979days
I1201 01:07:06.328793 25524 gc.cpp:134] Deleting /tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000/executors/c4bb2548-f50c-457f-bd11-2bc6f25e3bb1
I1201 01:07:06.330178 25524 gc.cpp:143] Deleted '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000/executors/c4bb2548-f50c-457f-bd11-2bc6f25e3bb1'
I1201 01:07:06.330246 25524 gc.cpp:134] Deleting /tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/meta/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000/executors/c4bb2548-f50c-457f-bd11-2bc6f25e3bb1
I1201 01:07:06.330528 25524 gc.cpp:143] Deleted '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/meta/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000/executors/c4bb2548-f50c-457f-bd11-2bc6f25e3bb1'
I1201 01:07:06.330579 25524 gc.cpp:134] Deleting /tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.330656 25524 gc.cpp:143] Deleted '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000'
I1201 01:07:06.330700 25524 gc.cpp:134] Deleting /tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/meta/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.330809 25524 gc.cpp:143] Deleted '/tmp/SlaveRecoveryTest_0_GCExecutor_vAacaZ/meta/slaves/3a496a61-8184-4416-b323-27ac6436661d-S0/frameworks/3a496a61-8184-4416-b323-27ac6436661d-0000'
I1201 01:07:06.329983 25518 status_update_manager.cpp:176] Pausing sending status updates
I1201 01:07:06.329526 25519 slave.cpp:718] Re-detecting master
I1201 01:07:06.331493 25519 slave.cpp:765] Detecting new master
I1201 01:07:06.331959 25522 slave.cpp:729] New master detected at master@127.0.1.1:58897
I1201 01:07:06.331989 25525 status_update_manager.cpp:176] Pausing sending status updates
I1201 01:07:06.332085 25522 slave.cpp:792] Authenticating with master master@127.0.1.1:58897
I1201 01:07:06.332110 25522 slave.cpp:797] Using default CRAM-MD5 authenticatee
I1201 01:07:06.332353 25522 slave.cpp:765] Detecting new master
I1201 01:07:06.332403 25521 authenticatee.cpp:123] Creating new client SASL connection
I1201 01:07:06.332792 25522 master.cpp:5150] Authenticating slave(268)@127.0.1.1:58897
I1201 01:07:06.333169 25522 authenticator.cpp:100] Creating new server SASL connection
I1201 01:07:06.333343 25521 authenticatee.cpp:214] Received SASL authentication mechanisms: CRAM-MD5
I1201 01:07:06.333386 25521 authenticatee.cpp:240] Attempting to authenticate with mechanism 'CRAM-MD5'
I1201 01:07:06.333477 25521 authenticator.cpp:205] Received SASL authentication start
I1201 01:07:06.333559 25521 authenticator.cpp:327] Authentication requires more steps
I1201 01:07:06.333680 25521 authenticatee.cpp:260] Received SASL authentication step
I1201 01:07:06.333797 25521 authenticator.cpp:233] Received SASL authentication step
I1201 01:07:06.333899 25521 authenticator.cpp:319] Authentication success
I1201 01:07:06.334036 25522 authenticatee.cpp:300] Authentication success
I1201 01:07:06.334060 25518 master.cpp:5180] Successfully authenticated principal 'test-principal' at slave(268)@127.0.1.1:58897
I1201 01:07:06.334429 25522 slave.cpp:860] Successfully authenticated with master master@127.0.1.1:58897
I1201 01:07:06.334842 25522 master.cpp:4019] Re-registering slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(268)@127.0.1.1:58897 (ubuntu14)
I1201 01:07:06.335049 25522 master.cpp:4207] Sending updated checkpointed resources  to slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(268)@127.0.1.1:58897 (ubuntu14)
I1201 01:07:06.335203 25520 slave.cpp:1004] Re-registered with master master@127.0.1.1:58897
I1201 01:07:06.335391 25520 slave.cpp:1040] Forwarding total oversubscribed resources
I1201 01:07:06.335536 25520 slave.cpp:2274] Updated checkpointed resources from  to
I1201 01:07:06.335572 25524 status_update_manager.cpp:183] Resuming sending status updates
I1201 01:07:06.335633 25519 master.cpp:4269] Received update of slave 3a496a61-8184-4416-b323-27ac6436661d-S0 at slave(268)@127.0.1.1:58897 (ubuntu14) with total oversubscribed resources
I1201 01:07:06.335999 25519 hierarchical.cpp:400] Slave 3a496a61-8184-4416-b323-27ac6436661d-S0 (ubuntu14) updated with oversubscribed resources  (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000])
I1201 01:07:06.351939 25504 sched.cpp:1805] Asked to stop the driver
I1201 01:07:06.352109 25518 master.cpp:922] Master terminating
I1201 01:07:06.352109 25525 sched.cpp:1043] Stopping framework '3a496a61-8184-4416-b323-27ac6436661d-0000'
I1201 01:07:06.352488 25525 hierarchical.cpp:373] Removed slave 3a496a61-8184-4416-b323-27ac6436661d-S0
I1201 01:07:06.352962 25518 hierarchical.cpp:230] Removed framework 3a496a61-8184-4416-b323-27ac6436661d-0000
I1201 01:07:06.353031 25521 slave.cpp:3215] master@127.0.1.1:58897 exited
W1201 01:07:06.353288 25521 slave.cpp:3218] Master disconnected! Waiting for a new master to be elected
I1201 01:07:06.355612 25504 slave.cpp:601] Slave terminating
I1201 01:07:06.367249 25522 cgroups.cpp:2429] Freezing cgroup /sys/fs/cgroup/freezer/mesos_test_9d99eac1-69d0-402a-a34c-e35697741a00
I1201 01:07:06.368901 25524 cgroups.cpp:1411] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos_test_9d99eac1-69d0-402a-a34c-e35697741a00 after 1.32992ms
I1201 01:07:06.370445 25522 cgroups.cpp:2447] Thawing cgroup /sys/fs/cgroup/freezer/mesos_test_9d99eac1-69d0-402a-a34c-e35697741a00
I1201 01:07:06.371871 25524 cgroups.cpp:1440] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos_test_9d99eac1-69d0-402a-a34c-e35697741a00 after 1.135104ms
../../src/tests/mesos.cpp:781: Failure
(cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup '/sys/fs/cgroup/memory/mesos_test_9d99eac1-69d0-402a-a34c-e35697741a00/slave': Device or resource busy
*** Aborted at 1448932026 (unix time) try "date -d @1448932026" if you are using GNU date ***
PC: @          0x1443ea6 testing::UnitTest::AddTestPartResult()
*** SIGSEGV (@0x0) received by PID 25504 (TID 0x7f155c917800) from PID 0; stack trace: ***
    @     0x7f153b9c30b7 os::Linux::chained_handler()
    @     0x7f153b9c7219 JVM_handle_linux_signal
    @     0x7f15567a8340 (unknown)
    @          0x1443ea6 testing::UnitTest::AddTestPartResult()
    @          0x1438ba5 testing::internal::AssertHelper::operator=()
    @           0xf0b3c1 mesos::internal::tests::ContainerizerTest<>::TearDown()
    @          0x146188e testing::internal::HandleSehExceptionsInMethodIfSupported<>()
    @          0x145c704 testing::internal::HandleExceptionsInMethodIfSupported<>()
    @          0x143de56 testing::Test::Run()
    @          0x143e590 testing::TestInfo::Run()
    @          0x143ebd6 testing::TestCase::Run()
    @          0x144531e testing::internal::UnitTestImpl::RunAllTests()
    @          0x14624b3 testing::internal::HandleSehExceptionsInMethodIfSupported<>()
    @          0x145d27a testing::internal::HandleExceptionsInMethodIfSupported<>()
    @          0x14440ba testing::UnitTest::Run()
    @           0xd15cd4 RUN_ALL_TESTS()
    @           0xd158c1 main
    @     0x7f15563f4ec5 (unknown)
    @           0x913009 (unknown)
{noformat}

> SlaveRecoveryTest/0.GCExecutor is flaky.
> ----------------------------------------
>
>                 Key: MESOS-4025
>                 URL: https://issues.apache.org/jira/browse/MESOS-4025
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.26.0
>            Reporter: Till Toenshoff
>
> Build was SSL enabled (--enable-ssl, --enable-libevent). The build was based on 0.26.0-rc1.
> Testsuite was run as root.
> {noformat}
> sudo ./bin/mesos-tests.sh --gtest_break_on_failure --gtest_repeat=-1
> {noformat}
> {noformat}
> [ RUN      ] SlaveRecoveryTest/0.GCExecutor
> I1130 16:49:16.336833  1032 exec.cpp:136] Version: 0.26.0
> I1130 16:49:16.345212  1049 exec.cpp:210] Executor registered on slave dde9fd4e-b016-4a99-9081-b047e9df9afa-S0
> Registered executor on ubuntu14
> Starting task 22c63bba-cbf8-46fd-b23a-5409d69e4114
> sh -c 'sleep 1000'
> Forked command at 1057
> ../../src/tests/mesos.cpp:779: Failure
> (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup '/sys/fs/cgroup/memory/mesos_test_e5edb2a8-9af3-441f-b991-613082f264e2/slave': Device or resource busy
> *** Aborted at 1448902156 (unix time) try "date -d @1448902156" if you are using GNU date ***
> PC: @          0x1443e9a testing::UnitTest::AddTestPartResult()
> *** SIGSEGV (@0x0) received by PID 27364 (TID 0x7f1bfdd2b800) from PID 0; stack trace: ***
>     @     0x7f1be92b80b7 os::Linux::chained_handler()
>     @     0x7f1be92bc219 JVM_handle_linux_signal
>     @     0x7f1bf7bbc340 (unknown)
>     @          0x1443e9a testing::UnitTest::AddTestPartResult()
>     @          0x1438b99 testing::internal::AssertHelper::operator=()
>     @           0xf0b3bb mesos::internal::tests::ContainerizerTest<>::TearDown()
>     @          0x1461882 testing::internal::HandleSehExceptionsInMethodIfSupported<>()
>     @          0x145c6f8 testing::internal::HandleExceptionsInMethodIfSupported<>()
>     @          0x143de4a testing::Test::Run()
>     @          0x143e584 testing::TestInfo::Run()
>     @          0x143ebca testing::TestCase::Run()
>     @          0x1445312 testing::internal::UnitTestImpl::RunAllTests()
>     @          0x14624a7 testing::internal::HandleSehExceptionsInMethodIfSupported<>()
>     @          0x145d26e testing::internal::HandleExceptionsInMethodIfSupported<>()
>     @          0x14440ae testing::UnitTest::Run()
>     @           0xd15cd4 RUN_ALL_TESTS()
>     @           0xd158c1 main
>     @     0x7f1bf7808ec5 (unknown)
>     @           0x913009 (unknown)
> {noformat}
> My Vagrantfile generator;
> {noformat}
> #!/usr/bin/env bash
> cat << EOF > Vagrantfile
> # -*- mode: ruby -*-" >
> # vi: set ft=ruby :
> Vagrant.configure(2) do |config|
>   # Disable shared folder to prevent certain kernel module dependencies.
>   config.vm.synced_folder ".", "/vagrant", disabled: true
>   config.vm.box = "bento/ubuntu-14.04"
>   config.vm.hostname = "${PLATFORM_NAME}"
>   config.vm.provider "virtualbox" do |vb|
>     vb.memory = ${VAGRANT_MEM}
>     vb.cpus = ${VAGRANT_CPUS}
>     vb.customize ["modifyvm", :id, "--nictype1", "virtio"]
>     vb.customize ["modifyvm", :id, "--natdnshostresolver1", "on"]
>     vb.customize ["modifyvm", :id, "--natdnsproxy1", "on"]
>   end
>   config.vm.provider "vmware_fusion" do |vb|
>     vb.memory = ${VAGRANT_MEM}
>     vb.cpus = ${VAGRANT_CPUS}
>   end
>   config.vm.provision "file", source: "../test.sh", destination: "~/test.sh"
>   config.vm.provision "shell", inline: <<-SHELL
>     sudo apt-get update
>     sudo apt-get -y install openjdk-7-jdk autoconf libtool
>     sudo apt-get -y install build-essential python-dev python-boto          \
>                             libcurl4-nss-dev libsasl2-dev maven             \
>                             libapr1-dev libsvn-dev libssl-dev libevent-dev
>     sudo apt-get -y install git
>     sudo wget -qO- https://get.docker.com/ | sh
>   SHELL
> end
> EOF
> {noformat}
> The problem is kicking in frequently in my tests - I'ld say > 10% but less than 50%.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message