mesos-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Yan Xu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MESOS-1437) SlaveRecoveryTest/0.RestartBeforeContainerizerLaunch is flaky
Date Thu, 29 May 2014 18:29:02 GMT
Yan Xu created MESOS-1437:
-----------------------------

             Summary: SlaveRecoveryTest/0.RestartBeforeContainerizerLaunch is flaky
                 Key: MESOS-1437
                 URL: https://issues.apache.org/jira/browse/MESOS-1437
             Project: Mesos
          Issue Type: Bug
          Components: test
    Affects Versions: 0.19.0
         Environment: ubuntu-14.04-gcc
            Reporter: Yan Xu



{noformat}
[ RUN      ] SlaveRecoveryTest/0.RestartBeforeContainerizerLaunch
Using temporary directory '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_lUbOAD'
I0529 09:29:24.491727 18175 leveldb.cpp:176] Opened db in 3.742412ms
I0529 09:29:24.493413 18175 leveldb.cpp:183] Compacted db in 1.183942ms
I0529 09:29:24.493865 18175 leveldb.cpp:198] Created db iterator in 6242ns
I0529 09:29:24.494138 18175 leveldb.cpp:204] Seeked to beginning of db in 1213ns
I0529 09:29:24.494526 18175 leveldb.cpp:273] Iterated through 0 keys in the db in 638ns
I0529 09:29:24.494839 18175 replica.cpp:741] Replica recovered with log positions 0 ->
0 with 1 holes and 0 unlearned
I0529 09:29:24.495407 18197 recover.cpp:425] Starting replica recovery
I0529 09:29:24.495491 18197 recover.cpp:451] Replica is in EMPTY status
I0529 09:29:24.495844 18197 replica.cpp:638] Replica in EMPTY status received a broadcasted
recover request
I0529 09:29:24.495916 18197 recover.cpp:188] Received a recover response from a replica in
EMPTY status
I0529 09:29:24.496022 18197 recover.cpp:542] Updating replica status to STARTING
I0529 09:29:24.497161 18194 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took
844433ns
I0529 09:29:24.497185 18194 replica.cpp:320] Persisted replica status to STARTING
I0529 09:29:24.497251 18194 recover.cpp:451] Replica is in STARTING status
I0529 09:29:24.497478 18194 replica.cpp:638] Replica in STARTING status received a broadcasted
recover request
I0529 09:29:24.497534 18194 recover.cpp:188] Received a recover response from a replica in
STARTING status
I0529 09:29:24.497622 18194 recover.cpp:542] Updating replica status to VOTING
I0529 09:29:24.498047 18194 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took
376402ns
I0529 09:29:24.498067 18194 replica.cpp:320] Persisted replica status to VOTING
I0529 09:29:24.498108 18194 recover.cpp:556] Successfully joined the Paxos group
I0529 09:29:24.498159 18194 recover.cpp:440] Recover process terminated
I0529 09:29:24.503008 18196 master.cpp:272] Master 20140529-092924-16842879-48472-18175 (trusty)
started on 127.0.1.1:48472
I0529 09:29:24.503432 18196 master.cpp:309] Master only allowing authenticated frameworks
to register
I0529 09:29:24.503765 18196 master.cpp:314] Master only allowing authenticated slaves to register
I0529 09:29:24.504178 18196 credentials.hpp:35] Loading credentials for authentication from
'/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_lUbOAD/credentials'
I0529 09:29:24.504432 18196 master.cpp:340] Master enabling authorization
I0529 09:29:24.504879 18200 hierarchical_allocator_process.hpp:301] Initializing hierarchical
allocator process with master : master@127.0.1.1:48472
I0529 09:29:24.505163 18198 master.cpp:108] No whitelist given. Advertising offers for all
slaves
I0529 09:29:24.506216 18196 master.cpp:957] The newly elected leader is master@127.0.1.1:48472
with id 20140529-092924-16842879-48472-18175
I0529 09:29:24.506657 18196 master.cpp:970] Elected as the leading master!
I0529 09:29:24.506963 18196 master.cpp:788] Recovering from registrar
I0529 09:29:24.507419 18201 registrar.cpp:313] Recovering registrar
I0529 09:29:24.507659 18194 log.cpp:656] Attempting to start the writer
I0529 09:29:24.507969 18194 replica.cpp:474] Replica received implicit promise request with
proposal 1
I0529 09:29:24.508533 18194 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took
540486ns
I0529 09:29:24.508553 18194 replica.cpp:342] Persisted promised to 1
I0529 09:29:24.508718 18194 coordinator.cpp:230] Coordinator attemping to fill missing position
I0529 09:29:24.509006 18194 replica.cpp:375] Replica received explicit promise request for
position 0 with proposal 2
I0529 09:29:24.509305 18194 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 278308ns
I0529 09:29:24.509552 18194 replica.cpp:676] Persisted action at 0
I0529 09:29:24.510025 18194 replica.cpp:508] Replica received write request for position 0
I0529 09:29:24.510303 18194 leveldb.cpp:438] Reading position from leveldb took 16831ns
I0529 09:29:24.510577 18194 leveldb.cpp:343] Persisting action (14 bytes) to leveldb took
252132ns
I0529 09:29:24.510615 18194 replica.cpp:676] Persisted action at 0
I0529 09:29:24.514835 18197 replica.cpp:655] Replica received learned notice for position
0
I0529 09:29:24.515504 18197 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took
639727ns
I0529 09:29:24.515524 18197 replica.cpp:676] Persisted action at 0
I0529 09:29:24.515534 18197 replica.cpp:661] Replica learned NOP action at position 0
I0529 09:29:24.515691 18197 log.cpp:672] Writer started with ending position 0
I0529 09:29:24.516027 18198 leveldb.cpp:438] Reading position from leveldb took 11426ns
I0529 09:29:24.517470 18198 registrar.cpp:346] Successfully fetched the registry (0B)
I0529 09:29:24.517500 18198 registrar.cpp:422] Attempting to update the 'registry'
I0529 09:29:24.519122 18199 log.cpp:680] Attempting to append 119 bytes to the log
I0529 09:29:24.519177 18199 coordinator.cpp:340] Coordinator attempting to write APPEND action
at position 1
I0529 09:29:24.519358 18199 replica.cpp:508] Replica received write request for position 1
I0529 09:29:24.520508 18199 leveldb.cpp:343] Persisting action (136 bytes) to leveldb took
1.12638ms
I0529 09:29:24.520544 18199 replica.cpp:676] Persisted action at 1
I0529 09:29:24.520712 18199 replica.cpp:655] Replica received learned notice for position
1
I0529 09:29:24.521198 18199 leveldb.cpp:343] Persisting action (138 bytes) to leveldb took
464062ns
I0529 09:29:24.521219 18199 replica.cpp:676] Persisted action at 1
I0529 09:29:24.521229 18199 replica.cpp:661] Replica learned APPEND action at position 1
I0529 09:29:24.521467 18199 registrar.cpp:479] Successfully updated 'registry'
I0529 09:29:24.521507 18199 registrar.cpp:372] Successfully recovered registrar
I0529 09:29:24.521543 18199 log.cpp:699] Attempting to truncate the log to 1
I0529 09:29:24.521605 18199 master.cpp:815] Recovered 0 slaves from the Registry (83B) ; allowing
10mins for slaves to re-register
I0529 09:29:24.521646 18199 coordinator.cpp:340] Coordinator attempting to write TRUNCATE
action at position 2
I0529 09:29:24.521832 18199 replica.cpp:508] Replica received write request for position 2
I0529 09:29:24.522260 18199 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took
405954ns
I0529 09:29:24.522593 18199 replica.cpp:676] Persisted action at 2
I0529 09:29:24.522948 18199 replica.cpp:655] Replica received learned notice for position
2
I0529 09:29:24.524375 18199 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took
1.142005ms
I0529 09:29:24.524786 18199 leveldb.cpp:401] Deleting ~1 keys from leveldb took 24473ns
I0529 09:29:24.525178 18199 replica.cpp:676] Persisted action at 2
I0529 09:29:24.525440 18199 replica.cpp:661] Replica learned TRUNCATE action at position 2
I0529 09:29:24.528745 18175 sched.cpp:126] Version: 0.19.0
I0529 09:29:24.529137 18199 slave.cpp:143] Slave started on 136)@127.0.1.1:48472
I0529 09:29:24.529464 18199 credentials.hpp:35] Loading credentials for authentication from
'/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/credential'
I0529 09:29:24.529770 18199 slave.cpp:242] Slave using credential for: test-principal
I0529 09:29:24.529885 18199 slave.cpp:255] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000]
I0529 09:29:24.529955 18199 slave.cpp:283] Slave hostname: trusty
I0529 09:29:24.529968 18199 slave.cpp:284] Slave checkpoint: true
I0529 09:29:24.530727 18199 state.cpp:33] Recovering state from '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta'
I0529 09:29:24.530822 18199 status_update_manager.cpp:193] Recovering status update manager
I0529 09:29:24.530894 18199 slave.cpp:3018] Finished recovery
I0529 09:29:24.531134 18199 slave.cpp:536] New master detected at master@127.0.1.1:48472
I0529 09:29:24.531162 18199 slave.cpp:612] Authenticating with master master@127.0.1.1:48472
I0529 09:29:24.531199 18199 slave.cpp:585] Detecting new master
I0529 09:29:24.531229 18199 status_update_manager.cpp:167] New master detected at master@127.0.1.1:48472
I0529 09:29:24.531265 18199 authenticatee.hpp:128] Creating new client SASL connection
I0529 09:29:24.531412 18199 master.cpp:2986] Authenticating slave(136)@127.0.1.1:48472
I0529 09:29:24.531489 18199 authenticator.hpp:156] Creating new server SASL connection
I0529 09:29:24.531599 18199 authenticatee.hpp:219] Received SASL authentication mechanisms:
CRAM-MD5
I0529 09:29:24.531620 18199 authenticatee.hpp:245] Attempting to authenticate with mechanism
'CRAM-MD5'
I0529 09:29:24.531644 18199 authenticator.hpp:262] Received SASL authentication start
I0529 09:29:24.531680 18199 authenticator.hpp:384] Authentication requires more steps
I0529 09:29:24.531707 18199 authenticatee.hpp:265] Received SASL authentication step
I0529 09:29:24.531741 18199 authenticator.hpp:290] Received SASL authentication step
I0529 09:29:24.531759 18199 auxprop.cpp:81] Request to lookup properties for user: 'test-principal'
realm: 'trusty' server FQDN: 'trusty' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE:
false SASL_AUXPROP_AUTHZID: false 
I0529 09:29:24.531767 18199 auxprop.cpp:153] Looking up auxiliary property '*userPassword'
I0529 09:29:24.531780 18199 auxprop.cpp:153] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
I0529 09:29:24.531790 18199 auxprop.cpp:81] Request to lookup properties for user: 'test-principal'
realm: 'trusty' server FQDN: 'trusty' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE:
false SASL_AUXPROP_AUTHZID: true 
I0529 09:29:24.531797 18199 auxprop.cpp:103] Skipping auxiliary property '*userPassword' since
SASL_AUXPROP_AUTHZID == true
I0529 09:29:24.531803 18199 auxprop.cpp:103] Skipping auxiliary property '*cmusaslsecretCRAM-MD5'
since SASL_AUXPROP_AUTHZID == true
I0529 09:29:24.531827 18199 authenticator.hpp:376] Authentication success
I0529 09:29:24.531857 18199 authenticatee.hpp:305] Authentication success
I0529 09:29:24.531882 18199 master.cpp:3026] Successfully authenticated principal 'test-principal'
at slave(136)@127.0.1.1:48472
I0529 09:29:24.531945 18199 slave.cpp:669] Successfully authenticated with master master@127.0.1.1:48472
I0529 09:29:24.531986 18199 slave.cpp:902] Will retry registration in 3.076227ms if necessary
I0529 09:29:24.532193 18196 master.cpp:2302] Registering slave at slave(136)@127.0.1.1:48472
(trusty) with id 20140529-092924-16842879-48472-18175-0
I0529 09:29:24.532804 18197 registrar.cpp:422] Attempting to update the 'registry'
I0529 09:29:24.533882 18201 log.cpp:680] Attempting to append 292 bytes to the log
I0529 09:29:24.533936 18201 coordinator.cpp:340] Coordinator attempting to write APPEND action
at position 3
I0529 09:29:24.534139 18201 replica.cpp:508] Replica received write request for position 3
I0529 09:29:24.534694 18201 leveldb.cpp:343] Persisting action (311 bytes) to leveldb took
532639ns
I0529 09:29:24.534715 18201 replica.cpp:676] Persisted action at 3
I0529 09:29:24.535266 18194 replica.cpp:655] Replica received learned notice for position
3
I0529 09:29:24.536165 18196 sched.cpp:222] New master detected at master@127.0.1.1:48472
I0529 09:29:24.536190 18196 sched.cpp:273] Authenticating with master master@127.0.1.1:48472
I0529 09:29:24.536586 18194 leveldb.cpp:343] Persisting action (313 bytes) to leveldb took
1.293153ms
I0529 09:29:24.536613 18194 replica.cpp:676] Persisted action at 3
I0529 09:29:24.536623 18194 replica.cpp:661] Replica learned APPEND action at position 3
I0529 09:29:24.536893 18194 registrar.cpp:479] Successfully updated 'registry'
I0529 09:29:24.536964 18194 log.cpp:699] Attempting to truncate the log to 3
I0529 09:29:24.537014 18194 master.cpp:2342] Registered slave 20140529-092924-16842879-48472-18175-0
at slave(136)@127.0.1.1:48472 (trusty)
I0529 09:29:24.537031 18194 master.cpp:3472] Adding slave 20140529-092924-16842879-48472-18175-0
at slave(136)@127.0.1.1:48472 (trusty) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I0529 09:29:24.537122 18194 coordinator.cpp:340] Coordinator attempting to write TRUNCATE
action at position 4
I0529 09:29:24.537215 18194 hierarchical_allocator_process.hpp:444] Added slave 20140529-092924-16842879-48472-18175-0
(trusty) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000] available)
I0529 09:29:24.537257 18194 hierarchical_allocator_process.hpp:707] Performed allocation for
slave 20140529-092924-16842879-48472-18175-0 in 9613ns
I0529 09:29:24.537389 18194 replica.cpp:508] Replica received write request for position 4
I0529 09:29:24.536259 18196 authenticatee.hpp:128] Creating new client SASL connection
I0529 09:29:24.538015 18196 master.cpp:2986] Authenticating scheduler(130)@127.0.1.1:48472
I0529 09:29:24.538091 18196 authenticator.hpp:156] Creating new server SASL connection
I0529 09:29:24.538194 18196 authenticatee.hpp:219] Received SASL authentication mechanisms:
CRAM-MD5
I0529 09:29:24.538214 18196 authenticatee.hpp:245] Attempting to authenticate with mechanism
'CRAM-MD5'
I0529 09:29:24.538238 18196 authenticator.hpp:262] Received SASL authentication start
I0529 09:29:24.538272 18196 authenticator.hpp:384] Authentication requires more steps
I0529 09:29:24.538297 18196 authenticatee.hpp:265] Received SASL authentication step
I0529 09:29:24.538331 18196 authenticator.hpp:290] Received SASL authentication step
I0529 09:29:24.538346 18196 auxprop.cpp:81] Request to lookup properties for user: 'test-principal'
realm: 'trusty' server FQDN: 'trusty' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE:
false SASL_AUXPROP_AUTHZID: false 
I0529 09:29:24.538354 18196 auxprop.cpp:153] Looking up auxiliary property '*userPassword'
I0529 09:29:24.538365 18196 auxprop.cpp:153] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
I0529 09:29:24.538374 18196 auxprop.cpp:81] Request to lookup properties for user: 'test-principal'
realm: 'trusty' server FQDN: 'trusty' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE:
false SASL_AUXPROP_AUTHZID: true 
I0529 09:29:24.538383 18196 auxprop.cpp:103] Skipping auxiliary property '*userPassword' since
SASL_AUXPROP_AUTHZID == true
I0529 09:29:24.538388 18196 auxprop.cpp:103] Skipping auxiliary property '*cmusaslsecretCRAM-MD5'
since SASL_AUXPROP_AUTHZID == true
I0529 09:29:24.538399 18196 authenticator.hpp:376] Authentication success
I0529 09:29:24.538424 18196 authenticatee.hpp:305] Authentication success
I0529 09:29:24.538570 18195 sched.cpp:347] Successfully authenticated with master master@127.0.1.1:48472
I0529 09:29:24.538591 18195 sched.cpp:466] Sending registration request to master@127.0.1.1:48472
I0529 09:29:24.538947 18201 slave.cpp:902] Will retry registration in 14.852555ms if necessary
I0529 09:29:24.539002 18201 slave.cpp:703] Registered with master master@127.0.1.1:48472;
given slave ID 20140529-092924-16842879-48472-18175-0
I0529 09:29:24.539147 18201 slave.cpp:716] Checkpointing SlaveInfo to '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/slave.info'
I0529 09:29:24.539445 18194 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took
2.035571ms
I0529 09:29:24.539471 18194 replica.cpp:676] Persisted action at 4
I0529 09:29:24.539649 18194 replica.cpp:655] Replica received learned notice for position
4
I0529 09:29:24.539990 18196 master.cpp:3026] Successfully authenticated principal 'test-principal'
at scheduler(130)@127.0.1.1:48472
I0529 09:29:24.540099 18196 master.cpp:1041] Received registration request from scheduler(130)@127.0.1.1:48472
I0529 09:29:24.540127 18196 master.cpp:1059] Registering framework 20140529-092924-16842879-48472-18175-0000
at scheduler(130)@127.0.1.1:48472
I0529 09:29:24.540189 18196 master.cpp:2278] Slave 20140529-092924-16842879-48472-18175-0
at slave(136)@127.0.1.1:48472 (trusty) already registered, resending acknowledgement
I0529 09:29:24.540230 18196 sched.cpp:397] Framework registered with 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.540249 18196 sched.cpp:411] Scheduler::registered took 9083ns
I0529 09:29:24.540282 18196 hierarchical_allocator_process.hpp:331] Added framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.540318 18196 hierarchical_allocator_process.hpp:751] Offering cpus(*):2; mem(*):1024;
disk(*):1024; ports(*):[31000-32000] on slave 20140529-092924-16842879-48472-18175-0 to framework
20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.540398 18196 hierarchical_allocator_process.hpp:687] Performed allocation for
1 slaves in 103378ns
W0529 09:29:24.540426 18196 slave.cpp:728] Already registered with master master@127.0.1.1:48472
I0529 09:29:24.540478 18196 master.hpp:683] Adding offer 20140529-092924-16842879-48472-18175-0
with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20140529-092924-16842879-48472-18175-0
(trusty)
I0529 09:29:24.540520 18196 master.cpp:2933] Sending 1 offers to framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.540613 18196 sched.cpp:534] Scheduler::resourceOffers took 13742ns
I0529 09:29:24.541463 18194 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took
1.792617ms
I0529 09:29:24.541507 18194 leveldb.cpp:401] Deleting ~2 keys from leveldb took 19301ns
I0529 09:29:24.541520 18194 replica.cpp:676] Persisted action at 4
I0529 09:29:24.541529 18194 replica.cpp:661] Replica learned TRUNCATE action at position 4
I0529 09:29:24.542551 18196 master.hpp:693] Removing offer 20140529-092924-16842879-48472-18175-0
with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20140529-092924-16842879-48472-18175-0
(trusty)
I0529 09:29:24.542899 18196 master.cpp:1889] Processing reply for offers: [ 20140529-092924-16842879-48472-18175-0
] on slave 20140529-092924-16842879-48472-18175-0 at slave(136)@127.0.1.1:48472 (trusty) for
framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.543246 18196 master.hpp:655] Adding task 24a52e8f-0b30-4970-8829-a7383f19980e
with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20140529-092924-16842879-48472-18175-0
(trusty)
I0529 09:29:24.543572 18196 master.cpp:3111] Launching task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000 with resources cpus(*):2; mem(*):1024;
disk(*):1024; ports(*):[31000-32000] on slave 20140529-092924-16842879-48472-18175-0 at slave(136)@127.0.1.1:48472
(trusty)
I0529 09:29:24.544034 18195 slave.cpp:933] Got assigned task 24a52e8f-0b30-4970-8829-a7383f19980e
for framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.544123 18195 slave.cpp:3290] Checkpointing FrameworkInfo to '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/framework.info'
I0529 09:29:24.544325 18195 slave.cpp:3297] Checkpointing framework pid 'scheduler(130)@127.0.1.1:48472'
to '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/framework.pid'
I0529 09:29:24.544543 18195 slave.cpp:1043] Launching task 24a52e8f-0b30-4970-8829-a7383f19980e
for framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.545670 18195 slave.cpp:3612] Checkpointing ExecutorInfo to '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e/executor.info'
I0529 09:29:24.546325 18195 slave.cpp:3727] Checkpointing TaskInfo to '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e/runs/7fbaa2b9-d41e-4d3e-897f-3d593b1308c7/tasks/24a52e8f-0b30-4970-8829-a7383f19980e/task.info'
I0529 09:29:24.546923 18195 slave.cpp:1153] Queuing task '24a52e8f-0b30-4970-8829-a7383f19980e'
for executor 24a52e8f-0b30-4970-8829-a7383f19980e of framework '20140529-092924-16842879-48472-18175-0000
E0529 09:29:24.547292 18195 slave.cpp:2397] Termination of executor '24a52e8f-0b30-4970-8829-a7383f19980e'
of framework '20140529-092924-16842879-48472-18175-0000' failed: Unknown container: 7fbaa2b9-d41e-4d3e-897f-3d593b1308c7
E0529 09:29:24.547698 18195 slave.cpp:2686] Failed to unmonitor container for executor 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000: Not monitored
I0529 09:29:24.548846 18195 slave.cpp:2018] Handling status update TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95)
for task 24a52e8f-0b30-4970-8829-a7383f19980e of framework 20140529-092924-16842879-48472-18175-0000
from @0.0.0.0:0
I0529 09:29:24.549253 18195 slave.cpp:3660] Terminating task 24a52e8f-0b30-4970-8829-a7383f19980e
I0529 09:29:24.549710 18194 status_update_manager.cpp:320] Received status update TASK_FAILED
(UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.549733 18194 status_update_manager.cpp:499] Creating StatusUpdate stream for
task 24a52e8f-0b30-4970-8829-a7383f19980e of framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.549865 18194 status_update_manager.hpp:342] Checkpointing UPDATE for status
update TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.550242 18195 slave.cpp:425] Slave terminating
I0529 09:29:24.550930 18198 master.cpp:712] Slave 20140529-092924-16842879-48472-18175-0 at
slave(136)@127.0.1.1:48472 (trusty) disconnected
I0529 09:29:24.550952 18198 master.cpp:1344] Disconnecting slave 20140529-092924-16842879-48472-18175-0
I0529 09:29:24.551013 18194 status_update_manager.cpp:373] Forwarding status update TASK_FAILED
(UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000 to master@127.0.1.1:48472
I0529 09:29:24.551261 18198 hierarchical_allocator_process.hpp:483] Slave 20140529-092924-16842879-48472-18175-0
disconnected
I0529 09:29:24.551419 18194 master.cpp:2628] Status update TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95)
for task 24a52e8f-0b30-4970-8829-a7383f19980e of framework 20140529-092924-16842879-48472-18175-0000
from slave 20140529-092924-16842879-48472-18175-0 at slave(136)@127.0.1.1:48472 (trusty)
I0529 09:29:24.551475 18194 master.hpp:673] Removing task 24a52e8f-0b30-4970-8829-a7383f19980e
with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20140529-092924-16842879-48472-18175-0
(trusty)

GMOCK WARNING:
Uninteresting mock function call - returning directly.
    Function call: statusUpdate(0x7fff69ebf410, @0x2ad9c80de320 72-byte object <30-17 18-A5
D9-2A 00-00 00-00 00-00 00-00 00-00 D0-6B 0E-C8 D9-2A 00-00 50-4A 0E-C8 D9-2A 00-00 38-30
D1-00 00-00 00-00 90-F1 0D-C8 D9-2A 00-00 1B-13 63-6B 13-EB D4-41 03-00 00-00 00-00 00-00
37-00 00-00 00-00 00-00>)
Stack trace:
I0529 09:29:24.551668 18194 sched.cpp:625] Scheduler::statusUpdate took 87069ns
I0529 09:29:24.552121 18194 hierarchical_allocator_process.hpp:636] Recovered cpus(*):2; mem(*):1024;
disk(*):1024; ports(*):[31000-32000] (total allocatable: cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000]) on slave 20140529-092924-16842879-48472-18175-0 from framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.560880 18197 slave.cpp:143] Slave started on 137)@127.0.1.1:48472
I0529 09:29:24.560909 18197 credentials.hpp:35] Loading credentials for authentication from
'/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/credential'
I0529 09:29:24.560972 18197 slave.cpp:242] Slave using credential for: test-principal
I0529 09:29:24.561058 18197 slave.cpp:255] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000]
I0529 09:29:24.561120 18197 slave.cpp:283] Slave hostname: trusty
I0529 09:29:24.561133 18197 slave.cpp:284] Slave checkpoint: true
I0529 09:29:24.561491 18197 state.cpp:33] Recovering state from '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta'
W0529 09:29:24.562824 18197 state.cpp:426] Failed to find executor forked pid file '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e/runs/7fbaa2b9-d41e-4d3e-897f-3d593b1308c7/pids/forked.pid'
I0529 09:29:24.563506 18201 slave.cpp:3086] Recovering framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.563535 18201 slave.cpp:3462] Recovering executor '24a52e8f-0b30-4970-8829-a7383f19980e'
of framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.563608 18201 slave.cpp:3660] Terminating task 24a52e8f-0b30-4970-8829-a7383f19980e
I0529 09:29:24.564241 18201 status_update_manager.cpp:193] Recovering status update manager
I0529 09:29:24.564257 18201 status_update_manager.cpp:201] Recovering executor '24a52e8f-0b30-4970-8829-a7383f19980e'
of framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.564275 18201 status_update_manager.cpp:499] Creating StatusUpdate stream for
task 24a52e8f-0b30-4970-8829-a7383f19980e of framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.564371 18201 status_update_manager.hpp:306] Replaying status update stream
for task 24a52e8f-0b30-4970-8829-a7383f19980e
W0529 09:29:24.564391 18201 status_update_manager.cpp:381] Not forwarding status update TASK_FAILED
(UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000 because no master is elected yet
I0529 09:29:24.564512 18201 slave.cpp:497] Successfully attached file '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e/runs/7fbaa2b9-d41e-4d3e-897f-3d593b1308c7'
I0529 09:29:24.564915 18194 slave.cpp:2967] Unable to reconnect to executor '24a52e8f-0b30-4970-8829-a7383f19980e'
of framework 20140529-092924-16842879-48472-18175-0000 because no libprocess PID was found
E0529 09:29:24.564950 18194 slave.cpp:2397] Termination of executor '24a52e8f-0b30-4970-8829-a7383f19980e'
of framework '20140529-092924-16842879-48472-18175-0000' failed: Unknown container: 7fbaa2b9-d41e-4d3e-897f-3d593b1308c7
E0529 09:29:24.565001 18194 slave.cpp:2686] Failed to unmonitor container for executor 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000: Not monitored
I0529 09:29:24.581280 18198 hierarchical_allocator_process.hpp:687] Performed allocation for
1 slaves in 56677ns
I0529 09:29:24.581333 18198 slave.cpp:1967] Cleaning up un-reregistered executors
I0529 09:29:24.581362 18198 slave.cpp:3018] Finished recovery
I0529 09:29:24.581876 18198 slave.cpp:536] New master detected at master@127.0.1.1:48472
I0529 09:29:24.581909 18198 slave.cpp:612] Authenticating with master master@127.0.1.1:48472
I0529 09:29:24.581959 18198 slave.cpp:585] Detecting new master
I0529 09:29:24.581992 18198 status_update_manager.cpp:167] New master detected at master@127.0.1.1:48472
W0529 09:29:24.582008 18198 status_update_manager.cpp:181] Resending status update TASK_FAILED
(UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.582025 18198 status_update_manager.cpp:373] Forwarding status update TASK_FAILED
(UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000 to master@127.0.1.1:48472
I0529 09:29:24.582099 18198 authenticatee.hpp:128] Creating new client SASL connection
W0529 09:29:24.582298 18198 master.cpp:2621] Could not lookup task for status update TASK_FAILED
(UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000 from slave 20140529-092924-16842879-48472-18175-0
at slave(136)@127.0.1.1:48472 (trusty)
I0529 09:29:24.582356 18198 master.cpp:2986] Authenticating slave(137)@127.0.1.1:48472
I0529 09:29:24.582557 18195 authenticator.hpp:156] Creating new server SASL connection
I0529 09:29:24.582679 18195 authenticatee.hpp:219] Received SASL authentication mechanisms:
CRAM-MD5
I0529 09:29:24.582700 18195 authenticatee.hpp:245] Attempting to authenticate with mechanism
'CRAM-MD5'
I0529 09:29:24.582726 18195 authenticator.hpp:262] Received SASL authentication start
I0529 09:29:24.582770 18195 authenticator.hpp:384] Authentication requires more steps
I0529 09:29:24.582798 18195 authenticatee.hpp:265] Received SASL authentication step
I0529 09:29:24.582834 18195 authenticator.hpp:290] Received SASL authentication step
I0529 09:29:24.582852 18195 auxprop.cpp:81] Request to lookup properties for user: 'test-principal'
realm: 'trusty' server FQDN: 'trusty' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE:
false SASL_AUXPROP_AUTHZID: false 
I0529 09:29:24.582861 18195 auxprop.cpp:153] Looking up auxiliary property '*userPassword'
I0529 09:29:24.582875 18195 auxprop.cpp:153] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
I0529 09:29:24.582885 18195 auxprop.cpp:81] Request to lookup properties for user: 'test-principal'
realm: 'trusty' server FQDN: 'trusty' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE:
false SASL_AUXPROP_AUTHZID: true 
I0529 09:29:24.582892 18195 auxprop.cpp:103] Skipping auxiliary property '*userPassword' since
SASL_AUXPROP_AUTHZID == true
I0529 09:29:24.582898 18195 auxprop.cpp:103] Skipping auxiliary property '*cmusaslsecretCRAM-MD5'
since SASL_AUXPROP_AUTHZID == true
I0529 09:29:24.582911 18195 authenticator.hpp:376] Authentication success
I0529 09:29:24.582938 18195 authenticatee.hpp:305] Authentication success
I0529 09:29:24.582962 18195 master.cpp:3026] Successfully authenticated principal 'test-principal'
at slave(137)@127.0.1.1:48472
I0529 09:29:24.583034 18195 slave.cpp:669] Successfully authenticated with master master@127.0.1.1:48472
I0529 09:29:24.583108 18195 slave.cpp:902] Will retry registration in 13.697338ms if necessary
W0529 09:29:24.583199 18195 master.cpp:2417] Slave at slave(137)@127.0.1.1:48472 (trusty)
is being allowed to re-register with an already in use id (20140529-092924-16842879-48472-18175-0)
I0529 09:29:24.583282 18195 slave.cpp:753] Re-registered with master master@127.0.1.1:48472
I0529 09:29:24.583312 18195 slave.cpp:1514] Updating framework 20140529-092924-16842879-48472-18175-0000
pid to scheduler(130)@127.0.1.1:48472
I0529 09:29:24.583348 18195 slave.cpp:1522] Checkpointing framework pid 'scheduler(130)@127.0.1.1:48472'
to '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/framework.pid'
I0529 09:29:24.583536 18195 hierarchical_allocator_process.hpp:497] Slave 20140529-092924-16842879-48472-18175-0
reconnected
W0529 09:29:24.583560 18195 status_update_manager.cpp:181] Resending status update TASK_FAILED
(UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.583573 18195 status_update_manager.cpp:373] Forwarding status update TASK_FAILED
(UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000 to master@127.0.1.1:48472
W0529 09:29:24.583658 18195 master.cpp:2621] Could not lookup task for status update TASK_FAILED
(UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000 from slave 20140529-092924-16842879-48472-18175-0
at slave(137)@127.0.1.1:48472 (trusty)
I0529 09:29:24.584329 18198 sched.cpp:625] Scheduler::statusUpdate took 16101ns
tests/slave_recovery_tests.cpp:3068: Failure
Mock function called more times than expected - returning directly.
    Function call: statusUpdate(0x7fff69ebf410, @0x2ad9b40bc400 72-byte object <30-17 18-A5
D9-2A 00-00 00-00 00-00 00-00 00-00 90-31 03-B4 D9-2A 00-00 B0-B1 0F-B4 D9-2A 00-00 38-30
D1-00 00-00 00-00 A0-39 0C-B4 D9-2A 00-00 1B-13 63-6B 13-EB D4-41 03-00 00-00 00-00 00-00
37-00 00-00 00-00 00-00>)
         Expected: to be called once
           Actual: called twice - over-saturated and active
I0529 09:29:24.584525 18198 sched.cpp:625] Scheduler::statusUpdate took 121892ns
I0529 09:29:24.584609 18198 status_update_manager.cpp:398] Received status update acknowledgement
(UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.584632 18198 status_update_manager.hpp:342] Checkpointing ACK for status update
TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.585109 18194 sched.cpp:730] Stopping framework '20140529-092924-16842879-48472-18175-0000'
I0529 09:29:24.585196 18196 master.cpp:1272] Asked to unregister framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.585211 18196 master.cpp:3344] Removing framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.585304 18196 hierarchical_allocator_process.hpp:407] Deactivated framework
20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.585342 18196 hierarchical_allocator_process.hpp:362] Removed framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.585366 18196 slave.cpp:1337] Asked to shut down framework 20140529-092924-16842879-48472-18175-0000
by master@127.0.1.1:48472
I0529 09:29:24.585381 18196 slave.cpp:1362] Shutting down framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.585396 18196 slave.cpp:2552] Cleaning up executor '24a52e8f-0b30-4970-8829-a7383f19980e'
of framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.585631 18196 slave.cpp:2627] Cleaning up framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.585710 18196 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e/runs/7fbaa2b9-d41e-4d3e-897f-3d593b1308c7'
for gc 6.99999322304296days in the future
I0529 09:29:24.585954 18196 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e'
for gc 6.99999322254518days in the future
I0529 09:29:24.586061 18196 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e/runs/7fbaa2b9-d41e-4d3e-897f-3d593b1308c7'
for gc 6.99999322225778days in the future
I0529 09:29:24.586094 18196 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e'
for gc 6.99999322199111days in the future
I0529 09:29:24.586122 18196 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000'
for gc 6.99999322143407days in the future
I0529 09:29:24.586148 18196 gc.cpp:56] Scheduling '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000'
for gc 6.99999322119111days in the future
I0529 09:29:24.586599 18196 master.cpp:574] Master terminating
I0529 09:29:24.589107 18199 slave.cpp:2224] master@127.0.1.1:48472 exited
W0529 09:29:24.589130 18199 slave.cpp:2227] Master disconnected! Waiting for a new master
to be elected
I0529 09:29:24.589537 18198 status_update_manager.cpp:530] Cleaning up status update stream
for task 24a52e8f-0b30-4970-8829-a7383f19980e of framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.589920 18198 status_update_manager.cpp:398] Received status update acknowledgement
(UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.590104 18199 slave.cpp:1604] Status update manager successfully handled status
update acknowledgement (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000
E0529 09:29:24.590286 18199 slave.cpp:1615] Status update acknowledgement (UUID: aee0620e-c349-42ce-8cce-7b3835adad95)
for task 24a52e8f-0b30-4970-8829-a7383f19980e of unknown framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.590569 18198 status_update_manager.cpp:282] Closing status update streams for
framework 20140529-092924-16842879-48472-18175-0000
E0529 09:29:24.590759 18194 slave.cpp:1597] Failed to handle status update acknowledgement
(UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task 24a52e8f-0b30-4970-8829-a7383f19980e
of framework 20140529-092924-16842879-48472-18175-0000: Cannot find the status update stream
for task 24a52e8f-0b30-4970-8829-a7383f19980e of framework 20140529-092924-16842879-48472-18175-0000
I0529 09:29:24.592489 18175 slave.cpp:425] Slave terminating
[  FAILED  ] SlaveRecoveryTest/0.RestartBeforeContainerizerLaunch, where TypeParam = mesos::internal::slave::MesosContainerizer
(112 ms)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message