mesos-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dominic Hamon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MESOS-1513) FaultToleranceTest.SlaveReregisterTerminatedExecutor is flaky
Date Fri, 20 Jun 2014 21:34:25 GMT

     [ https://issues.apache.org/jira/browse/MESOS-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Dominic Hamon updated MESOS-1513:
---------------------------------

    Sprint:   (was: Q2 Sprint 4)

> FaultToleranceTest.SlaveReregisterTerminatedExecutor is flaky
> -------------------------------------------------------------
>
>                 Key: MESOS-1513
>                 URL: https://issues.apache.org/jira/browse/MESOS-1513
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Dominic Hamon
>
> {{noformat}}
> [ RUN      ] FaultToleranceTest.SlaveReregisterTerminatedExecutor
> Using temporary directory '/tmp/FaultToleranceTest_SlaveReregisterTerminatedExecutor_hyQUfB'
> I0618 19:47:59.099588 14356 leveldb.cpp:176] Opened db in 3.044283ms
> I0618 19:47:59.100831 14356 leveldb.cpp:183] Compacted db in 1.000609ms
> I0618 19:47:59.101204 14356 leveldb.cpp:198] Created db iterator in 5648ns
> I0618 19:47:59.101450 14356 leveldb.cpp:204] Seeked to beginning of db in 1360ns
> I0618 19:47:59.101721 14356 leveldb.cpp:273] Iterated through 0 keys in the db in 754ns
> I0618 19:47:59.102061 14356 replica.cpp:741] Replica recovered with log positions 0 ->
0 with 1 holes and 0 unlearned
> I0618 19:47:59.102519 14380 recover.cpp:425] Starting replica recovery
> I0618 19:47:59.102592 14380 recover.cpp:451] Replica is in EMPTY status
> I0618 19:47:59.103059 14380 replica.cpp:638] Replica in EMPTY status received a broadcasted
recover request
> I0618 19:47:59.103150 14380 recover.cpp:188] Received a recover response from a replica
in EMPTY status
> I0618 19:47:59.103260 14380 recover.cpp:542] Updating replica status to STARTING
> I0618 19:47:59.104123 14380 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb
took 805365ns
> I0618 19:47:59.104143 14380 replica.cpp:320] Persisted replica status to STARTING
> I0618 19:47:59.104204 14380 recover.cpp:451] Replica is in STARTING status
> I0618 19:47:59.104439 14380 replica.cpp:638] Replica in STARTING status received a broadcasted
recover request
> I0618 19:47:59.104490 14380 recover.cpp:188] Received a recover response from a replica
in STARTING status
> I0618 19:47:59.104579 14380 recover.cpp:542] Updating replica status to VOTING
> I0618 19:47:59.104918 14380 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb
took 289613ns
> I0618 19:47:59.104938 14380 replica.cpp:320] Persisted replica status to VOTING
> I0618 19:47:59.105159 14382 recover.cpp:556] Successfully joined the Paxos group
> I0618 19:47:59.105455 14379 recover.cpp:440] Recover process terminated
> I0618 19:47:59.113114 14377 master.cpp:286] Master 20140618-194759-16842879-51994-14356
(saucy) started on 127.0.1.1:51994
> I0618 19:47:59.113142 14377 master.cpp:323] Master only allowing authenticated frameworks
to register
> I0618 19:47:59.113148 14377 master.cpp:328] Master only allowing authenticated slaves
to register
> I0618 19:47:59.113155 14377 credentials.hpp:35] Loading credentials for authentication
from '/tmp/FaultToleranceTest_SlaveReregisterTerminatedExecutor_hyQUfB/credentials'
> I0618 19:47:59.113234 14377 master.cpp:354] Authorization enabled
> I0618 19:47:59.113674 14377 hierarchical_allocator_process.hpp:301] Initializing hierarchical
allocator process with master : master@127.0.1.1:51994
> I0618 19:47:59.113711 14377 master.cpp:120] No whitelist given. Advertising offers for
all slaves
> I0618 19:47:59.113859 14377 master.cpp:1077] The newly elected leader is master@127.0.1.1:51994
with id 20140618-194759-16842879-51994-14356
> I0618 19:47:59.113870 14377 master.cpp:1090] Elected as the leading master!
> I0618 19:47:59.113876 14377 master.cpp:908] Recovering from registrar
> I0618 19:47:59.113921 14377 registrar.cpp:313] Recovering registrar
> I0618 19:47:59.114109 14377 log.cpp:656] Attempting to start the writer
> I0618 19:47:59.114375 14377 replica.cpp:474] Replica received implicit promise request
with proposal 1
> I0618 19:47:59.115440 14377 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb
took 1.045169ms
> I0618 19:47:59.115787 14377 replica.cpp:342] Persisted promised to 1
> I0618 19:47:59.118927 14381 coordinator.cpp:230] Coordinator attemping to fill missing
position
> I0618 19:47:59.119233 14381 replica.cpp:375] Replica received explicit promise request
for position 0 with proposal 2
> I0618 19:47:59.120156 14381 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took
900874ns
> I0618 19:47:59.120177 14381 replica.cpp:676] Persisted action at 0
> I0618 19:47:59.120421 14381 replica.cpp:508] Replica received write request for position
0
> I0618 19:47:59.120450 14381 leveldb.cpp:438] Reading position from leveldb took 14206ns
> I0618 19:47:59.120762 14381 leveldb.cpp:343] Persisting action (14 bytes) to leveldb
took 295538ns
> I0618 19:47:59.122448 14381 replica.cpp:676] Persisted action at 0
> I0618 19:47:59.122895 14381 replica.cpp:655] Replica received learned notice for position
0
> I0618 19:47:59.123711 14381 leveldb.cpp:343] Persisting action (16 bytes) to leveldb
took 791464ns
> I0618 19:47:59.124047 14381 replica.cpp:676] Persisted action at 0
> I0618 19:47:59.124305 14381 replica.cpp:661] Replica learned NOP action at position 0
> I0618 19:47:59.124744 14381 log.cpp:672] Writer started with ending position 0
> I0618 19:47:59.125270 14381 leveldb.cpp:438] Reading position from leveldb took 12943ns
> I0618 19:47:59.134912 14381 registrar.cpp:346] Successfully fetched the registry (0B)
> I0618 19:47:59.134948 14381 registrar.cpp:422] Attempting to update the 'registry'
> I0618 19:47:59.136394 14378 log.cpp:680] Attempting to append 118 bytes to the log
> I0618 19:47:59.136442 14378 coordinator.cpp:340] Coordinator attempting to write APPEND
action at position 1
> I0618 19:47:59.136651 14378 replica.cpp:508] Replica received write request for position
1
> I0618 19:47:59.137675 14378 leveldb.cpp:343] Persisting action (135 bytes) to leveldb
took 1.001689ms
> I0618 19:47:59.137712 14378 replica.cpp:676] Persisted action at 1
> I0618 19:47:59.137883 14378 replica.cpp:655] Replica received learned notice for position
1
> I0618 19:47:59.138236 14378 leveldb.cpp:343] Persisting action (137 bytes) to leveldb
took 334245ns
> I0618 19:47:59.138255 14378 replica.cpp:676] Persisted action at 1
> I0618 19:47:59.138262 14378 replica.cpp:661] Replica learned APPEND action at position
1
> I0618 19:47:59.138502 14378 registrar.cpp:479] Successfully updated 'registry'
> I0618 19:47:59.138538 14378 registrar.cpp:372] Successfully recovered registrar
> I0618 19:47:59.138574 14378 log.cpp:699] Attempting to truncate the log to 1
> I0618 19:47:59.138761 14377 coordinator.cpp:340] Coordinator attempting to write TRUNCATE
action at position 2
> I0618 19:47:59.138967 14377 replica.cpp:508] Replica received write request for position
2
> I0618 19:47:59.140679 14377 leveldb.cpp:343] Persisting action (16 bytes) to leveldb
took 1.693684ms
> I0618 19:47:59.140707 14377 replica.cpp:676] Persisted action at 2
> I0618 19:47:59.140858 14377 replica.cpp:655] Replica received learned notice for position
2
> I0618 19:47:59.141084 14377 leveldb.cpp:343] Persisting action (18 bytes) to leveldb
took 208545ns
> I0618 19:47:59.141221 14377 leveldb.cpp:401] Deleting ~1 keys from leveldb took 18933ns
> I0618 19:47:59.141233 14377 replica.cpp:676] Persisted action at 2
> I0618 19:47:59.141240 14377 replica.cpp:661] Replica learned TRUNCATE action at position
2
> I0618 19:47:59.142946 14377 slave.cpp:143] Slave started on 159)@127.0.1.1:51994
> I0618 19:47:59.142966 14377 credentials.hpp:35] Loading credentials for authentication
from '/tmp/FaultToleranceTest_SlaveReregisterTerminatedExecutor_vvem0g/credential'
> I0618 19:47:59.143029 14377 slave.cpp:243] Slave using credential for: test-principal
> I0618 19:47:59.143143 14377 slave.cpp:256] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000]
> I0618 19:47:59.143211 14377 slave.cpp:284] Slave hostname: saucy
> I0618 19:47:59.143223 14377 slave.cpp:285] Slave checkpoint: false
> I0618 19:47:59.143540 14377 state.cpp:33] Recovering state from '/tmp/FaultToleranceTest_SlaveReregisterTerminatedExecutor_vvem0g/meta'
> I0618 19:47:59.143622 14377 status_update_manager.cpp:193] Recovering status update manager
> I0618 19:47:59.143692 14377 slave.cpp:3068] Finished recovery
> I0618 19:47:59.143914 14377 slave.cpp:537] New master detected at master@127.0.1.1:51994
> I0618 19:47:59.143939 14377 slave.cpp:613] Authenticating with master master@127.0.1.1:51994
> I0618 19:47:59.143975 14377 slave.cpp:586] Detecting new master
> I0618 19:47:59.144003 14377 status_update_manager.cpp:167] New master detected at master@127.0.1.1:51994
> I0618 19:47:59.144034 14377 authenticatee.hpp:128] Creating new client SASL connection
> I0618 19:47:59.138634 14378 master.cpp:935] Recovered 0 slaves from the Registry (82B)
; allowing 10mins for slaves to re-register
> I0618 19:47:59.144620 14378 master.cpp:3457] Authenticating slave(159)@127.0.1.1:51994
> I0618 19:47:59.144870 14378 authenticator.hpp:156] Creating new server SASL connection
> I0618 19:47:59.144981 14378 authenticatee.hpp:219] Received SASL authentication mechanisms:
CRAM-MD5
> I0618 19:47:59.144999 14378 authenticatee.hpp:245] Attempting to authenticate with mechanism
'CRAM-MD5'
> I0618 19:47:59.145021 14378 authenticator.hpp:262] Received SASL authentication start
> I0618 19:47:59.145056 14378 authenticator.hpp:384] Authentication requires more steps
> I0618 19:47:59.145081 14378 authenticatee.hpp:265] Received SASL authentication step
> I0618 19:47:59.145112 14378 authenticator.hpp:290] Received SASL authentication step
> I0618 19:47:59.145128 14378 auxprop.cpp:81] Request to lookup properties for user: 'test-principal'
realm: 'saucy' server FQDN: 'saucy' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE:
false SASL_AUXPROP_AUTHZID: false 
> I0618 19:47:59.145135 14378 auxprop.cpp:153] Looking up auxiliary property '*userPassword'
> I0618 19:47:59.145145 14378 auxprop.cpp:153] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
> I0618 19:47:59.145154 14378 auxprop.cpp:81] Request to lookup properties for user: 'test-principal'
realm: 'saucy' server FQDN: 'saucy' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE:
false SASL_AUXPROP_AUTHZID: true 
> I0618 19:47:59.145159 14378 auxprop.cpp:103] Skipping auxiliary property '*userPassword'
since SASL_AUXPROP_AUTHZID == true
> I0618 19:47:59.145164 14378 auxprop.cpp:103] Skipping auxiliary property '*cmusaslsecretCRAM-MD5'
since SASL_AUXPROP_AUTHZID == true
> I0618 19:47:59.145175 14378 authenticator.hpp:376] Authentication success
> I0618 19:47:59.145200 14378 authenticatee.hpp:305] Authentication success
> I0618 19:47:59.145218 14378 master.cpp:3497] Successfully authenticated principal 'test-principal'
at slave(159)@127.0.1.1:51994
> I0618 19:47:59.145282 14378 slave.cpp:670] Successfully authenticated with master master@127.0.1.1:51994
> I0618 19:47:59.145320 14378 slave.cpp:908] Will retry registration in 8.965031ms if necessary
> I0618 19:47:59.145618 14378 master.cpp:2736] Registering slave at slave(159)@127.0.1.1:51994
(saucy) with id 20140618-194759-16842879-51994-14356-0
> I0618 19:47:59.145982 14378 registrar.cpp:422] Attempting to update the 'registry'
> I0618 19:47:59.149333 14380 log.cpp:680] Attempting to append 289 bytes to the log
> I0618 19:47:59.149391 14380 coordinator.cpp:340] Coordinator attempting to write APPEND
action at position 3
> I0618 19:47:59.149591 14380 replica.cpp:508] Replica received write request for position
3
> I0618 19:47:59.150043 14380 leveldb.cpp:343] Persisting action (308 bytes) to leveldb
took 431495ns
> I0618 19:47:59.150063 14380 replica.cpp:676] Persisted action at 3
> I0618 19:47:59.150252 14380 replica.cpp:655] Replica received learned notice for position
3
> I0618 19:47:59.151007 14380 leveldb.cpp:343] Persisting action (310 bytes) to leveldb
took 683115ns
> I0618 19:47:59.151027 14380 replica.cpp:676] Persisted action at 3
> I0618 19:47:59.151036 14380 replica.cpp:661] Replica learned APPEND action at position
3
> I0618 19:47:59.151341 14380 registrar.cpp:479] Successfully updated 'registry'
> I0618 19:47:59.151407 14380 log.cpp:699] Attempting to truncate the log to 3
> I0618 19:47:59.151455 14380 master.cpp:2776] Registered slave 20140618-194759-16842879-51994-14356-0
at slave(159)@127.0.1.1:51994 (saucy)
> I0618 19:47:59.151470 14380 master.cpp:3923] Adding slave 20140618-194759-16842879-51994-14356-0
at slave(159)@127.0.1.1:51994 (saucy) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I0618 19:47:59.151561 14380 coordinator.cpp:340] Coordinator attempting to write TRUNCATE
action at position 4
> I0618 19:47:59.151623 14380 slave.cpp:704] Registered with master master@127.0.1.1:51994;
given slave ID 20140618-194759-16842879-51994-14356-0
> I0618 19:47:59.151691 14380 hierarchical_allocator_process.hpp:444] Added slave 20140618-194759-16842879-51994-14356-0
(saucy) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000] available)
> I0618 19:47:59.151726 14380 hierarchical_allocator_process.hpp:706] Performed allocation
for slave 20140618-194759-16842879-51994-14356-0 in 7963ns
> I0618 19:47:59.151861 14380 replica.cpp:508] Replica received write request for position
4
> I0618 19:47:59.152423 14356 sched.cpp:139] Version: 0.20.0
> I0618 19:47:59.152597 14380 leveldb.cpp:343] Persisting action (16 bytes) to leveldb
took 718143ns
> I0618 19:47:59.153003 14380 replica.cpp:676] Persisted action at 4
> I0618 19:47:59.156749 14377 replica.cpp:655] Replica received learned notice for position
4
> I0618 19:47:59.157306 14379 sched.cpp:235] New master detected at master@127.0.1.1:51994
> I0618 19:47:59.157331 14379 sched.cpp:285] Authenticating with master master@127.0.1.1:51994
> I0618 19:47:59.157436 14379 authenticatee.hpp:128] Creating new client SASL connection
> I0618 19:47:59.157707 14379 master.cpp:3457] Authenticating scheduler-be3b06f5-f171-4a1b-be80-e364044f4856@127.0.1.1:51994
> I0618 19:47:59.157793 14379 authenticator.hpp:156] Creating new server SASL connection
> I0618 19:47:59.157899 14379 authenticatee.hpp:219] Received SASL authentication mechanisms:
CRAM-MD5
> I0618 19:47:59.157917 14379 authenticatee.hpp:245] Attempting to authenticate with mechanism
'CRAM-MD5'
> I0618 19:47:59.157940 14379 authenticator.hpp:262] Received SASL authentication start
> I0618 19:47:59.157982 14379 authenticator.hpp:384] Authentication requires more steps
> I0618 19:47:59.158007 14379 authenticatee.hpp:265] Received SASL authentication step
> I0618 19:47:59.158041 14379 authenticator.hpp:290] Received SASL authentication step
> I0618 19:47:59.158056 14379 auxprop.cpp:81] Request to lookup properties for user: 'test-principal'
realm: 'saucy' server FQDN: 'saucy' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE:
false SASL_AUXPROP_AUTHZID: false 
> I0618 19:47:59.158062 14379 auxprop.cpp:153] Looking up auxiliary property '*userPassword'
> I0618 19:47:59.158072 14379 auxprop.cpp:153] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
> I0618 19:47:59.158081 14379 auxprop.cpp:81] Request to lookup properties for user: 'test-principal'
realm: 'saucy' server FQDN: 'saucy' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE:
false SASL_AUXPROP_AUTHZID: true 
> I0618 19:47:59.158087 14379 auxprop.cpp:103] Skipping auxiliary property '*userPassword'
since SASL_AUXPROP_AUTHZID == true
> I0618 19:47:59.158092 14379 auxprop.cpp:103] Skipping auxiliary property '*cmusaslsecretCRAM-MD5'
since SASL_AUXPROP_AUTHZID == true
> I0618 19:47:59.158100 14379 authenticator.hpp:376] Authentication success
> I0618 19:47:59.158125 14379 authenticatee.hpp:305] Authentication success
> I0618 19:47:59.158145 14379 master.cpp:3497] Successfully authenticated principal 'test-principal'
at scheduler-be3b06f5-f171-4a1b-be80-e364044f4856@127.0.1.1:51994
> I0618 19:47:59.158210 14379 sched.cpp:359] Successfully authenticated with master master@127.0.1.1:51994
> I0618 19:47:59.158223 14379 sched.cpp:478] Sending registration request to master@127.0.1.1:51994
> I0618 19:47:59.158265 14379 master.cpp:1196] Received registration request from scheduler-be3b06f5-f171-4a1b-be80-e364044f4856@127.0.1.1:51994
> I0618 19:47:59.158282 14379 master.cpp:1156] Authorizing framework principal 'test-principal'
to receive offers for role '*'
> I0618 19:47:59.158370 14379 master.cpp:1255] Registering framework 20140618-194759-16842879-51994-14356-0000
at scheduler-be3b06f5-f171-4a1b-be80-e364044f4856@127.0.1.1:51994
> I0618 19:47:59.158449 14379 sched.cpp:409] Framework registered with 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.158471 14379 sched.cpp:423] Scheduler::registered took 9823ns
> I0618 19:47:59.158503 14379 hierarchical_allocator_process.hpp:331] Added framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.158543 14379 hierarchical_allocator_process.hpp:750] Offering cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20140618-194759-16842879-51994-14356-0
to framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.158633 14379 hierarchical_allocator_process.hpp:686] Performed allocation
for 1 slaves in 120599ns
> I0618 19:47:59.158941 14377 leveldb.cpp:343] Persisting action (18 bytes) to leveldb
took 2.166488ms
> I0618 19:47:59.159124 14377 leveldb.cpp:401] Deleting ~2 keys from leveldb took 52640ns
> I0618 19:47:59.159147 14377 replica.cpp:676] Persisted action at 4
> I0618 19:47:59.159162 14377 replica.cpp:661] Replica learned TRUNCATE action at position
4
> I0618 19:47:59.158710 14379 master.hpp:785] Adding offer 20140618-194759-16842879-51994-14356-0
with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20140618-194759-16842879-51994-14356-0
(saucy)
> I0618 19:47:59.159503 14379 master.cpp:3404] Sending 1 offers to framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.159909 14379 sched.cpp:546] Scheduler::resourceOffers took 299955ns
> I0618 19:47:59.160341 14379 master.hpp:795] Removing offer 20140618-194759-16842879-51994-14356-0
with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20140618-194759-16842879-51994-14356-0
(saucy)
> I0618 19:47:59.160681 14379 master.cpp:2080] Processing reply for offers: [ 20140618-194759-16842879-51994-14356-0
] on slave 20140618-194759-16842879-51994-14356-0 at slave(159)@127.0.1.1:51994 (saucy) for
framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.160951 14379 master.cpp:2166] Authorizing framework principal 'test-principal'
to launch task 0 as user 'jenkins'
> I0618 19:47:59.161492 14379 master.hpp:757] Adding task 0 with resources cpus(*):1; mem(*):512
on slave 20140618-194759-16842879-51994-14356-0 (saucy)
> I0618 19:47:59.161831 14379 master.cpp:2232] Launching task 0 of framework 20140618-194759-16842879-51994-14356-0000
with resources cpus(*):1; mem(*):512 on slave 20140618-194759-16842879-51994-14356-0 at slave(159)@127.0.1.1:51994
(saucy)
> I0618 19:47:59.162240 14383 slave.cpp:939] Got assigned task 0 for framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.162420 14383 slave.cpp:1049] Launching task 0 for framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.165210 14383 exec.cpp:131] Version: 0.20.0
> I0618 19:47:59.165776 14384 exec.cpp:181] Executor started at: executor(39)@127.0.1.1:51994
with pid 14356
> I0618 19:47:59.166041 14383 slave.cpp:1159] Queuing task '0' for executor default of
framework '20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.166225 14384 hierarchical_allocator_process.hpp:546] Framework 20140618-194759-16842879-51994-14356-0000
left cpus(*):1; mem(*):512; disk(*):1024; ports(*):[31000-32000] unused on slave 20140618-194759-16842879-51994-14356-0
> I0618 19:47:59.166496 14384 hierarchical_allocator_process.hpp:588] Framework 20140618-194759-16842879-51994-14356-0000
filtered slave 20140618-194759-16842879-51994-14356-0 for 5secs
> I0618 19:47:59.166705 14383 slave.cpp:498] Successfully attached file '/tmp/FaultToleranceTest_SlaveReregisterTerminatedExecutor_vvem0g/slaves/20140618-194759-16842879-51994-14356-0/frameworks/20140618-194759-16842879-51994-14356-0000/executors/default/runs/467f509d-2c84-49cb-bf50-105b9526d194'
> I0618 19:47:59.167057 14383 slave.cpp:1670] Got registration for executor 'default' of
framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.167529 14378 exec.cpp:205] Executor registered on slave 20140618-194759-16842879-51994-14356-0
> I0618 19:47:59.168632 14378 exec.cpp:217] Executor::registered took 17363ns
> I0618 19:47:59.168958 14383 slave.cpp:1789] Flushing queued task 0 for executor 'default'
of framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.169570 14382 exec.cpp:292] Executor asked to run task '0'
> I0618 19:47:59.169610 14382 exec.cpp:301] Executor::launchTask took 22314ns
> I0618 19:47:59.170557 14382 exec.cpp:524] Executor sending status update TASK_RUNNING
(UUID: c548cdab-a25b-46d5-9397-21a32a6b2677) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.170912 14383 slave.cpp:2405] Monitoring executor 'default' of framework
'20140618-194759-16842879-51994-14356-0000' in container '467f509d-2c84-49cb-bf50-105b9526d194'
> I0618 19:47:59.171341 14383 slave.cpp:2024] Handling status update TASK_RUNNING (UUID:
c548cdab-a25b-46d5-9397-21a32a6b2677) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
from executor(39)@127.0.1.1:51994
> I0618 19:47:59.171689 14383 status_update_manager.cpp:320] Received status update TASK_RUNNING
(UUID: c548cdab-a25b-46d5-9397-21a32a6b2677) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.171977 14383 status_update_manager.cpp:499] Creating StatusUpdate stream
for task 0 of framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.172199 14383 status_update_manager.cpp:373] Forwarding status update TASK_RUNNING
(UUID: c548cdab-a25b-46d5-9397-21a32a6b2677) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
to master@127.0.1.1:51994
> I0618 19:47:59.172335 14383 master.cpp:3062] Status update TASK_RUNNING (UUID: c548cdab-a25b-46d5-9397-21a32a6b2677)
for task 0 of framework 20140618-194759-16842879-51994-14356-0000 from slave 20140618-194759-16842879-51994-14356-0
at slave(159)@127.0.1.1:51994 (saucy)
> I0618 19:47:59.172369 14383 slave.cpp:2182] Status update manager successfully handled
status update TASK_RUNNING (UUID: c548cdab-a25b-46d5-9397-21a32a6b2677) for task 0 of framework
20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.172379 14383 slave.cpp:2188] Sending acknowledgement for status update
TASK_RUNNING (UUID: c548cdab-a25b-46d5-9397-21a32a6b2677) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
to executor(39)@127.0.1.1:51994
> I0618 19:47:59.172449 14383 sched.cpp:637] Scheduler::statusUpdate took 16331ns
> I0618 19:47:59.172487 14383 exec.cpp:338] Executor received status update acknowledgement
c548cdab-a25b-46d5-9397-21a32a6b2677 for task 0 of framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.172541 14383 master.cpp:2586] Forwarding status update acknowledgement
c548cdab-a25b-46d5-9397-21a32a6b2677 for task 0 of framework 20140618-194759-16842879-51994-14356-0000
to slave 20140618-194759-16842879-51994-14356-0 at slave(159)@127.0.1.1:51994 (saucy)
> I0618 19:47:59.172654 14383 status_update_manager.cpp:398] Received status update acknowledgement
(UUID: c548cdab-a25b-46d5-9397-21a32a6b2677) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.173964 14384 exec.cpp:524] Executor sending status update TASK_FINISHED
(UUID: 3ab0d956-31f3-411b-b226-05aaa873756f) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.174052 14384 slave.cpp:2024] Handling status update TASK_FINISHED (UUID:
3ab0d956-31f3-411b-b226-05aaa873756f) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
from executor(39)@127.0.1.1:51994
> I0618 19:47:59.174074 14384 slave.cpp:3710] Terminating task 0
> I0618 19:47:59.174628 14377 slave.cpp:1610] Status update manager successfully handled
status update acknowledgement (UUID: c548cdab-a25b-46d5-9397-21a32a6b2677) for task 0 of framework
20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.174855 14383 status_update_manager.cpp:320] Received status update TASK_FINISHED
(UUID: 3ab0d956-31f3-411b-b226-05aaa873756f) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.175153 14383 status_update_manager.cpp:373] Forwarding status update TASK_FINISHED
(UUID: 3ab0d956-31f3-411b-b226-05aaa873756f) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
to master@127.0.1.1:51994
> I0618 19:47:59.175921 14384 slave.cpp:2463] Executor 'default' of framework 20140618-194759-16842879-51994-14356-0000
has exited with status 0
> I0618 19:47:59.176041 14384 master.cpp:3138] Executor default of framework 20140618-194759-16842879-51994-14356-0000
on slave 20140618-194759-16842879-51994-14356-0 at slave(159)@127.0.1.1:51994 (saucy) has
exited with status 0
> I0618 19:47:59.177268 14380 slave.cpp:2182] Status update manager successfully handled
status update TASK_FINISHED (UUID: 3ab0d956-31f3-411b-b226-05aaa873756f) for task 0 of framework
20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.177288 14380 slave.cpp:2188] Sending acknowledgement for status update
TASK_FINISHED (UUID: 3ab0d956-31f3-411b-b226-05aaa873756f) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
to executor(39)@127.0.1.1:51994
> I0618 19:47:59.177634 14377 slave.cpp:537] New master detected at master@127.0.1.1:51994
> I0618 19:47:59.177661 14377 slave.cpp:613] Authenticating with master master@127.0.1.1:51994
> I0618 19:47:59.177700 14377 slave.cpp:586] Detecting new master
> I0618 19:47:59.177737 14377 status_update_manager.cpp:167] New master detected at master@127.0.1.1:51994
> W0618 19:47:59.177749 14377 status_update_manager.cpp:181] Resending status update TASK_FINISHED
(UUID: 3ab0d956-31f3-411b-b226-05aaa873756f) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.177758 14377 status_update_manager.cpp:373] Forwarding status update TASK_FINISHED
(UUID: 3ab0d956-31f3-411b-b226-05aaa873756f) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
to master@127.0.1.1:51994
> I0618 19:47:59.177813 14377 authenticatee.hpp:128] Creating new client SASL connection
> I0618 19:47:59.177983 14377 master.cpp:3062] Status update TASK_FINISHED (UUID: 3ab0d956-31f3-411b-b226-05aaa873756f)
for task 0 of framework 20140618-194759-16842879-51994-14356-0000 from slave 20140618-194759-16842879-51994-14356-0
at slave(159)@127.0.1.1:51994 (saucy)
> I0618 19:47:59.178027 14377 master.hpp:775] Removing task 0 with resources cpus(*):1;
mem(*):512 on slave 20140618-194759-16842879-51994-14356-0 (saucy)
> I0618 19:47:59.178112 14377 master.cpp:1557] Disconnecting slave 20140618-194759-16842879-51994-14356-0
> I0618 19:47:59.178133 14377 master.cpp:3457] Authenticating slave(159)@127.0.1.1:51994
> I0618 19:47:59.178231 14377 sched.cpp:637] Scheduler::statusUpdate took 9955ns
> I0618 19:47:59.178295 14377 hierarchical_allocator_process.hpp:635] Recovered cpus(*):1;
mem(*):512 (total allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000])
on slave 20140618-194759-16842879-51994-14356-0 from framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.178326 14377 hierarchical_allocator_process.hpp:483] Slave 20140618-194759-16842879-51994-14356-0
disconnected
> I0618 19:47:59.178352 14377 authenticator.hpp:156] Creating new server SASL connection
> W0618 19:47:59.178462 14377 master.cpp:2578] Cannot send status update acknowledgement
message for task 0 of framework 20140618-194759-16842879-51994-14356-0000 to slave 20140618-194759-16842879-51994-14356-0
at slave(159)@127.0.1.1:51994 (saucy) because slave is disconnected
> I0618 19:47:59.178495 14377 authenticatee.hpp:219] Received SASL authentication mechanisms:
CRAM-MD5
> I0618 19:47:59.178509 14377 authenticatee.hpp:245] Attempting to authenticate with mechanism
'CRAM-MD5'
> I0618 19:47:59.178531 14377 authenticator.hpp:262] Received SASL authentication start
> I0618 19:47:59.178567 14377 authenticator.hpp:384] Authentication requires more steps
> I0618 19:47:59.178592 14377 authenticatee.hpp:265] Received SASL authentication step
> I0618 19:47:59.178624 14377 authenticator.hpp:290] Received SASL authentication step
> I0618 19:47:59.178645 14377 auxprop.cpp:81] Request to lookup properties for user: 'test-principal'
realm: 'saucy' server FQDN: 'saucy' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE:
false SASL_AUXPROP_AUTHZID: false 
> I0618 19:47:59.178653 14377 auxprop.cpp:153] Looking up auxiliary property '*userPassword'
> I0618 19:47:59.178664 14377 auxprop.cpp:153] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
> I0618 19:47:59.178673 14377 auxprop.cpp:81] Request to lookup properties for user: 'test-principal'
realm: 'saucy' server FQDN: 'saucy' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE:
false SASL_AUXPROP_AUTHZID: true 
> I0618 19:47:59.178678 14377 auxprop.cpp:103] Skipping auxiliary property '*userPassword'
since SASL_AUXPROP_AUTHZID == true
> I0618 19:47:59.178683 14377 auxprop.cpp:103] Skipping auxiliary property '*cmusaslsecretCRAM-MD5'
since SASL_AUXPROP_AUTHZID == true
> I0618 19:47:59.178694 14377 authenticator.hpp:376] Authentication success
> I0618 19:47:59.178719 14377 authenticatee.hpp:305] Authentication success
> I0618 19:47:59.178858 14377 slave.cpp:670] Successfully authenticated with master master@127.0.1.1:51994
> I0618 19:47:59.178920 14377 slave.cpp:908] Will retry registration in 2.836978ms if necessary
> I0618 19:47:59.178797 14382 master.cpp:3497] Successfully authenticated principal 'test-principal'
at slave(159)@127.0.1.1:51994
> W0618 19:47:59.179344 14382 master.cpp:2851] Slave at slave(159)@127.0.1.1:51994 (saucy)
is being allowed to re-register with an already in use id (20140618-194759-16842879-51994-14356-0)
> I0618 19:47:59.179441 14382 hierarchical_allocator_process.hpp:497] Slave 20140618-194759-16842879-51994-14356-0
reconnected
> I0618 19:47:59.179579 14377 slave.cpp:754] Re-registered with master master@127.0.1.1:51994
> I0618 19:47:59.179607 14377 slave.cpp:1520] Updating framework 20140618-194759-16842879-51994-14356-0000
pid to scheduler-be3b06f5-f171-4a1b-be80-e364044f4856@127.0.1.1:51994
> W0618 19:47:59.179898 14377 status_update_manager.cpp:181] Resending status update TASK_FINISHED
(UUID: 3ab0d956-31f3-411b-b226-05aaa873756f) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.179919 14377 status_update_manager.cpp:373] Forwarding status update TASK_FINISHED
(UUID: 3ab0d956-31f3-411b-b226-05aaa873756f) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
to master@127.0.1.1:51994
> W0618 19:47:59.180007 14377 master.cpp:3055] Could not lookup task for status update
TASK_FINISHED (UUID: 3ab0d956-31f3-411b-b226-05aaa873756f) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
from slave 20140618-194759-16842879-51994-14356-0 at slave(159)@127.0.1.1:51994 (saucy)
> tests/fault_tolerance_tests.cpp:1933: Failure
> Mock function called more times than expected - returning directly.
>     Function call: statusUpdate(0x7fffb181c620, @0x2b03bc086d10 80-byte object <F0-5C
65-9E 03-2B 00-00 00-00 00-00 00-00 00-00 E0-44 0A-BC 03-2B 00-00 38-C1 DA-00 00-00 00-00
38-C1 DA-00 00-00 00-00 30-2E 08-BC 03-2B 00-00 02-00 00-00 00-00 00-00 00-00 00-00 00-00
00-00 40-13 4B-00 E5-EA D4-41 00-00 00-00 53-00 00-00>)
>          Expected: to be called once
>            Actual: called twice - over-saturated and active
> I0618 19:47:59.180203 14377 sched.cpp:637] Scheduler::statusUpdate took 144974ns
> I0618 19:47:59.180250 14377 master.cpp:2586] Forwarding status update acknowledgement
3ab0d956-31f3-411b-b226-05aaa873756f for task 0 of framework 20140618-194759-16842879-51994-14356-0000
to slave 20140618-194759-16842879-51994-14356-0 at slave(159)@127.0.1.1:51994 (saucy)
> I0618 19:47:59.180307 14377 status_update_manager.cpp:398] Received status update acknowledgement
(UUID: 3ab0d956-31f3-411b-b226-05aaa873756f) for task 0 of framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.180330 14377 status_update_manager.cpp:530] Cleaning up status update
stream for task 0 of framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.180393 14377 slave.cpp:1610] Status update manager successfully handled
status update acknowledgement (UUID: 3ab0d956-31f3-411b-b226-05aaa873756f) for task 0 of framework
20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.180408 14377 slave.cpp:3752] Completing task 0
> I0618 19:47:59.180418 14377 slave.cpp:2602] Cleaning up executor 'default' of framework
20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.180711 14382 gc.cpp:56] Scheduling '/tmp/FaultToleranceTest_SlaveReregisterTerminatedExecutor_vvem0g/slaves/20140618-194759-16842879-51994-14356-0/frameworks/20140618-194759-16842879-51994-14356-0000/executors/default/runs/467f509d-2c84-49cb-bf50-105b9526d194'
for gc 6.99999791125037days in the future
> I0618 19:47:59.180878 14382 gc.cpp:56] Scheduling '/tmp/FaultToleranceTest_SlaveReregisterTerminatedExecutor_vvem0g/slaves/20140618-194759-16842879-51994-14356-0/frameworks/20140618-194759-16842879-51994-14356-0000/executors/default'
for gc 6.99999791082074days in the future
> I0618 19:47:59.181033 14377 slave.cpp:2677] Cleaning up framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.181128 14377 status_update_manager.cpp:282] Closing status update streams
for framework 20140618-194759-16842879-51994-14356-0000
> I0618 19:47:59.181146 14377 gc.cpp:56] Scheduling '/tmp/FaultToleranceTest_SlaveReregisterTerminatedExecutor_vvem0g/slaves/20140618-194759-16842879-51994-14356-0/frameworks/20140618-194759-16842879-51994-14356-0000'
for gc 6.99999790392296days in the future
> I0618 19:47:59.181380 14356 master.cpp:610] Master terminating
> I0618 19:47:59.181586 14379 sched.cpp:747] Stopping framework '20140618-194759-16842879-51994-14356-0000'
> I0618 19:47:59.188258 14379 slave.cpp:2267] master@127.0.1.1:51994 exited
> W0618 19:47:59.188284 14379 slave.cpp:2270] Master disconnected! Waiting for a new master
to be elected
> I0618 19:47:59.190114 14356 slave.cpp:426] Slave terminating
> [  FAILED  ] FaultToleranceTest.SlaveReregisterTerminatedExecutor (108 ms)
> {{noformat}}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message