mesos-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benjamin Mahler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-2228) SlaveTest.MesosExecutorGracefulShutdown is flaky
Date Tue, 03 Feb 2015 17:45:34 GMT

    [ https://issues.apache.org/jira/browse/MESOS-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303658#comment-14303658
] 

Benjamin Mahler commented on MESOS-2228:
----------------------------------------

That's ok, I have some cleanups that I'll send your way as well. But a test that can catch
this would be great!

> SlaveTest.MesosExecutorGracefulShutdown is flaky
> ------------------------------------------------
>
>                 Key: MESOS-2228
>                 URL: https://issues.apache.org/jira/browse/MESOS-2228
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.22.0
>            Reporter: Vinod Kone
>            Assignee: Benjamin Mahler
>              Labels: twitter
>             Fix For: 0.22.0
>
>
> Observed this on internal CI
> {noformat}
> [ RUN      ] SlaveTest.MesosExecutorGracefulShutdown
> Using temporary directory '/tmp/SlaveTest_MesosExecutorGracefulShutdown_AWdtVJ'
> I0124 08:14:04.399211  7926 leveldb.cpp:176] Opened db in 27.364056ms
> I0124 08:14:04.402632  7926 leveldb.cpp:183] Compacted db in 3.357646ms
> I0124 08:14:04.402691  7926 leveldb.cpp:198] Created db iterator in 23822ns
> I0124 08:14:04.402708  7926 leveldb.cpp:204] Seeked to beginning of db in 1913ns
> I0124 08:14:04.402716  7926 leveldb.cpp:273] Iterated through 0 keys in the db in 458ns
> I0124 08:14:04.402767  7926 replica.cpp:744] Replica recovered with log positions 0 ->
0 with 1 holes and 0 unlearned
> I0124 08:14:04.403728  7951 recover.cpp:449] Starting replica recovery
> I0124 08:14:04.404011  7951 recover.cpp:475] Replica is in EMPTY status
> I0124 08:14:04.407765  7950 replica.cpp:641] Replica in EMPTY status received a broadcasted
recover request
> I0124 08:14:04.408710  7951 recover.cpp:195] Received a recover response from a replica
in EMPTY status
> I0124 08:14:04.419666  7951 recover.cpp:566] Updating replica status to STARTING
> I0124 08:14:04.429719  7953 master.cpp:262] Master 20150124-081404-16842879-47787-7926
(utopic) started on 127.0.1.1:47787
> I0124 08:14:04.429790  7953 master.cpp:308] Master only allowing authenticated frameworks
to register
> I0124 08:14:04.429802  7953 master.cpp:313] Master only allowing authenticated slaves
to register
> I0124 08:14:04.429826  7953 credentials.hpp:36] Loading credentials for authentication
from '/tmp/SlaveTest_MesosExecutorGracefulShutdown_AWdtVJ/credentials'
> I0124 08:14:04.430277  7953 master.cpp:357] Authorization enabled
> I0124 08:14:04.432682  7953 master.cpp:1219] The newly elected leader is master@127.0.1.1:47787
with id 20150124-081404-16842879-47787-7926
> I0124 08:14:04.432816  7953 master.cpp:1232] Elected as the leading master!
> I0124 08:14:04.432894  7953 master.cpp:1050] Recovering from registrar
> I0124 08:14:04.433212  7950 registrar.cpp:313] Recovering registrar
> I0124 08:14:04.434226  7951 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb
took 14.323302ms
> I0124 08:14:04.434270  7951 replica.cpp:323] Persisted replica status to STARTING
> I0124 08:14:04.434489  7951 recover.cpp:475] Replica is in STARTING status
> I0124 08:14:04.436164  7951 replica.cpp:641] Replica in STARTING status received a broadcasted
recover request
> I0124 08:14:04.439368  7947 recover.cpp:195] Received a recover response from a replica
in STARTING status
> I0124 08:14:04.440626  7947 recover.cpp:566] Updating replica status to VOTING
> I0124 08:14:04.443667  7947 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb
took 2.698664ms
> I0124 08:14:04.443759  7947 replica.cpp:323] Persisted replica status to VOTING
> I0124 08:14:04.443925  7947 recover.cpp:580] Successfully joined the Paxos group
> I0124 08:14:04.444160  7947 recover.cpp:464] Recover process terminated
> I0124 08:14:04.444543  7949 log.cpp:660] Attempting to start the writer
> I0124 08:14:04.446331  7949 replica.cpp:477] Replica received implicit promise request
with proposal 1
> I0124 08:14:04.449329  7949 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb
took 2.690453ms
> I0124 08:14:04.449388  7949 replica.cpp:345] Persisted promised to 1
> I0124 08:14:04.450637  7947 coordinator.cpp:230] Coordinator attemping to fill missing
position
> I0124 08:14:04.452271  7949 replica.cpp:378] Replica received explicit promise request
for position 0 with proposal 2
> I0124 08:14:04.455124  7949 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took
2.593522ms
> I0124 08:14:04.455157  7949 replica.cpp:679] Persisted action at 0
> I0124 08:14:04.456594  7951 replica.cpp:511] Replica received write request for position
0
> I0124 08:14:04.456657  7951 leveldb.cpp:438] Reading position from leveldb took 30358ns
> I0124 08:14:04.464860  7951 leveldb.cpp:343] Persisting action (14 bytes) to leveldb
took 8.164646ms
> I0124 08:14:04.464903  7951 replica.cpp:679] Persisted action at 0
> I0124 08:14:04.465947  7949 replica.cpp:658] Replica received learned notice for position
0
> I0124 08:14:04.471567  7949 leveldb.cpp:343] Persisting action (16 bytes) to leveldb
took 5.587838ms
> I0124 08:14:04.471601  7949 replica.cpp:679] Persisted action at 0
> I0124 08:14:04.471622  7949 replica.cpp:664] Replica learned NOP action at position 0
> I0124 08:14:04.472682  7951 log.cpp:676] Writer started with ending position 0
> I0124 08:14:04.473919  7951 leveldb.cpp:438] Reading position from leveldb took 28676ns
> I0124 08:14:04.491591  7951 registrar.cpp:346] Successfully fetched the registry (0B)
in 58.337024ms
> I0124 08:14:04.491704  7951 registrar.cpp:445] Applied 1 operations in 28163ns; attempting
to update the 'registry'
> I0124 08:14:04.493938  7953 log.cpp:684] Attempting to append 118 bytes to the log
> I0124 08:14:04.494122  7953 coordinator.cpp:340] Coordinator attempting to write APPEND
action at position 1
> I0124 08:14:04.495069  7953 replica.cpp:511] Replica received write request for position
1
> I0124 08:14:04.500089  7953 leveldb.cpp:343] Persisting action (135 bytes) to leveldb
took 4.989356ms
> I0124 08:14:04.500123  7953 replica.cpp:679] Persisted action at 1
> I0124 08:14:04.501271  7950 replica.cpp:658] Replica received learned notice for position
1
> I0124 08:14:04.505698  7950 leveldb.cpp:343] Persisting action (137 bytes) to leveldb
took 4.396221ms
> I0124 08:14:04.505734  7950 replica.cpp:679] Persisted action at 1
> I0124 08:14:04.505755  7950 replica.cpp:664] Replica learned APPEND action at position
1
> I0124 08:14:04.507313  7950 registrar.cpp:490] Successfully updated the 'registry' in
15.52896ms
> I0124 08:14:04.507478  7953 log.cpp:703] Attempting to truncate the log to 1
> I0124 08:14:04.507848  7953 coordinator.cpp:340] Coordinator attempting to write TRUNCATE
action at position 2
> I0124 08:14:04.508743  7953 replica.cpp:511] Replica received write request for position
2
> I0124 08:14:04.509214  7950 registrar.cpp:376] Successfully recovered registrar
> I0124 08:14:04.509682  7946 master.cpp:1077] Recovered 0 slaves from the Registry (82B)
; allowing 10mins for slaves to re-register
> I0124 08:14:04.514654  7953 leveldb.cpp:343] Persisting action (16 bytes) to leveldb
took 5.880031ms
> I0124 08:14:04.514689  7953 replica.cpp:679] Persisted action at 2
> I0124 08:14:04.515736  7953 replica.cpp:658] Replica received learned notice for position
2
> I0124 08:14:04.522014  7953 leveldb.cpp:343] Persisting action (18 bytes) to leveldb
took 6.245138ms
> I0124 08:14:04.522086  7953 leveldb.cpp:401] Deleting ~1 keys from leveldb took 37803ns
> I0124 08:14:04.522107  7953 replica.cpp:679] Persisted action at 2
> I0124 08:14:04.522128  7953 replica.cpp:664] Replica learned TRUNCATE action at position
2
> I0124 08:14:04.531460  7926 containerizer.cpp:103] Using isolation: posix/cpu,posix/mem
> I0124 08:14:04.547194  7951 slave.cpp:173] Slave started on 208)@127.0.1.1:47787
> I0124 08:14:04.555682  7951 credentials.hpp:84] Loading credential for authentication
from '/tmp/SlaveTest_MesosExecutorGracefulShutdown_kB74xo/credential'
> I0124 08:14:04.556622  7951 slave.cpp:282] Slave using credential for: test-principal
> I0124 08:14:04.557052  7951 slave.cpp:300] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000]
> I0124 08:14:04.557842  7951 slave.cpp:329] Slave hostname: utopic
> I0124 08:14:04.558091  7951 slave.cpp:330] Slave checkpoint: false
> W0124 08:14:04.558352  7951 slave.cpp:332] Disabling checkpointing is deprecated and
the --checkpoint flag will be removed in a future release. Please avoid using this flag
> I0124 08:14:04.566864  7948 state.cpp:33] Recovering state from '/tmp/SlaveTest_MesosExecutorGracefulShutdown_kB74xo/meta'
> I0124 08:14:04.575711  7951 status_update_manager.cpp:197] Recovering status update manager
> I0124 08:14:04.575904  7951 containerizer.cpp:300] Recovering containerizer
> I0124 08:14:04.577112  7951 slave.cpp:3519] Finished recovery
> I0124 08:14:04.577374  7926 sched.cpp:151] Version: 0.22.0
> I0124 08:14:04.578663  7950 sched.cpp:248] New master detected at master@127.0.1.1:47787
> I0124 08:14:04.578759  7950 sched.cpp:304] Authenticating with master master@127.0.1.1:47787
> I0124 08:14:04.578781  7950 sched.cpp:311] Using default CRAM-MD5 authenticatee
> I0124 08:14:04.579071  7950 authenticatee.hpp:138] Creating new client SASL connection
> I0124 08:14:04.579550  7947 master.cpp:4129] Authenticating scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787
> I0124 08:14:04.579582  7947 master.cpp:4140] Using default CRAM-MD5 authenticator
> I0124 08:14:04.580031  7947 authenticator.hpp:170] Creating new server SASL connection
> I0124 08:14:04.580402  7947 authenticatee.hpp:229] Received SASL authentication mechanisms:
CRAM-MD5
> I0124 08:14:04.580430  7947 authenticatee.hpp:255] Attempting to authenticate with mechanism
'CRAM-MD5'
> I0124 08:14:04.580538  7947 authenticator.hpp:276] Received SASL authentication start
> I0124 08:14:04.580581  7947 authenticator.hpp:398] Authentication requires more steps
> I0124 08:14:04.580651  7947 authenticatee.hpp:275] Received SASL authentication step
> I0124 08:14:04.580746  7947 authenticator.hpp:304] Received SASL authentication step
> I0124 08:14:04.580837  7947 authenticator.hpp:390] Authentication success
> I0124 08:14:04.580940  7947 authenticatee.hpp:315] Authentication success
> I0124 08:14:04.581009  7947 master.cpp:4187] Successfully authenticated principal 'test-principal'
at scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787
> I0124 08:14:04.581328  7947 sched.cpp:392] Successfully authenticated with master master@127.0.1.1:47787
> I0124 08:14:04.581509  7947 master.cpp:1420] Received registration request for framework
'default' at scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787
> I0124 08:14:04.581585  7947 master.cpp:1298] Authorizing framework principal 'test-principal'
to receive offers for role '*'
> I0124 08:14:04.582033  7947 master.cpp:1484] Registering framework 20150124-081404-16842879-47787-7926-0000
(default) at scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787
> I0124 08:14:04.582595  7947 hierarchical_allocator_process.hpp:319] Added framework 20150124-081404-16842879-47787-7926-0000
> I0124 08:14:04.583051  7947 sched.cpp:442] Framework registered with 20150124-081404-16842879-47787-7926-0000
> I0124 08:14:04.584087  7951 slave.cpp:613] New master detected at master@127.0.1.1:47787
> I0124 08:14:04.584388  7951 slave.cpp:676] Authenticating with master master@127.0.1.1:47787
> I0124 08:14:04.584564  7951 slave.cpp:681] Using default CRAM-MD5 authenticatee
> I0124 08:14:04.584951  7951 slave.cpp:649] Detecting new master
> I0124 08:14:04.585219  7951 status_update_manager.cpp:171] Pausing sending status updates
> I0124 08:14:04.585604  7951 authenticatee.hpp:138] Creating new client SASL connection
> I0124 08:14:04.587666  7953 master.cpp:4129] Authenticating slave(208)@127.0.1.1:47787
> I0124 08:14:04.587702  7953 master.cpp:4140] Using default CRAM-MD5 authenticator
> I0124 08:14:04.588434  7953 authenticator.hpp:170] Creating new server SASL connection
> I0124 08:14:04.588764  7953 authenticatee.hpp:229] Received SASL authentication mechanisms:
CRAM-MD5
> I0124 08:14:04.588790  7953 authenticatee.hpp:255] Attempting to authenticate with mechanism
'CRAM-MD5'
> I0124 08:14:04.588896  7953 authenticator.hpp:276] Received SASL authentication start
> I0124 08:14:04.588935  7953 authenticator.hpp:398] Authentication requires more steps
> I0124 08:14:04.589005  7953 authenticatee.hpp:275] Received SASL authentication step
> I0124 08:14:04.589082  7953 authenticator.hpp:304] Received SASL authentication step
> I0124 08:14:04.589140  7953 authenticator.hpp:390] Authentication success
> I0124 08:14:04.589232  7953 authenticatee.hpp:315] Authentication success
> I0124 08:14:04.589300  7953 master.cpp:4187] Successfully authenticated principal 'test-principal'
at slave(208)@127.0.1.1:47787
> I0124 08:14:04.589587  7953 slave.cpp:747] Successfully authenticated with master master@127.0.1.1:47787
> I0124 08:14:04.589913  7953 master.cpp:3275] Registering slave at slave(208)@127.0.1.1:47787
(utopic) with id 20150124-081404-16842879-47787-7926-S0
> I0124 08:14:04.590322  7953 registrar.cpp:445] Applied 1 operations in 60404ns; attempting
to update the 'registry'
> I0124 08:14:04.595336  7948 log.cpp:684] Attempting to append 283 bytes to the log
> I0124 08:14:04.595552  7948 coordinator.cpp:340] Coordinator attempting to write APPEND
action at position 3
> I0124 08:14:04.596535  7948 replica.cpp:511] Replica received write request for position
3
> I0124 08:14:04.597846  7951 master.cpp:3263] Ignoring register slave message from slave(208)@127.0.1.1:47787
(utopic) as admission is already in progress
> I0124 08:14:04.602326  7948 leveldb.cpp:343] Persisting action (302 bytes) to leveldb
took 5.758211ms
> I0124 08:14:04.602363  7948 replica.cpp:679] Persisted action at 3
> I0124 08:14:04.603492  7951 replica.cpp:658] Replica received learned notice for position
3
> I0124 08:14:04.608952  7951 leveldb.cpp:343] Persisting action (304 bytes) to leveldb
took 5.427195ms
> I0124 08:14:04.608985  7951 replica.cpp:679] Persisted action at 3
> I0124 08:14:04.609007  7951 replica.cpp:664] Replica learned APPEND action at position
3
> I0124 08:14:04.610643  7951 registrar.cpp:490] Successfully updated the 'registry' in
20.258048ms
> I0124 08:14:04.610800  7948 log.cpp:703] Attempting to truncate the log to 3
> I0124 08:14:04.611184  7948 coordinator.cpp:340] Coordinator attempting to write TRUNCATE
action at position 4
> I0124 08:14:04.612076  7948 replica.cpp:511] Replica received write request for position
4
> I0124 08:14:04.613061  7946 master.cpp:3329] Registered slave 20150124-081404-16842879-47787-7926-S0
at slave(208)@127.0.1.1:47787 (utopic) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I0124 08:14:04.613299  7946 hierarchical_allocator_process.hpp:453] Added slave 20150124-081404-16842879-47787-7926-S0
(utopic) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000] available)
> I0124 08:14:04.613688  7946 slave.cpp:781] Registered with master master@127.0.1.1:47787;
given slave ID 20150124-081404-16842879-47787-7926-S0
> I0124 08:14:04.614112  7946 master.cpp:4071] Sending 1 offers to framework 20150124-081404-16842879-47787-7926-0000
(default) at scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787
> I0124 08:14:04.614228  7946 status_update_manager.cpp:178] Resuming sending status updates
> I0124 08:14:04.617481  7947 master.cpp:2677] Processing ACCEPT call for offers: [ 20150124-081404-16842879-47787-7926-O0
] on slave 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic) for
framework 20150124-081404-16842879-47787-7926-0000 (default) at scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787
> I0124 08:14:04.617535  7947 master.cpp:2513] Authorizing framework principal 'test-principal'
to launch task 7c16772d-4aed-4719-81c4-658a2cc22543 as user 'jenkins'
> I0124 08:14:04.618736  7947 master.hpp:782] Adding task 7c16772d-4aed-4719-81c4-658a2cc22543
with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20150124-081404-16842879-47787-7926-S0
(utopic)
> I0124 08:14:04.618854  7947 master.cpp:2885] Launching task 7c16772d-4aed-4719-81c4-658a2cc22543
of framework 20150124-081404-16842879-47787-7926-0000 (default) at scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787
with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20150124-081404-16842879-47787-7926-S0
at slave(208)@127.0.1.1:47787 (utopic)
> I0124 08:14:04.619209  7947 slave.cpp:1130] Got assigned task 7c16772d-4aed-4719-81c4-658a2cc22543
for framework 20150124-081404-16842879-47787-7926-0000
> I0124 08:14:04.619472  7948 leveldb.cpp:343] Persisting action (16 bytes) to leveldb
took 7.364828ms
> I0124 08:14:04.619941  7948 replica.cpp:679] Persisted action at 4
> I0124 08:14:04.624851  7953 replica.cpp:658] Replica received learned notice for position
4
> I0124 08:14:04.625757  7947 slave.cpp:1245] Launching task 7c16772d-4aed-4719-81c4-658a2cc22543
for framework 20150124-081404-16842879-47787-7926-0000
> I0124 08:14:04.630590  7953 leveldb.cpp:343] Persisting action (18 bytes) to leveldb
took 5.705336ms
> I0124 08:14:04.630805  7953 leveldb.cpp:401] Deleting ~2 keys from leveldb took 51263ns
> I0124 08:14:04.630828  7953 replica.cpp:679] Persisted action at 4
> I0124 08:14:04.630851  7953 replica.cpp:664] Replica learned TRUNCATE action at position
4
> I0124 08:14:04.633968  7947 slave.cpp:3921] Launching executor 7c16772d-4aed-4719-81c4-658a2cc22543
of framework 20150124-081404-16842879-47787-7926-0000 in work directory '/tmp/SlaveTest_MesosExecutorGracefulShutdown_kB74xo/slaves/20150124-081404-16842879-47787-7926-S0/frameworks/20150124-081404-16842879-47787-7926-0000/executors/7c16772d-4aed-4719-81c4-658a2cc22543/runs/53887a08-f11d-4a2f-a659-a715d9fcf3d2'
> I0124 08:14:04.634963  7951 containerizer.cpp:445] Starting container '53887a08-f11d-4a2f-a659-a715d9fcf3d2'
for executor '7c16772d-4aed-4719-81c4-658a2cc22543' of framework '20150124-081404-16842879-47787-7926-0000'
> W0124 08:14:04.636931  7951 containerizer.cpp:296] CommandInfo.grace_period flag is not
set, using default value: 3secs
> I0124 08:14:04.655591  7947 slave.cpp:1368] Queuing task '7c16772d-4aed-4719-81c4-658a2cc22543'
for executor 7c16772d-4aed-4719-81c4-658a2cc22543 of framework '20150124-081404-16842879-47787-7926-0000
> I0124 08:14:04.656992  7951 launcher.cpp:137] Forked child with pid '11030' for container
'53887a08-f11d-4a2f-a659-a715d9fcf3d2'
> I0124 08:14:04.673646  7951 slave.cpp:2890] Monitoring executor '7c16772d-4aed-4719-81c4-658a2cc22543'
of framework '20150124-081404-16842879-47787-7926-0000' in container '53887a08-f11d-4a2f-a659-a715d9fcf3d2'
> I0124 08:14:04.964946 11044 exec.cpp:147] Version: 0.22.0
> I0124 08:14:05.113059  7948 slave.cpp:1912] Got registration for executor '7c16772d-4aed-4719-81c4-658a2cc22543'
of framework 20150124-081404-16842879-47787-7926-0000 from executor(1)@127.0.1.1:49174
> I0124 08:14:05.121086  7948 slave.cpp:2031] Flushing queued task 7c16772d-4aed-4719-81c4-658a2cc22543
for executor '7c16772d-4aed-4719-81c4-658a2cc22543' of framework 20150124-081404-16842879-47787-7926-0000
> I0124 08:14:05.266849 11062 exec.cpp:221] Executor registered on slave 20150124-081404-16842879-47787-7926-S0
> Shutdown timeout is set to 3secsRegistered executor on utopic
> Starting task 7c16772d-4aed-4719-81c4-658a2cc22543
> Forked command at 11067
> sh -c 'sleep 1000'
> I0124 08:14:05.492084  7953 slave.cpp:2265] Handling status update TASK_RUNNING (UUID:
54742a87-ef02-4e72-a19b-83b0eeb62568) for task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework
20150124-081404-16842879-47787-7926-0000 from executor(1)@127.0.1.1:49174
> I0124 08:14:05.492805  7953 status_update_manager.cpp:317] Received status update TASK_RUNNING
(UUID: 54742a87-ef02-4e72-a19b-83b0eeb62568) for task 7c16772d-4aed-4719-81c4-658a2cc22543
of framework 20150124-081404-16842879-47787-7926-0000
> I0124 08:14:05.493762  7953 slave.cpp:2508] Forwarding the update TASK_RUNNING (UUID:
54742a87-ef02-4e72-a19b-83b0eeb62568) for task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework
20150124-081404-16842879-47787-7926-0000 to master@127.0.1.1:47787
> I0124 08:14:05.493948  7953 slave.cpp:2441] Sending acknowledgement for status update
TASK_RUNNING (UUID: 54742a87-ef02-4e72-a19b-83b0eeb62568) for task 7c16772d-4aed-4719-81c4-658a2cc22543
of framework 20150124-081404-16842879-47787-7926-0000 to executor(1)@127.0.1.1:49174
> I0124 08:14:05.495378  7949 master.cpp:3652] Forwarding status update TASK_RUNNING (UUID:
54742a87-ef02-4e72-a19b-83b0eeb62568) for task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework
20150124-081404-16842879-47787-7926-0000
> I0124 08:14:05.495584  7949 master.cpp:3624] Status update TASK_RUNNING (UUID: 54742a87-ef02-4e72-a19b-83b0eeb62568)
for task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework 20150124-081404-16842879-47787-7926-0000
from slave 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic)
> I0124 08:14:05.495678  7949 master.cpp:4934] Updating the latest state of task 7c16772d-4aed-4719-81c4-658a2cc22543
of framework 20150124-081404-16842879-47787-7926-0000 to TASK_RUNNING
> I0124 08:14:05.496422  7949 master.cpp:3125] Forwarding status update acknowledgement
54742a87-ef02-4e72-a19b-83b0eeb62568 for task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework
20150124-081404-16842879-47787-7926-0000 (default) at scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787
to slave 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic)
> I0124 08:14:05.497735  7946 master.cpp:2961] Asked to kill task 7c16772d-4aed-4719-81c4-658a2cc22543
of framework 20150124-081404-16842879-47787-7926-0000
> I0124 08:14:05.497859  7946 master.cpp:3021] Telling slave 20150124-081404-16842879-47787-7926-S0
at slave(208)@127.0.1.1:47787 (utopic) to kill task 7c16772d-4aed-4719-81c4-658a2cc22543 of
framework 20150124-081404-16842879-47787-7926-0000 (default) at scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787
> I0124 08:14:05.498589  7947 status_update_manager.cpp:389] Received status update acknowledgement
(UUID: 54742a87-ef02-4e72-a19b-83b0eeb62568) for task 7c16772d-4aed-4719-81c4-658a2cc22543
of framework 20150124-081404-16842879-47787-7926-0000
> I0124 08:14:05.499006  7953 slave.cpp:1424] Asked to kill task 7c16772d-4aed-4719-81c4-658a2cc22543
of framework 20150124-081404-16842879-47787-7926-0000
> Shutting down
> Sending SIGTERM to process tree at pid 11067
> Killing the following process trees:
> [ 
> -+- 11067 sh -c sleep 1000 
>  \--- 11068 sleep 1000 
> ]
> 2015-01-24 08:14:07,295:7926(0x7f30b1b34700):ZOO_ERROR@handle_socket_error_msg@1697:
Socket [127.0.0.1:57753] zk retcode=-4, errno=111(Connection refused): server refused to accept
the client
> Process 11067 did not terminate after 3secs, sending SIGKILL to process tree at 11067
> Killed the following process trees:
> [ 
> -+- 11067 sh -c sleep 1000 
>  \--- 11068 sleep 1000 
> ]
> Command terminated with signal Killed (pid: 11067)
> I0124 08:14:09.063453  7953 slave.cpp:2265] Handling status update TASK_KILLED (UUID:
4bd05372-2705-46e5-8182-5cb6907fbab3) for task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework
20150124-081404-16842879-47787-7926-0000 from executor(1)@127.0.1.1:49174
> I0124 08:14:09.069545  7953 status_update_manager.cpp:317] Received status update TASK_KILLED
(UUID: 4bd05372-2705-46e5-8182-5cb6907fbab3) for task 7c16772d-4aed-4719-81c4-658a2cc22543
of framework 20150124-081404-16842879-47787-7926-0000
> I0124 08:14:09.070265  7953 slave.cpp:2508] Forwarding the update TASK_KILLED (UUID:
4bd05372-2705-46e5-8182-5cb6907fbab3) for task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework
20150124-081404-16842879-47787-7926-0000 to master@127.0.1.1:47787
> I0124 08:14:09.070996  7947 master.cpp:3652] Forwarding status update TASK_KILLED (UUID:
4bd05372-2705-46e5-8182-5cb6907fbab3) for task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework
20150124-081404-16842879-47787-7926-0000
> I0124 08:14:09.071182  7947 master.cpp:3624] Status update TASK_KILLED (UUID: 4bd05372-2705-46e5-8182-5cb6907fbab3)
for task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework 20150124-081404-16842879-47787-7926-0000
from slave 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic)
> I0124 08:14:09.071260  7947 master.cpp:4934] Updating the latest state of task 7c16772d-4aed-4719-81c4-658a2cc22543
of framework 20150124-081404-16842879-47787-7926-0000 to TASK_KILLED
> I0124 08:14:09.072052  7947 hierarchical_allocator_process.hpp:653] Recovered cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total allocatable: cpus(*):2; mem(*):1024;
disk(*):1024; ports(*):[31000-32000]) on slave 20150124-081404-16842879-47787-7926-S0 from
framework 20150124-081404-16842879-47787-7926-0000
> I0124 08:14:09.072449  7947 master.cpp:4993] Removing task 7c16772d-4aed-4719-81c4-658a2cc22543
with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] of framework 20150124-081404-16842879-47787-7926-0000
on slave 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic)
> I0124 08:14:09.072700  7947 master.cpp:3125] Forwarding status update acknowledgement
4bd05372-2705-46e5-8182-5cb6907fbab3 for task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework
20150124-081404-16842879-47787-7926-0000 (default) at scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787
to slave 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic)
> ../../src/tests/slave_tests.cpp:1736: Failure
> Expected: (std::string::npos) != (statusKilled.get().message().find("Terminated")), actual:
18446744073709551615 vs 18446744073709551615
> I0124 08:14:09.073422  7926 sched.cpp:1471] Asked to stop the driver
> I0124 08:14:09.073629  7926 master.cpp:654] Master terminating
> I0124 08:14:09.075768  7950 sched.cpp:808] Stopping framework '20150124-081404-16842879-47787-7926-0000'
> I0124 08:14:09.079352  7953 slave.cpp:2441] Sending acknowledgement for status update
TASK_KILLED (UUID: 4bd05372-2705-46e5-8182-5cb6907fbab3) for task 7c16772d-4aed-4719-81c4-658a2cc22543
of framework 20150124-081404-16842879-47787-7926-0000 to executor(1)@127.0.1.1:49174
> I0124 08:14:09.085199  7953 slave.cpp:2673] master@127.0.1.1:47787 exited
> W0124 08:14:09.085232  7953 slave.cpp:2676] Master disconnected! Waiting for a new master
to be elected
> I0124 08:14:09.085263  7953 status_update_manager.cpp:389] Received status update acknowledgement
(UUID: 4bd05372-2705-46e5-8182-5cb6907fbab3) for task 7c16772d-4aed-4719-81c4-658a2cc22543
of framework 20150124-081404-16842879-47787-7926-0000
> I0124 08:14:09.120879  7946 containerizer.cpp:879] Destroying container '53887a08-f11d-4a2f-a659-a715d9fcf3d2'
> I0124 08:14:09.216553  7952 containerizer.cpp:1084] Executor for container '53887a08-f11d-4a2f-a659-a715d9fcf3d2'
has exited
> I0124 08:14:09.218641  7952 slave.cpp:2948] Executor '7c16772d-4aed-4719-81c4-658a2cc22543'
of framework 20150124-081404-16842879-47787-7926-0000 terminated with signal Killed
> I0124 08:14:09.218855  7952 slave.cpp:3057] Cleaning up executor '7c16772d-4aed-4719-81c4-658a2cc22543'
of framework 20150124-081404-16842879-47787-7926-0000
> I0124 08:14:09.223268  7947 gc.cpp:56] Scheduling '/tmp/SlaveTest_MesosExecutorGracefulShutdown_kB74xo/slaves/20150124-081404-16842879-47787-7926-S0/frameworks/20150124-081404-16842879-47787-7926-0000/executors/7c16772d-4aed-4719-81c4-658a2cc22543/runs/53887a08-f11d-4a2f-a659-a715d9fcf3d2'
for gc 6.99999746482667days in the future
> I0124 08:14:09.224205  7947 gc.cpp:56] Scheduling '/tmp/SlaveTest_MesosExecutorGracefulShutdown_kB74xo/slaves/20150124-081404-16842879-47787-7926-S0/frameworks/20150124-081404-16842879-47787-7926-0000/executors/7c16772d-4aed-4719-81c4-658a2cc22543'
for gc 6.99999746293926days in the future
> I0124 08:14:09.227552  7952 slave.cpp:3136] Cleaning up framework 20150124-081404-16842879-47787-7926-0000
> I0124 08:14:09.229786  7949 status_update_manager.cpp:279] Closing status update streams
for framework 20150124-081404-16842879-47787-7926-0000
> I0124 08:14:09.230849  7952 slave.cpp:495] Slave terminating
> I0124 08:14:09.230989  7952 gc.cpp:56] Scheduling '/tmp/SlaveTest_MesosExecutorGracefulShutdown_kB74xo/slaves/20150124-081404-16842879-47787-7926-S0/frameworks/20150124-081404-16842879-47787-7926-0000'
for gc 6.99999732935407days in the future
> [  FAILED  ] SlaveTest.MesosExecutorGracefulShutdown (4881 ms)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message