mesos-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dominic Hamon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-1439) SchedulerTest.MetricsEndpoint is flaky
Date Fri, 30 May 2014 17:09:03 GMT

    [ https://issues.apache.org/jira/browse/MESOS-1439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013951#comment-14013951
] 

Dominic Hamon commented on MESOS-1439:
--------------------------------------

https://reviews.apache.org/r/22089

> SchedulerTest.MetricsEndpoint is flaky
> --------------------------------------
>
>                 Key: MESOS-1439
>                 URL: https://issues.apache.org/jira/browse/MESOS-1439
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.19.0
>            Reporter: Yan Xu
>            Assignee: Dominic Hamon
>
> {noformat}
> [ RUN      ] SchedulerTest.MetricsEndpoint
> Using temporary directory '/tmp/SchedulerTest_MetricsEndpoint_MMfrD4'
> I0529 09:17:49.369159 10663 leveldb.cpp:176] Opened db in 35.064571ms
> I0529 09:17:49.380153 10663 leveldb.cpp:183] Compacted db in 10.423082ms
> I0529 09:17:49.380749 10663 leveldb.cpp:198] Created db iterator in 12713ns
> I0529 09:17:49.380978 10663 leveldb.cpp:204] Seeked to beginning of db in 1787ns
> I0529 09:17:49.381248 10663 leveldb.cpp:273] Iterated through 0 keys in the db in 7348ns
> I0529 09:17:49.381542 10663 replica.cpp:741] Replica recovered with log positions 0 ->
0 with 1 holes and 0 unlearned
> I0529 09:17:49.382098 10687 recover.cpp:425] Starting replica recovery
> I0529 09:17:49.382223 10687 recover.cpp:451] Replica is in EMPTY status
> I0529 09:17:49.382688 10687 replica.cpp:638] Replica in EMPTY status received a broadcasted
recover request
> I0529 09:17:49.382777 10687 recover.cpp:188] Received a recover response from a replica
in EMPTY status
> I0529 09:17:49.382912 10687 recover.cpp:542] Updating replica status to STARTING
> I0529 09:17:49.387364 10688 master.cpp:272] Master 20140529-091749-16842879-52356-10663
(quantal) started on 127.0.1.1:52356
> I0529 09:17:49.387398 10688 master.cpp:309] Master only allowing authenticated frameworks
to register
> I0529 09:17:49.387405 10688 master.cpp:314] Master only allowing authenticated slaves
to register
> I0529 09:17:49.387413 10688 credentials.hpp:35] Loading credentials for authentication
from '/tmp/SchedulerTest_MetricsEndpoint_MMfrD4/credentials'
> I0529 09:17:49.387496 10688 master.cpp:340] Master enabling authorization
> I0529 09:17:49.388056 10688 hierarchical_allocator_process.hpp:301] Initializing hierarchical
allocator process with master : master@127.0.1.1:52356
> I0529 09:17:49.388099 10688 master.cpp:108] No whitelist given. Advertising offers for
all slaves
> I0529 09:17:49.388303 10688 master.cpp:957] The newly elected leader is master@127.0.1.1:52356
with id 20140529-091749-16842879-52356-10663
> I0529 09:17:49.388319 10688 master.cpp:970] Elected as the leading master!
> I0529 09:17:49.388326 10688 master.cpp:788] Recovering from registrar
> I0529 09:17:49.388406 10688 registrar.cpp:313] Recovering registrar
> I0529 09:17:49.394222 10687 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb
took 11.214822ms
> I0529 09:17:49.394258 10687 replica.cpp:320] Persisted replica status to STARTING
> I0529 09:17:49.394351 10687 recover.cpp:451] Replica is in STARTING status
> I0529 09:17:49.394721 10687 replica.cpp:638] Replica in STARTING status received a broadcasted
recover request
> I0529 09:17:49.394791 10687 recover.cpp:188] Received a recover response from a replica
in STARTING status
> I0529 09:17:49.394917 10687 recover.cpp:542] Updating replica status to VOTING
> I0529 09:17:49.405922 10687 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb
took 10.925288ms
> I0529 09:17:49.405954 10687 replica.cpp:320] Persisted replica status to VOTING
> I0529 09:17:49.406010 10687 recover.cpp:556] Successfully joined the Paxos group
> I0529 09:17:49.406075 10687 recover.cpp:440] Recover process terminated
> I0529 09:17:49.406213 10687 log.cpp:656] Attempting to start the writer
> I0529 09:17:49.406582 10687 replica.cpp:474] Replica received implicit promise request
with proposal 1
> I0529 09:17:49.421028 10687 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb
took 14.422122ms
> I0529 09:17:49.421394 10687 replica.cpp:342] Persisted promised to 1
> I0529 09:17:49.424897 10683 coordinator.cpp:230] Coordinator attemping to fill missing
position
> I0529 09:17:49.425268 10683 replica.cpp:375] Replica received explicit promise request
for position 0 with proposal 2
> I0529 09:17:49.434659 10683 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took
9.367453ms
> I0529 09:17:49.434691 10683 replica.cpp:676] Persisted action at 0
> I0529 09:17:49.434999 10683 replica.cpp:508] Replica received write request for position
0
> I0529 09:17:49.435039 10683 leveldb.cpp:438] Reading position from leveldb took 23336ns
> I0529 09:17:49.447484 10683 leveldb.cpp:343] Persisting action (14 bytes) to leveldb
took 12.415731ms
> I0529 09:17:49.447532 10683 replica.cpp:676] Persisted action at 0
> I0529 09:17:49.447757 10683 replica.cpp:655] Replica received learned notice for position
0
> I0529 09:17:49.457741 10683 leveldb.cpp:343] Persisting action (16 bytes) to leveldb
took 9.958881ms
> I0529 09:17:49.458118 10683 replica.cpp:676] Persisted action at 0
> I0529 09:17:49.458366 10683 replica.cpp:661] Replica learned NOP action at position 0
> I0529 09:17:49.458765 10683 log.cpp:672] Writer started with ending position 0
> I0529 09:17:49.459067 10683 leveldb.cpp:438] Reading position from leveldb took 15621ns
> I0529 09:17:49.464747 10683 registrar.cpp:346] Successfully fetched the registry (0B)
> I0529 09:17:49.464779 10683 registrar.cpp:422] Attempting to update the 'registry'
> I0529 09:17:49.465868 10683 log.cpp:680] Attempting to append 120 bytes to the log
> I0529 09:17:49.465922 10683 coordinator.cpp:340] Coordinator attempting to write APPEND
action at position 1
> I0529 09:17:49.466166 10683 replica.cpp:508] Replica received write request for position
1
> I0529 09:17:49.476133 10683 leveldb.cpp:343] Persisting action (137 bytes) to leveldb
took 9.945023ms
> I0529 09:17:49.476505 10683 replica.cpp:676] Persisted action at 1
> I0529 09:17:49.480896 10682 replica.cpp:655] Replica received learned notice for position
1
> I0529 09:17:49.490739 10682 leveldb.cpp:343] Persisting action (139 bytes) to leveldb
took 9.811658ms
> I0529 09:17:49.490782 10682 replica.cpp:676] Persisted action at 1
> I0529 09:17:49.490793 10682 replica.cpp:661] Replica learned APPEND action at position
1
> I0529 09:17:49.491188 10682 registrar.cpp:479] Successfully updated 'registry'
> I0529 09:17:49.491233 10682 registrar.cpp:372] Successfully recovered registrar
> I0529 09:17:49.491297 10682 log.cpp:699] Attempting to truncate the log to 1
> I0529 09:17:49.491391 10682 master.cpp:815] Recovered 0 slaves from the Registry (84B)
; allowing 10mins for slaves to re-register
> I0529 09:17:49.491437 10682 coordinator.cpp:340] Coordinator attempting to write TRUNCATE
action at position 2
> I0529 09:17:49.492131 10688 replica.cpp:508] Replica received write request for position
2
> I0529 09:17:49.493876 10663 sched.cpp:126] Version: 0.19.0
> I0529 09:17:49.495003 10683 process.cpp:3335] Handling HTTP event for process 'metrics'
with path: '/metrics//snapshot'
> I0529 09:17:49.497203 10683 sched.cpp:222] New master detected at master@127.0.1.1:52356
> I0529 09:17:49.498625 10683 sched.cpp:273] Authenticating with master master@127.0.1.1:52356
> I0529 09:17:49.499083 10689 authenticatee.hpp:128] Creating new client SASL connection
> I0529 09:17:49.499282 10689 master.cpp:2986] Authenticating scheduler(121)@127.0.1.1:52356
> I0529 09:17:49.499402 10689 authenticator.hpp:156] Creating new server SASL connection
> I0529 09:17:49.499832 10689 authenticatee.hpp:219] Received SASL authentication mechanisms:
CRAM-MD5
> I0529 09:17:49.499863 10689 authenticatee.hpp:245] Attempting to authenticate with mechanism
'CRAM-MD5'
> I0529 09:17:49.499894 10689 authenticator.hpp:262] Received SASL authentication start
> I0529 09:17:49.499938 10689 authenticator.hpp:384] Authentication requires more steps
> I0529 09:17:49.499969 10689 authenticatee.hpp:265] Received SASL authentication step
> I0529 09:17:49.500015 10689 authenticator.hpp:290] Received SASL authentication step
> I0529 09:17:49.500041 10689 auxprop.cpp:81] Request to lookup properties for user: 'test-principal'
realm: 'quantal' server FQDN: 'quantal' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE:
false SASL_AUXPROP_AUTHZID: false 
> I0529 09:17:49.500049 10689 auxprop.cpp:153] Looking up auxiliary property '*userPassword'
> I0529 09:17:49.500066 10689 auxprop.cpp:153] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
> I0529 09:17:49.500080 10689 auxprop.cpp:81] Request to lookup properties for user: 'test-principal'
realm: 'quantal' server FQDN: 'quantal' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE:
false SASL_AUXPROP_AUTHZID: true 
> I0529 09:17:49.500087 10689 auxprop.cpp:103] Skipping auxiliary property '*userPassword'
since SASL_AUXPROP_AUTHZID == true
> I0529 09:17:49.500092 10689 auxprop.cpp:103] Skipping auxiliary property '*cmusaslsecretCRAM-MD5'
since SASL_AUXPROP_AUTHZID == true
> I0529 09:17:49.500103 10689 authenticator.hpp:376] Authentication success
> I0529 09:17:49.500134 10689 authenticatee.hpp:305] Authentication success
> I0529 09:17:49.500154 10689 master.cpp:3026] Successfully authenticated principal 'test-principal'
at scheduler(121)@127.0.1.1:52356
> I0529 09:17:49.501504 10688 leveldb.cpp:343] Persisting action (16 bytes) to leveldb
took 9.345699ms
> I0529 09:17:49.501533 10688 replica.cpp:676] Persisted action at 2
> I0529 09:17:49.505563 10688 replica.cpp:655] Replica received learned notice for position
2
> I0529 09:17:49.505825 10690 process.cpp:1037] Socket closed while receiving
> I0529 09:17:49.506767 10663 master.cpp:574] Master terminating
> I0529 09:17:49.511685 10683 sched.cpp:347] Successfully authenticated with master master@127.0.1.1:52356
> I0529 09:17:49.512223 10683 sched.cpp:466] Sending registration request to master@127.0.1.1:52356
> I0529 09:17:49.512573 10683 sched.cpp:730] Stopping framework ''
> I0529 09:17:49.515702 10688 leveldb.cpp:343] Persisting action (18 bytes) to leveldb
took 10.071294ms
> I0529 09:17:49.515766 10688 leveldb.cpp:401] Deleting ~1 keys from leveldb took 31762ns
> I0529 09:17:49.515781 10688 replica.cpp:676] Persisted action at 2
> I0529 09:17:49.515792 10688 replica.cpp:661] Replica learned TRUNCATE action at position
2
> tests/scheduler_tests.cpp:70: Failure
> Actual function call count doesn't match EXPECT_CALL(sched, registered(&driver, _,
_))...
>          Expected: to be called once
>            Actual: never called - unsatisfied and active
> [  FAILED  ] SchedulerTest.MetricsEndpoint (184 ms)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message