mesos-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benjamin Bannier (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (MESOS-8892) MasterSlaveReconciliationTest.ReconcileDroppedOperation is flaky
Date Tue, 08 May 2018 11:34:00 GMT

     [ https://issues.apache.org/jira/browse/MESOS-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Benjamin Bannier reassigned MESOS-8892:
---------------------------------------

    Assignee: Benjamin Bannier

> MasterSlaveReconciliationTest.ReconcileDroppedOperation is flaky
> ----------------------------------------------------------------
>
>                 Key: MESOS-8892
>                 URL: https://issues.apache.org/jira/browse/MESOS-8892
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 1.6.0
>            Reporter: Greg Mann
>            Assignee: Benjamin Bannier
>            Priority: Major
>         Attachments: MasterSlaveReconciliationTest.ReconcileDroppedOperation.txt
>
>
> This was observed on a Debian 9 SSL/GRPC-enabled build. It appears that a poorly-timed
{{UpdateSlaveMessage}} leads to the operation reconciliation occurring before the expectation
for the {{ReconcileOperationsMessage}} is registered:
> {code}
> I0508 00:11:09.700815 22498 master.cpp:4362] Processing ACCEPT call for offers: [ f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-O0
] on agent f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-S0 at slave(212)@127.0.0.1:36309 (localhost)
for framework f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-0000 (default) at scheduler-b0f55e01-2f6f-42c8-8614-901036acfc31@127.0.0.1:36309
> I0508 00:11:09.700870 22498 master.cpp:3602] Authorizing principal 'test-principal' to
reserve resources 'cpus(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):2;
mem(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):1024;
disk(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):1024;
ports(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):[31000-32000]'
> I0508 00:11:09.701228 22493 master.cpp:4725] Applying RESERVE operation for resources
[{"allocation_info":{"role":"default-role"},"name":"cpus","reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"scalar":{"value":2.0},"type":"SCALAR"},{"allocation_info":{"role":"default-role"},"name":"mem","reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"scalar":{"value":1024.0},"type":"SCALAR"},{"allocation_info":{"role":"default-role"},"name":"disk","reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"scalar":{"value":1024.0},"type":"SCALAR"},{"allocation_info":{"role":"default-role"},"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"type":"RANGES"}]
from framework f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-0000 (default) at scheduler-b0f55e01-2f6f-42c8-8614-901036acfc31@127.0.0.1:36309
to agent f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-S0 at slave(212)@127.0.0.1:36309 (localhost)
> I0508 00:11:09.701498 22493 master.cpp:11265] Sending operation '' (uuid: 81dffb62-6e75-4c6c-a97b-41c92c58d6a7)
to agent f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-S0 at slave(212)@127.0.0.1:36309 (localhost)
> I0508 00:11:09.701627 22494 slave.cpp:1564] Forwarding agent update {"operations":{},"resource_version_uuid":{"value":"0HeA06ftS6m76SNoNZNPag=="},"slave_id":{"value":"f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-S0"},"update_oversubscribed_resources":true}
> I0508 00:11:09.701848 22494 master.cpp:7800] Received update of agent f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-S0
at slave(212)@127.0.0.1:36309 (localhost) with total oversubscribed resources {}
> W0508 00:11:09.701905 22494 master.cpp:7974] Performing explicit reconciliation with
agent for known operation 81dffb62-6e75-4c6c-a97b-41c92c58d6a7 since it was not present in
original reconciliation message from agent
> I0508 00:11:09.702085 22494 master.cpp:11015] Updating the state of operation '' (uuid:
81dffb62-6e75-4c6c-a97b-41c92c58d6a7) for framework f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-0000
(latest state: OPERATION_PENDING, status update state: OPERATION_DROPPED)
> I0508 00:11:09.702239 22491 hierarchical.cpp:925] Updated allocation of framework f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-0000
on agent f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-S0 from cpus(allocated: default-role):2; mem(allocated:
default-role):1024; disk(allocated: default-role):1024; ports(allocated: default-role):[31000-32000]
to disk(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):1024;
cpus(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):2; mem(allocated:
default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):1024; ports(allocated:
default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):[31000-32000]
> I0508 00:11:09.702267 22493 slave.cpp:1274] New master detected at master@127.0.0.1:36309
> I0508 00:11:09.702306 22495 task_status_update_manager.cpp:181] Pausing sending task
status updates
> I0508 00:11:09.702337 22493 slave.cpp:1329] Detecting new master
> {code}
> Full log is attached as {{MasterSlaveReconciliationTest.ReconcileDroppedOperation.txt}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message