mesos-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benjamin Bannier (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-8733) OversubscriptionTest.ForwardUpdateSlaveMessage is flaky
Date Tue, 27 Mar 2018 16:04:00 GMT

    [ https://issues.apache.org/jira/browse/MESOS-8733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16415841#comment-16415841
] 

Benjamin Bannier commented on MESOS-8733:
-----------------------------------------

This seems to be related to the agent registering multiple times in these tests since the
randomly choosen backoff factor was pick too low (similar to e.g., the issue reported in MESOS-8613).

A more detail log of a failure shows how small backoffs cause multiple registration attempts
by the agent,
{noformat}
[ RUN      ] OversubscriptionTest.ForwardUpdateSlaveMessage
I0327 17:59:30.498467 14537 cluster.cpp:172] Creating default 'local' authorizer
I0327 17:59:30.517834 14564 master.cpp:463] Master f201dade-c73e-42ab-8379-fde33e1d6b29 (gru1.hw.ca1.mesosphere.com)
started on 192.99.40.208:39245
I0327 17:59:30.517881 14564 master.cpp:466] Flags at startup: --acls="" --agent_ping_timeout="15secs"
--agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="HierarchicalDRF"
--authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true"
--authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticators="crammd5"
--authorizers="local" --credentials="/tmp/5vjDkq/credentials" --filter_gpu_resources="true"
--framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic"
--http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true"
--logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50"
--max_completed_tasks_per_framework="1000" --max_unreachable_tasks_per_framework="1000" --port="5050"
--quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" --registry_fetch_timeout="1mins"
--registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400"
--registry_store_timeout="100secs" --registry_strict="false" --require_agent_domain="false"
--root_submissions="true" --user_sorter="drf" --version="false" --webui_dir="/usr/local/share/mesos/webui"
--work_dir="/tmp/5vjDkq/master" --zk_session_timeout="10secs"
I0327 17:59:30.518154 14564 master.cpp:515] Master only allowing authenticated frameworks
to register
I0327 17:59:30.518162 14564 master.cpp:521] Master only allowing authenticated agents to register
I0327 17:59:30.518167 14564 master.cpp:527] Master only allowing authenticated HTTP frameworks
to register
I0327 17:59:30.518172 14564 credentials.hpp:37] Loading credentials for authentication from
'/tmp/5vjDkq/credentials'
I0327 17:59:30.518376 14564 master.cpp:571] Using default 'crammd5' authenticator
I0327 17:59:30.518471 14564 http.cpp:959] Creating default 'basic' HTTP authenticator for
realm 'mesos-master-readonly'
I0327 17:59:30.518566 14564 http.cpp:959] Creating default 'basic' HTTP authenticator for
realm 'mesos-master-readwrite'
I0327 17:59:30.518637 14564 http.cpp:959] Creating default 'basic' HTTP authenticator for
realm 'mesos-master-scheduler'
I0327 17:59:30.518702 14564 master.cpp:652] Authorization enabled
I0327 17:59:30.520488 14559 whitelist_watcher.cpp:77] No whitelist given
I0327 17:59:30.520504 14568 hierarchical.cpp:175] Initialized hierarchical allocator process
I0327 17:59:30.522081 14568 master.cpp:2126] Elected as the leading master!
I0327 17:59:30.522099 14568 master.cpp:1682] Recovering from registrar
I0327 17:59:30.522241 14568 registrar.cpp:347] Recovering registrar
I0327 17:59:30.522748 14568 registrar.cpp:391] Successfully fetched the registry (0B) in 484864ns
I0327 17:59:30.522820 14568 registrar.cpp:495] Applied 1 operations in 27646ns; attempting
to update the registry
I0327 17:59:30.523268 14568 registrar.cpp:552] Successfully updated the registry in 419840ns
I0327 17:59:30.523350 14568 registrar.cpp:424] Successfully recovered registrar
I0327 17:59:30.523630 14568 master.cpp:1796] Recovered 0 agents from the registry (170B);
allowing 10mins for agents to reregister
I0327 17:59:30.523769 14568 hierarchical.cpp:213] Skipping recovery of hierarchical allocator:
nothing to recover
W0327 17:59:30.552417 14537 process.cpp:2805] Attempted to spawn already running process files@192.99.40.208:39245
I0327 17:59:30.569679 14537 containerizer.cpp:304] Using isolation { environment_secret, filesystem/posix,
network/cni, posix/cpu, posix/mem }
W0327 17:59:30.577472 14537 backend.cpp:76] Failed to create 'bind' backend: BindBackend requires
root privileges
I0327 17:59:30.577519 14537 provisioner.cpp:299] Using default backend 'copy'
I0327 17:59:30.601388 14537 cluster.cpp:460] Creating default 'local' authorizer
I0327 17:59:30.604290 14553 slave.cpp:261] Mesos agent started on (49)@192.99.40.208:39245
I0327 17:59:30.604341 14553 slave.cpp:262] Flags at startup: --acls="" --appc_simple_discovery_uri_prefix="http://"
--appc_store_dir="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_6G3q4g/store/appc" --authenticate_http_readonly="true"
--authenticate_http_readwrite="false" --authenticatee="crammd5" --authentication_backoff_factor="1secs"
--authorizer="local" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false"
--cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs"
--containerizers="mesos" --credential="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_6G3q4g/credential"
--default_role="*" --disallow_sharing_agent_pid_namespace="false" --disk_watch_interval="1mins"
--docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io"
--docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns"
--docker_store_dir="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_6G3q4g/store/docker"
--docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false"
--executor_registration_timeout="1mins" --executor_reregistration_timeout="2secs" --executor_shutdown_grace_period="5secs"
--fetcher_cache_dir="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_6G3q4g/fetch" --fetcher_cache_size="2GB"
--frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false"
--hostname_lookup="true" --http_command_executor="false" --http_credentials="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_6G3q4g/http_credentials"
--http_heartbeat_interval="30secs" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem"
--launcher="posix" --launcher_dir="/home/bbannier/src/mesos/_/src" --logbufsecs="0" --logging_level="INFO"
--max_completed_executors_per_framework="150" --oversubscribed_resources_interval="15secs"
--perf_duration="10secs" --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns"
--quiet="false" --reconfiguration_policy="equal" --recover="reconnect" --recovery_timeout="15mins"
--registration_backoff_factor="10ms" --resources="cpus:2;gpus:0;mem:1024;disk:1024;ports:[31000-32000]"
--revocable_cpu_low_priority="true" --runtime_dir="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_6G3q4g"
--sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --systemd_enable_support="true"
--systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_UPSwHQ"
--zk_session_timeout="10secs"
I0327 17:59:30.604691 14553 credentials.hpp:86] Loading credential for authentication from
'/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_6G3q4g/credential'
I0327 17:59:30.604830 14553 slave.cpp:294] Agent using credential for: test-principal
I0327 17:59:30.604861 14553 credentials.hpp:37] Loading credentials for authentication from
'/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_6G3q4g/http_credentials'
I0327 17:59:30.605034 14553 http.cpp:959] Creating default 'basic' HTTP authenticator for
realm 'mesos-agent-readonly'
I0327 17:59:30.605226 14553 disk_profile_adaptor.cpp:80] Creating default disk profile adaptor
module
I0327 17:59:30.606145 14553 slave.cpp:609] Agent resources: [{"name":"cpus","scalar":{"value":2.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":1024.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":1024.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}]
I0327 17:59:30.606341 14553 slave.cpp:617] Agent attributes: [  ]
I0327 17:59:30.606360 14553 slave.cpp:626] Agent hostname: gru1.hw.ca1.mesosphere.com
I0327 17:59:30.607257 14566 task_status_update_manager.cpp:181] Pausing sending task status
updates
I0327 17:59:30.607941 14553 state.cpp:66] Recovering state from '/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_UPSwHQ/meta'
I0327 17:59:30.608186 14559 task_status_update_manager.cpp:207] Recovering task status update
manager
I0327 17:59:30.608361 14557 containerizer.cpp:674] Recovering containerizer
I0327 17:59:30.609563 14557 provisioner.cpp:495] Provisioner recovery complete
I0327 17:59:30.610193 14553 slave.cpp:7212] Finished recovery
I0327 17:59:30.630548 14553 slave.cpp:1260] New master detected at master@192.99.40.208:39245
I0327 17:59:30.630707 14553 slave.cpp:1315] Detecting new master
I0327 17:59:30.630625 14555 task_status_update_manager.cpp:181] Pausing sending task status
updates
I0327 17:59:30.637944 14558 slave.cpp:1342] Authenticating with master master@192.99.40.208:39245
I0327 17:59:30.637995 14558 slave.cpp:1351] Using default CRAM-MD5 authenticatee
I0327 17:59:30.638164 14558 authenticatee.cpp:121] Creating new client SASL connection
I0327 17:59:30.638337 14558 master.cpp:9206] Authenticating slave(49)@192.99.40.208:39245
I0327 17:59:30.638453 14558 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(49)@192.99.40.208:39245
I0327 17:59:30.638623 14558 authenticator.cpp:98] Creating new server SASL connection
I0327 17:59:30.638744 14558 authenticatee.cpp:213] Received SASL authentication mechanisms:
CRAM-MD5
I0327 17:59:30.638761 14558 authenticatee.cpp:239] Attempting to authenticate with mechanism
'CRAM-MD5'
I0327 17:59:30.638825 14558 authenticator.cpp:204] Received SASL authentication start
I0327 17:59:30.638869 14558 authenticator.cpp:326] Authentication requires more steps
I0327 17:59:30.638929 14558 authenticatee.cpp:259] Received SASL authentication step
I0327 17:59:30.639001 14558 authenticator.cpp:232] Received SASL authentication step
I0327 17:59:30.639019 14558 auxprop.cpp:109] Request to lookup properties for user: 'test-principal'
realm: 'gru1.hw.ca1.mesosphere.com' server FQDN: 'gru1.hw.ca1.mesosphere.com' SASL_AUXPROP_VERIFY_AGAINST_HASH:
false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
I0327 17:59:30.639027 14558 auxprop.cpp:181] Looking up auxiliary property '*userPassword'
I0327 17:59:30.639051 14558 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
I0327 17:59:30.639065 14558 auxprop.cpp:109] Request to lookup properties for user: 'test-principal'
realm: 'gru1.hw.ca1.mesosphere.com' server FQDN: 'gru1.hw.ca1.mesosphere.com' SASL_AUXPROP_VERIFY_AGAINST_HASH:
false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true 
I0327 17:59:30.639071 14558 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since
SASL_AUXPROP_AUTHZID == true
I0327 17:59:30.639077 14558 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5'
since SASL_AUXPROP_AUTHZID == true
I0327 17:59:30.639088 14558 authenticator.cpp:318] Authentication success
I0327 17:59:30.639194 14558 authenticatee.cpp:299] Authentication success
I0327 17:59:30.639256 14558 master.cpp:9236] Successfully authenticated principal 'test-principal'
at slave(49)@192.99.40.208:39245
I0327 17:59:30.639313 14558 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(49)@192.99.40.208:39245
I0327 17:59:30.639504 14558 slave.cpp:1434] Successfully authenticated with master master@192.99.40.208:39245
I0327 17:59:30.639747 14558 slave.cpp:1877] Will retry registration in 3.127902ms if necessary
I0327 17:59:30.639966 14558 master.cpp:6326] Received register agent message from slave(49)@192.99.40.208:39245
(gru1.hw.ca1.mesosphere.com)
I0327 17:59:30.640077 14558 master.cpp:3802] Authorizing agent providing resources 'cpus:2;
mem:1024; disk:1024; ports:[31000-32000]' with principal 'test-principal'
I0327 17:59:30.640739 14558 master.cpp:6397] Authorized registration of agent at slave(49)@192.99.40.208:39245
(gru1.hw.ca1.mesosphere.com)
I0327 17:59:30.640861 14558 master.cpp:6509] Registering agent at slave(49)@192.99.40.208:39245
(gru1.hw.ca1.mesosphere.com) with id f201dade-c73e-42ab-8379-fde33e1d6b29-S0
I0327 17:59:30.641232 14558 registrar.cpp:495] Applied 1 operations in 134909ns; attempting
to update the registry
I0327 17:59:30.641722 14558 registrar.cpp:552] Successfully updated the registry in 434944ns
I0327 17:59:30.641881 14558 master.cpp:6557] Admitted agent f201dade-c73e-42ab-8379-fde33e1d6b29-S0
at slave(49)@192.99.40.208:39245 (gru1.hw.ca1.mesosphere.com)
I0327 17:59:30.642678 14562 hierarchical.cpp:574] Added agent f201dade-c73e-42ab-8379-fde33e1d6b29-S0
(gru1.hw.ca1.mesosphere.com) with cpus:2; mem:1024; disk:1024; ports:[31000-32000] (allocated:
{})
I0327 17:59:30.642866 14562 hierarchical.cpp:1517] Performed allocation for 1 agents in 94663ns
I0327 17:59:30.646689 14555 slave.cpp:1877] Will retry registration in 19.583324ms if necessary
I0327 17:59:30.646885 14558 master.cpp:6602] Registered agent f201dade-c73e-42ab-8379-fde33e1d6b29-S0
at slave(49)@192.99.40.208:39245 (gru1.hw.ca1.mesosphere.com) with cpus:2; mem:1024; disk:1024;
ports:[31000-32000]
I0327 17:59:30.647135 14558 master.cpp:6326] Received register agent message from slave(49)@192.99.40.208:39245
(gru1.hw.ca1.mesosphere.com)
I0327 17:59:30.654551 14558 master.cpp:3802] Authorizing agent providing resources 'cpus:2;
mem:1024; disk:1024; ports:[31000-32000]' with principal 'test-principal'
I0327 17:59:30.647011 14553 slave.cpp:1481] Registered with master master@192.99.40.208:39245;
given agent ID f201dade-c73e-42ab-8379-fde33e1d6b29-S0
I0327 17:59:30.655179 14553 slave.cpp:1501] Checkpointing SlaveInfo to '/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_UPSwHQ/meta/slaves/f201dade-c73e-42ab-8379-fde33e1d6b29-S0/slave.info'
I0327 17:59:30.655529 14553 slave.cpp:1548] Forwarding agent update {"operations":{},"resource_version_uuid":{"value":"OOcru9bcSaChQRTLZuysfA=="},"slave_id":{"value":"f201dade-c73e-42ab-8379-fde33e1d6b29-S0"},"update_oversubscribed_resources":false}
I0327 17:59:30.655226 14556 task_status_update_manager.cpp:188] Resuming sending task status
updates
I0327 17:59:30.663825 14558 master.cpp:7639] Ignoring update on agent f201dade-c73e-42ab-8379-fde33e1d6b29-S0
at slave(49)@192.99.40.208:39245 (gru1.hw.ca1.mesosphere.com) as it reports no changes
I0327 17:59:30.676502 14558 master.cpp:6397] Authorized registration of agent at slave(49)@192.99.40.208:39245
(gru1.hw.ca1.mesosphere.com)
I0327 17:59:30.676597 14558 master.cpp:6488] Agent f201dade-c73e-42ab-8379-fde33e1d6b29-S0
at slave(49)@192.99.40.208:39245 (gru1.hw.ca1.mesosphere.com) already registered, resending
acknowledgement
W0327 17:59:30.676754 14558 slave.cpp:1530] Already registered with master master@192.99.40.208:39245
I0327 17:59:30.676802 14558 slave.cpp:1548] Forwarding agent update {"operations":{},"resource_version_uuid":{"value":"OOcru9bcSaChQRTLZuysfA=="},"slave_id":{"value":"f201dade-c73e-42ab-8379-fde33e1d6b29-S0"},"update_oversubscribed_resources":false}
I0327 17:59:30.677342 14558 master.cpp:7639] Ignoring update on agent f201dade-c73e-42ab-8379-fde33e1d6b29-S0
at slave(49)@192.99.40.208:39245 (gru1.hw.ca1.mesosphere.com) as it reports no changes
I0327 17:59:30.666944 14559 hierarchical.cpp:1517] Performed allocation for 1 agents in 86426ns
../src/tests/oversubscription_tests.cpp:319: Failure
Value of: update.isReady()
  Actual: true
Expected: false
{noformat}

> OversubscriptionTest.ForwardUpdateSlaveMessage is flaky
> -------------------------------------------------------
>
>                 Key: MESOS-8733
>                 URL: https://issues.apache.org/jira/browse/MESOS-8733
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 1.6.0
>            Reporter: Benjamin Bannier
>            Assignee: Benjamin Bannier
>            Priority: Major
>              Labels: flaky-test, mesosphere
>
> Observed this failure in CI,
> {noformat}
> [ RUN ] OversubscriptionTest.ForwardUpdateSlaveMessage
> 3: I0327 10:12:04.032042 18320 cluster.cpp:172] Creating default 'local' authorizer
> 3: I0327 10:12:04.035696 18321 master.cpp:463] Master b5c97327-11cc-4183-82ed-75e62b71cc58
(1931c74e0c4c) started on 172.17.0.2:35020
> 3: I0327 10:12:04.035732 18321 master.cpp:466] Flags at startup: --acls="" --agent_ping_timeout="15secs"
--agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="HierarchicalDRF"
--authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true"
--authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticators="crammd5"
--authorizers="local" --credentials="/tmp/4j65Va/credentials" --filter_gpu_resources="true"
--framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic"
--http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true"
--logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50"
--max_completed_tasks_per_framework="1000" --max_unreachable_tasks_per_framework="1000" --port="5050"
--quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" --registry_fetch_timeout="1mins"
--registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400"
--registry_store_timeout="100secs" --registry_strict="false" --require_agent_domain="false"
--root_submissions="true" --user_sorter="drf" --version="false" --webui_dir="/usr/local/share/mesos/webui"
--work_dir="/tmp/4j65Va/master" --zk_session_timeout="10secs"
> 3: I0327 10:12:04.036129 18321 master.cpp:515] Master only allowing authenticated frameworks
to register
> 3: I0327 10:12:04.036140 18321 master.cpp:521] Master only allowing authenticated agents
to register
> 3: I0327 10:12:04.036147 18321 master.cpp:527] Master only allowing authenticated HTTP
frameworks to register
> 3: I0327 10:12:04.036156 18321 credentials.hpp:37] Loading credentials for authentication
from '/tmp/4j65Va/credentials'
> 3: I0327 10:12:04.036468 18321 master.cpp:571] Using default 'crammd5' authenticator
> 3: I0327 10:12:04.036643 18321 http.cpp:959] Creating default 'basic' HTTP authenticator
for realm 'mesos-master-readonly'
> 3: I0327 10:12:04.036834 18321 http.cpp:959] Creating default 'basic' HTTP authenticator
for realm 'mesos-master-readwrite'
> 3: I0327 10:12:04.037005 18321 http.cpp:959] Creating default 'basic' HTTP authenticator
for realm 'mesos-master-scheduler'
> 3: I0327 10:12:04.037170 18321 master.cpp:652] Authorization enabled
> 3: I0327 10:12:04.037370 18338 whitelist_watcher.cpp:77] No whitelist given
> 3: I0327 10:12:04.037374 18322 hierarchical.cpp:175] Initialized hierarchical allocator
process
> 3: I0327 10:12:04.040787 18321 master.cpp:2126] Elected as the leading master!
> 3: I0327 10:12:04.040812 18321 master.cpp:1682] Recovering from registrar
> 3: I0327 10:12:04.040966 18342 registrar.cpp:347] Recovering registrar
> 3: I0327 10:12:04.041606 18330 registrar.cpp:391] Successfully fetched the registry (0B)
in 590848ns
> 3: I0327 10:12:04.041764 18330 registrar.cpp:495] Applied 1 operations in 57052ns; attempting
to update the registry
> 3: I0327 10:12:04.042466 18330 registrar.cpp:552] Successfully updated the registry in
638976ns
> 3: I0327 10:12:04.042615 18330 registrar.cpp:424] Successfully recovered registrar
> 3: I0327 10:12:04.043128 18339 master.cpp:1796] Recovered 0 agents from the registry
(135B); allowing 10mins for agents to reregister
> 3: I0327 10:12:04.043151 18326 hierarchical.cpp:213] Skipping recovery of hierarchical
allocator: nothing to recover
> 3: W0327 10:12:04.048898 18320 process.cpp:2805] Attempted to spawn already running process
files@172.17.0.2:35020
> 3: I0327 10:12:04.050076 18320 containerizer.cpp:304] Using isolation { environment_secret,
posix/cpu, posix/mem, filesystem/posix, network/cni }
> 3: W0327 10:12:04.050720 18320 backend.cpp:76] Failed to create 'aufs' backend: AufsBackend
requires root privileges
> 3: W0327 10:12:04.050746 18320 backend.cpp:76] Failed to create 'bind' backend: BindBackend
requires root privileges
> 3: I0327 10:12:04.050791 18320 provisioner.cpp:299] Using default backend 'copy'
> 3: I0327 10:12:04.053491 18320 cluster.cpp:460] Creating default 'local' authorizer
> 3: I0327 10:12:04.056531 18326 slave.cpp:261] Mesos agent started on (546)@172.17.0.2:35020
> 3: I0327 10:12:04.056571 18326 slave.cpp:262] Flags at startup: --acls="" --appc_simple_discovery_uri_prefix="http://"
--appc_store_dir="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_YeoNx5/store/appc" --authenticate_http_executors="true"
--authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticatee="crammd5"
--authentication_backoff_factor="1secs" --authorizer="local" --cgroups_cpu_enable_pids_and_tids_count="false"
--cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false"
--cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_YeoNx5/credential"
--default_role="*" --disallow_sharing_agent_pid_namespace="false" --disk_watch_interval="1mins"
--docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io"
--docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns"
--docker_store_dir="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_YeoNx5/store/docker"
--docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false"
--executor_registration_timeout="1mins" --executor_reregistration_timeout="2secs" --executor_shutdown_grace_period="5secs"
--fetcher_cache_dir="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_YeoNx5/fetch" --fetcher_cache_size="2GB"
--frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false"
--hostname_lookup="true" --http_command_executor="false" --http_credentials="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_YeoNx5/http_credentials"
--http_heartbeat_interval="30secs" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem"
--jwt_secret_key="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_YeoNx5/jwt_secret_key"
--launcher="posix" --launcher_dir="/tmp/SRC/build/src" --logbufsecs="0" --logging_level="INFO"
--max_completed_executors_per_framework="150" --oversubscribed_resources_interval="15secs"
--perf_duration="10secs" --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns"
--quiet="false" --reconfiguration_policy="equal" --recover="reconnect" --recovery_timeout="15mins"
--registration_backoff_factor="10ms" --resources="cpus:2;gpus:0;mem:1024;disk:1024;ports:[31000-32000]"
--revocable_cpu_low_priority="true" --runtime_dir="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_YeoNx5"
--sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --systemd_enable_support="true"
--systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_8qkWeD"
--zk_session_timeout="10secs"
> 3: I0327 10:12:04.057035 18326 credentials.hpp:86] Loading credential for authentication
from '/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_YeoNx5/credential'
> 3: I0327 10:12:04.057212 18326 slave.cpp:294] Agent using credential for: test-principal
> 3: I0327 10:12:04.057235 18326 credentials.hpp:37] Loading credentials for authentication
from '/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_YeoNx5/http_credentials'
> 3: I0327 10:12:04.057521 18326 http.cpp:959] Creating default 'basic' HTTP authenticator
for realm 'mesos-agent-executor'
> 3: I0327 10:12:04.057674 18326 http.cpp:980] Creating default 'jwt' HTTP authenticator
for realm 'mesos-agent-executor'
> 3: I0327 10:12:04.057922 18326 http.cpp:959] Creating default 'basic' HTTP authenticator
for realm 'mesos-agent-readonly'
> 3: I0327 10:12:04.058051 18326 http.cpp:980] Creating default 'jwt' HTTP authenticator
for realm 'mesos-agent-readonly'
> 3: I0327 10:12:04.058272 18326 http.cpp:959] Creating default 'basic' HTTP authenticator
for realm 'mesos-agent-readwrite'
> 3: I0327 10:12:04.058408 18326 http.cpp:980] Creating default 'jwt' HTTP authenticator
for realm 'mesos-agent-readwrite'
> 3: I0327 10:12:04.058784 18326 disk_profile_adaptor.cpp:80] Creating default disk profile
adaptor module
> 3: I0327 10:12:04.060353 18326 slave.cpp:609] Agent resources: [{"name":"cpus","scalar":{"value":2.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":1024.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":1024.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}]
> 3: I0327 10:12:04.060569 18326 slave.cpp:617] Agent attributes: [ ]
> 3: I0327 10:12:04.060583 18326 slave.cpp:626] Agent hostname: 1931c74e0c4c
> 3: I0327 10:12:04.060739 18330 task_status_update_manager.cpp:181] Pausing sending task
status updates
> 3: I0327 10:12:04.062536 18331 state.cpp:66] Recovering state from '/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_8qkWeD/meta'
> 3: I0327 10:12:04.062916 18322 task_status_update_manager.cpp:207] Recovering task status
update manager
> 3: I0327 10:12:04.063143 18323 containerizer.cpp:674] Recovering containerizer
> 3: I0327 10:12:04.064961 18330 provisioner.cpp:495] Provisioner recovery complete
> 3: I0327 10:12:04.065325 18336 slave.cpp:7212] Finished recovery
> 3: I0327 10:12:04.066190 18331 task_status_update_manager.cpp:181] Pausing sending task
status updates
> 3: I0327 10:12:04.066213 18336 slave.cpp:1260] New master detected at master@172.17.0.2:35020
> 3: I0327 10:12:04.066336 18336 slave.cpp:1315] Detecting new master
> 3: I0327 10:12:04.067641 18338 slave.cpp:1342] Authenticating with master master@172.17.0.2:35020
> 3: I0327 10:12:04.067776 18338 slave.cpp:1351] Using default CRAM-MD5 authenticatee
> 3: I0327 10:12:04.068178 18322 authenticatee.cpp:121] Creating new client SASL connection
> 3: I0327 10:12:04.068650 18324 master.cpp:9206] Authenticating slave(546)@172.17.0.2:35020
> 3: I0327 10:12:04.068862 18321 authenticator.cpp:414] Starting authentication session
for crammd5-authenticatee(1085)@172.17.0.2:35020
> 3: I0327 10:12:04.069332 18327 authenticator.cpp:98] Creating new server SASL connection
> 3: I0327 10:12:04.069733 18335 authenticatee.cpp:213] Received SASL authentication mechanisms:
CRAM-MD5
> 3: I0327 10:12:04.069778 18335 authenticatee.cpp:239] Attempting to authenticate with
mechanism 'CRAM-MD5'
> 3: I0327 10:12:04.070008 18332 authenticator.cpp:204] Received SASL authentication start
> 3: I0327 10:12:04.070113 18332 authenticator.cpp:326] Authentication requires more steps
> 3: I0327 10:12:04.070336 18323 authenticatee.cpp:259] Received SASL authentication step
> 3: I0327 10:12:04.070583 18342 authenticator.cpp:232] Received SASL authentication step
> 3: I0327 10:12:04.070636 18342 auxprop.cpp:109] Request to lookup properties for user:
'test-principal' realm: '1931c74e0c4c' server FQDN: '1931c74e0c4c' SASL_AUXPROP_VERIFY_AGAINST_HASH:
false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false
> 3: I0327 10:12:04.070659 18342 auxprop.cpp:181] Looking up auxiliary property '*userPassword'
> 3: I0327 10:12:04.070724 18342 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
> 3: I0327 10:12:04.070760 18342 auxprop.cpp:109] Request to lookup properties for user:
'test-principal' realm: '1931c74e0c4c' server FQDN: '1931c74e0c4c' SASL_AUXPROP_VERIFY_AGAINST_HASH:
false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true
> 3: I0327 10:12:04.070824 18342 auxprop.cpp:131] Skipping auxiliary property '*userPassword'
since SASL_AUXPROP_AUTHZID == true
> 3: I0327 10:12:04.070832 18342 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5'
since SASL_AUXPROP_AUTHZID == true
> 3: I0327 10:12:04.070847 18342 authenticator.cpp:318] Authentication success
> 3: I0327 10:12:04.070940 18334 authenticatee.cpp:299] Authentication success
> 3: I0327 10:12:04.071063 18333 master.cpp:9236] Successfully authenticated principal
'test-principal' at slave(546)@172.17.0.2:35020
> 3: I0327 10:12:04.071118 18337 authenticator.cpp:432] Authentication session cleanup
for crammd5-authenticatee(1085)@172.17.0.2:35020
> 3: I0327 10:12:04.071286 18328 slave.cpp:1434] Successfully authenticated with master
master@172.17.0.2:35020
> 3: I0327 10:12:04.071718 18328 slave.cpp:1877] Will retry registration in 383294ns if
necessary
> 3: I0327 10:12:04.071923 18330 master.cpp:6326] Received register agent message from
slave(546)@172.17.0.2:35020 (1931c74e0c4c)
> 3: I0327 10:12:04.072154 18330 master.cpp:3802] Authorizing agent providing resources
'cpus:2; mem:1024; disk:1024; ports:[31000-32000]' with principal 'test-principal'
> 3: I0327 10:12:04.072834 18331 master.cpp:6397] Authorized registration of agent at slave(546)@172.17.0.2:35020
(1931c74e0c4c)
> 3: I0327 10:12:04.072928 18331 master.cpp:6509] Registering agent at slave(546)@172.17.0.2:35020
(1931c74e0c4c) with id b5c97327-11cc-4183-82ed-75e62b71cc58-S0
> 3: I0327 10:12:04.073508 18329 registrar.cpp:495] Applied 1 operations in 237308ns; attempting
to update the registry
> 3: I0327 10:12:04.074270 18321 registrar.cpp:552] Successfully updated the registry in
675072ns
> 3: I0327 10:12:04.074518 18335 master.cpp:6557] Admitted agent b5c97327-11cc-4183-82ed-75e62b71cc58-S0
at slave(546)@172.17.0.2:35020 (1931c74e0c4c)
> 3: I0327 10:12:04.075176 18335 master.cpp:6602] Registered agent b5c97327-11cc-4183-82ed-75e62b71cc58-S0
at slave(546)@172.17.0.2:35020 (1931c74e0c4c) with cpus:2; mem:1024; disk:1024; ports:[31000-32000]
> 3: I0327 10:12:04.075368 18323 slave.cpp:1877] Will retry registration in 26.831215ms
if necessary
> 3: I0327 10:12:04.075518 18342 master.cpp:6326] Received register agent message from
slave(546)@172.17.0.2:35020 (1931c74e0c4c)
> 3: I0327 10:12:04.075597 18323 slave.cpp:1481] Registered with master master@172.17.0.2:35020;
given agent ID b5c97327-11cc-4183-82ed-75e62b71cc58-S0
> 3: I0327 10:12:04.075626 18334 hierarchical.cpp:574] Added agent b5c97327-11cc-4183-82ed-75e62b71cc58-S0
(1931c74e0c4c) with cpus:2; mem:1024; disk:1024; ports:[31000-32000] (allocated: {})
> 3: I0327 10:12:04.075739 18341 task_status_update_manager.cpp:188] Resuming sending task
status updates
> 3: I0327 10:12:04.075709 18342 master.cpp:3802] Authorizing agent providing resources
'cpus:2; mem:1024; disk:1024; ports:[31000-32000]' with principal 'test-principal'
> 3: I0327 10:12:04.075896 18323 slave.cpp:1501] Checkpointing SlaveInfo to '/tmp/OversubscriptionTest_ForwardUpdateSlaveMessage_8qkWeD/meta/slaves/b5c97327-11cc-4183-82ed-75e62b71cc58-S0/slave.info'
> 3: I0327 10:12:04.075943 18334 hierarchical.cpp:1517] Performed allocation for 1 agents
in 169342ns
> 3: I0327 10:12:04.076222 18339 master.cpp:6397] Authorized registration of agent at slave(546)@172.17.0.2:35020
(1931c74e0c4c)
> 3: I0327 10:12:04.076292 18339 master.cpp:6488] Agent b5c97327-11cc-4183-82ed-75e62b71cc58-S0
at slave(546)@172.17.0.2:35020 (1931c74e0c4c) already registered, resending acknowledgement
> 3: I0327 10:12:04.076493 18323 slave.cpp:1548] Forwarding agent update {"operations":{},"resource_version_uuid":{"value":"rd+fCEbpQsWYa07c\/1tXpw=="},"slave_id":{"value":"b5c97327-11cc-4183-82ed-75e62b71cc58-S0"},"update_oversubscribed_resources":false}
> 3: W0327 10:12:04.076702 18323 slave.cpp:1530] Already registered with master master@172.17.0.2:35020
> 3: I0327 10:12:04.076735 18323 slave.cpp:1548] Forwarding agent update {"operations":{},"resource_version_uuid":{"value":"rd+fCEbpQsWYa07c\/1tXpw=="},"slave_id":{"value":"b5c97327-11cc-4183-82ed-75e62b71cc58-S0"},"update_oversubscribed_resources":false}
> 3: I0327 10:12:04.077424 18343 master.cpp:7639] Ignoring update on agent b5c97327-11cc-4183-82ed-75e62b71cc58-S0
at slave(546)@172.17.0.2:35020 (1931c74e0c4c) as it reports no changes
> 3: I0327 10:12:04.078074 18343 master.cpp:7639] Ignoring update on agent b5c97327-11cc-4183-82ed-75e62b71cc58-S0
at slave(546)@172.17.0.2:35020 (1931c74e0c4c) as it reports no changes
> 3: I0327 10:12:04.080782 18341 hierarchical.cpp:1517] Performed allocation for 1 agents
in 140840ns
> 3: /tmp/SRC/src/tests/oversubscription_tests.cpp:319: Failure
> 3: Value of: update.isReady()
> 3: Actual: true
> 3: Expected: false
> 3: I0327 10:12:04.082888 18321 slave.cpp:919] Agent terminating
> 3: I0327 10:12:04.083225 18335 master.cpp:1295] Agent b5c97327-11cc-4183-82ed-75e62b71cc58-S0
at slave(546)@172.17.0.2:35020 (1931c74e0c4c) disconnected
> 3: I0327 10:12:04.083271 18335 master.cpp:3283] Disconnecting agent b5c97327-11cc-4183-82ed-75e62b71cc58-S0
at slave(546)@172.17.0.2:35020 (1931c74e0c4c)
> 3: I0327 10:12:04.083369 18335 master.cpp:3302] Deactivating agent b5c97327-11cc-4183-82ed-75e62b71cc58-S0
at slave(546)@172.17.0.2:35020 (1931c74e0c4c)
> 3: I0327 10:12:04.083616 18341 hierarchical.cpp:766] Agent b5c97327-11cc-4183-82ed-75e62b71cc58-S0
deactivated
> 3: I0327 10:12:04.092846 18320 master.cpp:1137] Master terminating
> 3: I0327 10:12:04.093572 18323 hierarchical.cpp:609] Removed agent b5c97327-11cc-4183-82ed-75e62b71cc58-S0
> 3: [ FAILED ] OversubscriptionTest.ForwardUpdateSlaveMessage (68 ms){noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message