mesos-issues mailing list archives

From "Greg Mann (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MESOS-6137) Segfault during DiskResource/PersistentVolumeTest.IncompatibleCheckpointedResources/0
Date Thu, 08 Sep 2016 04:39:20 GMT
Greg Mann created MESOS-6137:
--------------------------------

             Summary: Segfault during DiskResource/PersistentVolumeTest.IncompatibleCheckpointedResources/0
                 Key: MESOS-6137
                 URL: https://issues.apache.org/jira/browse/MESOS-6137
             Project: Mesos
          Issue Type: Bug
    Affects Versions: 1.0.1
         Environment: Ubuntu 14.04, non-SSL, libev
            Reporter: Greg Mann
            Assignee: Greg Mann


Observed in our internal CI:
{code}
I0906 20:01:45.235483 29082 master.cpp:379] Master 9fd91e5d-4257-427d-a7da-3f18d99c8ffa (0a1dc2da838b)
started on 172.17.0.3:60366
I0906 20:01:45.235513 29082 master.cpp:381] Flags at startup: --acls="" --agent_ping_timeout="15secs"
--agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="HierarchicalDRF"
--authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true"
--authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticators="crammd5"
--authorizers="local" --credentials="/tmp/ze1TG1/credentials" --framework_sorter="drf" --help="false"
--hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic"
--initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO"
--max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000"
--quiet="false" --recovery_agent_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins"
--registry_store_timeout="100secs" --registry_strict="true" --root_submissions="true" --user_sorter="drf"
--version="false" --webui_dir="/mesos/mesos-1.1.0/_inst/share/mesos/webui" --work_dir="/tmp/ze1TG1/master"
--zk_session_timeout="10secs"
I0906 20:01:45.236022 29082 master.cpp:431] Master only allowing authenticated frameworks
to register
I0906 20:01:45.236037 29082 master.cpp:445] Master only allowing authenticated agents to register
I0906 20:01:45.236045 29082 master.cpp:458] Master only allowing authenticated HTTP frameworks
to register
I0906 20:01:45.236054 29082 credentials.hpp:37] Loading credentials for authentication from
'/tmp/ze1TG1/credentials'
I0906 20:01:45.236392 29082 master.cpp:503] Using default 'crammd5' authenticator
I0906 20:01:45.236654 29079 replica.cpp:673] Replica in STARTING status received a broadcasted
recover request from __req_res__(6359)@172.17.0.3:60366
I0906 20:01:45.236687 29082 http.cpp:883] Using default 'basic' HTTP authenticator for realm
'mesos-master-readonly'
I0906 20:01:45.236927 29082 http.cpp:883] Using default 'basic' HTTP authenticator for realm
'mesos-master-readwrite'
I0906 20:01:45.237095 29079 recover.cpp:197] Received a recover response from a replica in
STARTING status
I0906 20:01:45.237117 29082 http.cpp:883] Using default 'basic' HTTP authenticator for realm
'mesos-master-scheduler'
I0906 20:01:45.237340 29082 master.cpp:583] Authorization enabled
I0906 20:01:45.237663 29080 whitelist_watcher.cpp:77] No whitelist given
I0906 20:01:45.237685 29075 hierarchical.cpp:149] Initialized hierarchical allocator process
I0906 20:01:45.237835 29085 recover.cpp:568] Updating replica status to VOTING
I0906 20:01:45.238531 29081 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took
378674ns
I0906 20:01:45.238560 29081 replica.cpp:320] Persisted replica status to VOTING
I0906 20:01:45.238685 29073 recover.cpp:582] Successfully joined the Paxos group
I0906 20:01:45.238975 29073 recover.cpp:466] Recover process terminated
I0906 20:01:45.240437 29078 master.cpp:1850] Elected as the leading master!
I0906 20:01:45.240468 29078 master.cpp:1551] Recovering from registrar
I0906 20:01:45.240592 29080 registrar.cpp:332] Recovering registrar
I0906 20:01:45.241178 29075 log.cpp:553] Attempting to start the writer
I0906 20:01:45.242928 29072 replica.cpp:493] Replica received implicit promise request from
__req_res__(6360)@172.17.0.3:60366 with proposal 1
I0906 20:01:45.243324 29072 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took
335676ns
I0906 20:01:45.243350 29072 replica.cpp:342] Persisted promised to 1
I0906 20:01:45.244056 29081 coordinator.cpp:238] Coordinator attempting to fill missing positions
I0906 20:01:45.245538 29078 replica.cpp:388] Replica received explicit promise request from
__req_res__(6361)@172.17.0.3:60366 for position 0 with proposal 2
I0906 20:01:45.245995 29078 leveldb.cpp:341] Persisting action (8 bytes) to leveldb took 412163ns
I0906 20:01:45.246021 29078 replica.cpp:708] Persisted action NOP at position 0
I0906 20:01:45.247329 29082 replica.cpp:537] Replica received write request for position 0
from __req_res__(6362)@172.17.0.3:60366
I0906 20:01:45.247406 29082 leveldb.cpp:436] Reading position from leveldb took 35845ns
I0906 20:01:45.247989 29082 leveldb.cpp:341] Persisting action (14 bytes) to leveldb took
541972ns
I0906 20:01:45.248015 29082 replica.cpp:708] Persisted action NOP at position 0
I0906 20:01:45.248556 29084 replica.cpp:691] Replica received learned notice for position
0 from @0.0.0.0:0
I0906 20:01:45.249241 29084 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took
647885ns
I0906 20:01:45.249271 29084 replica.cpp:708] Persisted action NOP at position 0
I0906 20:01:45.249914 29085 log.cpp:569] Writer started with ending position 0
I0906 20:01:45.251022 29085 leveldb.cpp:436] Reading position from leveldb took 31388ns
I0906 20:01:45.252149 29082 registrar.cpp:365] Successfully fetched the registry (0B) in 11.51104ms
I0906 20:01:45.252271 29082 registrar.cpp:464] Applied 1 operations in 21341ns; attempting
to update the registry
I0906 20:01:45.253073 29078 log.cpp:577] Attempting to append 168 bytes to the log
I0906 20:01:45.253250 29081 coordinator.cpp:348] Coordinator attempting to write APPEND action
at position 1
I0906 20:01:45.254175 29070 replica.cpp:537] Replica received write request for position 1
from __req_res__(6363)@172.17.0.3:60366
I0906 20:01:45.254654 29070 leveldb.cpp:341] Persisting action (187 bytes) to leveldb took
435222ns
I0906 20:01:45.254683 29070 replica.cpp:708] Persisted action APPEND at position 1
I0906 20:01:45.255455 29080 replica.cpp:691] Replica received learned notice for position
1 from @0.0.0.0:0
I0906 20:01:45.255926 29080 leveldb.cpp:341] Persisting action (189 bytes) to leveldb took
431510ns
I0906 20:01:45.255980 29080 replica.cpp:708] Persisted action APPEND at position 1
I0906 20:01:45.257114 29073 registrar.cpp:509] Successfully updated the registry in 4.780032ms
I0906 20:01:45.257305 29073 registrar.cpp:395] Successfully recovered registrar
I0906 20:01:45.257380 29082 log.cpp:596] Attempting to truncate the log to 1
I0906 20:01:45.257515 29076 coordinator.cpp:348] Coordinator attempting to write TRUNCATE
action at position 2
I0906 20:01:45.258153 29071 master.cpp:1659] Recovered 0 agents from the registry (129B);
allowing 10mins for agents to re-register
I0906 20:01:45.258191 29077 hierarchical.cpp:176] Skipping recovery of hierarchical allocator:
nothing to recover
I0906 20:01:45.258608 29082 replica.cpp:537] Replica received write request for position 2
from __req_res__(6364)@172.17.0.3:60366
I0906 20:01:45.259039 29082 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took
388229ns
I0906 20:01:45.259068 29082 replica.cpp:708] Persisted action TRUNCATE at position 2
I0906 20:01:45.259778 29071 replica.cpp:691] Replica received learned notice for position
2 from @0.0.0.0:0
I0906 20:01:45.260226 29071 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took
411069ns
I0906 20:01:45.260299 29071 leveldb.cpp:399] Deleting ~1 keys from leveldb took 40611ns
I0906 20:01:45.260321 29071 replica.cpp:708] Persisted action TRUNCATE at position 2
I0906 20:01:45.266494 29085 slave.cpp:205] Mesos agent started on @172.17.0.3:60366
I0906 20:01:45.266513 29085 slave.cpp:206] Flags at startup: --acls="" --appc_simple_discovery_uri_prefix="http://"
--appc_store_dir="/tmp/mesos/store/appc" --authenticate_http_readonly="true" --authenticate_http_readwrite="true"
--authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --cgroups_cpu_enable_pids_and_tids_count="false"
--cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false"
--cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/tmp/DiskResource_PersistentVolumeTest_IncompatibleCheckpointedResources_0_SRjbqt/credential"
--default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true"
--docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock"
--docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume"
--enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs"
--fetcher_cache_dir="/tmp/DiskResource_PersistentVolumeTest_IncompatibleCheckpointedResources_0_SRjbqt/fetch"
--fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1"
--hadoop_home="" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_command_executor="false"
--http_credentials="/tmp/DiskResource_PersistentVolumeTest_IncompatibleCheckpointedResources_0_SRjbqt/http_credentials"
--image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem"
--launcher_dir="/mesos/mesos-1.1.0/_build/src" --logbufsecs="0" --logging_level="INFO" --oversubscribed_resources_interval="15secs"
--perf_duration="10secs" --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false"
--recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="10ms" --resources="[{"name":"cpus","role":"*","scalar":{"value":2.0},"type":"SCALAR"},{"name":"mem","role":"*","scalar":{"value":2048.0},"type":"SCALAR"},{"name":"disk","role":"role1","scalar":{"value":4096.0},"type":"SCALAR"}]"
--revocable_cpu_low_priority="true" --runtime_dir="/tmp/DiskResource_PersistentVolumeTest_IncompatibleCheckpointedResources_0_SRjbqt"
--sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --systemd_enable_support="true"
--systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/DiskResource_PersistentVolumeTest_IncompatibleCheckpointedResources_0_DFKGtZ"
I0906 20:01:45.266980 29085 credentials.hpp:86] Loading credential for authentication from
'/tmp/DiskResource_PersistentVolumeTest_IncompatibleCheckpointedResources_0_SRjbqt/credential'
I0906 20:01:45.267125 29085 slave.cpp:343] Agent using credential for: test-principal
I0906 20:01:45.267143 29085 credentials.hpp:37] Loading credentials for authentication from
'/tmp/DiskResource_PersistentVolumeTest_IncompatibleCheckpointedResources_0_SRjbqt/http_credentials'
I0906 20:01:45.267366 29085 http.cpp:883] Using default 'basic' HTTP authenticator for realm
'mesos-agent-readonly'
I0906 20:01:45.267477 29085 http.cpp:883] Using default 'basic' HTTP authenticator for realm
'mesos-agent-readwrite'
I0906 20:01:45.267544 29051 sched.cpp:226] Version: 1.1.0
I0906 20:01:45.268095 29074 sched.cpp:330] New master detected at master@172.17.0.3:60366
I0906 20:01:45.268167 29074 sched.cpp:396] Authenticating with master master@172.17.0.3:60366
I0906 20:01:45.268182 29074 sched.cpp:403] Using default CRAM-MD5 authenticatee
I0906 20:01:45.268357 29078 authenticatee.cpp:121] Creating new client SASL connection
I0906 20:01:45.268568 29077 master.cpp:6167] Authenticating scheduler-d55313b3-c4cf-4517-843c-56aa3f74d9f7@172.17.0.3:60366
I0906 20:01:45.268654 29076 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(1048)@172.17.0.3:60366
I0906 20:01:45.268726 29085 slave.cpp:526] Agent resources: cpus(*):2; mem(*):2048; disk(role1):4096;
ports(*):[31000-32000]
I0906 20:01:45.268831 29085 slave.cpp:534] Agent attributes: [  ]
I0906 20:01:45.268847 29085 slave.cpp:539] Agent hostname: 0a1dc2da838b
I0906 20:01:45.268853 29080 authenticator.cpp:98] Creating new server SASL connection
I0906 20:01:45.269053 29071 authenticatee.cpp:213] Received SASL authentication mechanisms:
CRAM-MD5
I0906 20:01:45.269075 29071 authenticatee.cpp:239] Attempting to authenticate with mechanism
'CRAM-MD5'
I0906 20:01:45.269160 29077 authenticator.cpp:204] Received SASL authentication start
I0906 20:01:45.269218 29077 authenticator.cpp:326] Authentication requires more steps
I0906 20:01:45.269314 29079 authenticatee.cpp:259] Received SASL authentication step
I0906 20:01:45.269420 29081 authenticator.cpp:232] Received SASL authentication step
I0906 20:01:45.269450 29081 auxprop.cpp:109] Request to lookup properties for user: 'test-principal'
realm: '0a1dc2da838b' server FQDN: '0a1dc2da838b' SASL_AUXPROP_VERIFY_AGAINST_HASH: false
SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
I0906 20:01:45.269464 29081 auxprop.cpp:181] Looking up auxiliary property '*userPassword'
I0906 20:01:45.269490 29081 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
I0906 20:01:45.269506 29081 auxprop.cpp:109] Request to lookup properties for user: 'test-principal'
realm: '0a1dc2da838b' server FQDN: '0a1dc2da838b' SASL_AUXPROP_VERIFY_AGAINST_HASH: false
SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true 
I0906 20:01:45.269515 29081 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since
SASL_AUXPROP_AUTHZID == true
I0906 20:01:45.269521 29081 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5'
since SASL_AUXPROP_AUTHZID == true
I0906 20:01:45.269534 29081 authenticator.cpp:318] Authentication success
I0906 20:01:45.269620 29070 authenticatee.cpp:299] Authentication success
I0906 20:01:45.269661 29084 master.cpp:6197] Successfully authenticated principal 'test-principal'
at scheduler-d55313b3-c4cf-4517-843c-56aa3f74d9f7@172.17.0.3:60366
I0906 20:01:45.269729 29074 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(1048)@172.17.0.3:60366
I0906 20:01:45.269861 29071 sched.cpp:502] Successfully authenticated with master master@172.17.0.3:60366
I0906 20:01:45.269877 29071 sched.cpp:820] Sending SUBSCRIBE call to master@172.17.0.3:60366
I0906 20:01:45.269948 29071 sched.cpp:853] Will retry registration in 1.200847472secs if necessary
I0906 20:01:45.270069 29084 master.cpp:2424] Received SUBSCRIBE call for framework 'default'
at scheduler-d55313b3-c4cf-4517-843c-56aa3f74d9f7@172.17.0.3:60366
I0906 20:01:45.270113 29084 master.cpp:1886] Authorizing framework principal 'test-principal'
to receive offers for role 'role1'
I0906 20:01:45.270314 29072 state.cpp:57] Recovering state from '/tmp/DiskResource_PersistentVolumeTest_IncompatibleCheckpointedResources_0_DFKGtZ/meta'
I0906 20:01:45.270467 29070 master.cpp:2500] Subscribing framework default with checkpointing
disabled and capabilities [  ]
I0906 20:01:45.270505 29075 status_update_manager.cpp:203] Recovering status update manager
I0906 20:01:45.270777 29081 slave.cpp:4887] Finished recovery
I0906 20:01:45.270908 29074 sched.cpp:743] Framework registered with 9fd91e5d-4257-427d-a7da-3f18d99c8ffa-0000
I0906 20:01:45.270942 29074 sched.cpp:757] Scheduler::registered took 15584ns
I0906 20:01:45.270970 29084 hierarchical.cpp:269] Added framework 9fd91e5d-4257-427d-a7da-3f18d99c8ffa-0000
I0906 20:01:45.271028 29084 hierarchical.cpp:1550] No allocations performed
I0906 20:01:45.271051 29084 hierarchical.cpp:1645] No inverse offers to send out!
I0906 20:01:45.271092 29084 hierarchical.cpp:1194] Performed allocation for 0 agents in 102494ns
I0906 20:01:45.271229 29081 slave.cpp:5059] Querying resource estimator for oversubscribable
resources
I0906 20:01:45.271414 29075 status_update_manager.cpp:177] Pausing sending status updates
I0906 20:01:45.271414 29081 slave.cpp:902] New master detected at master@172.17.0.3:60366
I0906 20:01:46.238718 29073 hierarchical.cpp:1550] No allocations performed
I0906 20:02:00.269846 29071 master.cpp:1288] Framework 9fd91e5d-4257-427d-a7da-3f18d99c8ffa-0000
(default) at scheduler-d55313b3-c4cf-4517-843c-56aa3f74d9f7@172.17.0.3:60366 disconnected
I0906 20:02:07.263937 29073 hierarchical.cpp:1645] No inverse offers to send out!
I0906 20:02:07.263902 29071 master.cpp:2725] Disconnecting framework 9fd91e5d-4257-427d-a7da-3f18d99c8ffa-0000
(default) at scheduler-d55313b3-c4cf-4517-843c-56aa3f74d9f7@172.17.0.3:60366
I0906 20:02:07.264065 29071 master.cpp:2749] Deactivating framework 9fd91e5d-4257-427d-a7da-3f18d99c8ffa-0000
(default) at scheduler-d55313b3-c4cf-4517-843c-56aa3f74d9f7@172.17.0.3:60366
I0906 20:02:07.264094 29073 hierarchical.cpp:1194] Performed allocation for 0 agents in 21.025474006secs
I0906 20:02:07.264175 29071 master.cpp:1301] Giving framework 9fd91e5d-4257-427d-a7da-3f18d99c8ffa-0000
(default) at scheduler-d55313b3-c4cf-4517-843c-56aa3f74d9f7@172.17.0.3:60366 0ns to failover
I0906 20:02:07.264307 29073 hierarchical.cpp:380] Deactivated framework 9fd91e5d-4257-427d-a7da-3f18d99c8ffa-0000
I0906 20:02:07.264336 29071 master.cpp:1096] Master terminating
*** Aborted at 1473192127 (unix time) try "date -d @1473192127" if you are using GNU date
***
PC: @     0x2ac83e9ac40b (unknown)
*** SIGSEGV (@0x2ac880049000) received by PID 29051 (TID 0x2ac848bc4700) from PID 18446744071562366976;
stack trace: ***
    @     0x2ac8947d62c7 (unknown)
    @     0x2ac8947da5a9 (unknown)
I0906 20:02:07.269142 29085 hierarchical.cpp:331] Removed framework 9fd91e5d-4257-427d-a7da-3f18d99c8ffa-0000
    @     0x2ac83f13f330 (unknown)
    @     0x2ac83e9ac40b (unknown)
    @     0x2ac83e9a3c05 (unknown)
I0906 20:02:07.274950 29051 cluster.cpp:157] Creating default 'local' authorizer
    @     0x2ac83d042c98 process::operator<<()
I0906 20:02:07.277822 29051 leveldb.cpp:174] Opened db in 2.422111ms
I0906 20:02:07.279304 29051 leveldb.cpp:181] Compacted db in 1.434065ms
I0906 20:02:07.279400 29051 leveldb.cpp:196] Created db iterator in 26692ns
I0906 20:02:07.279427 29051 leveldb.cpp:202] Seeked to beginning of db in 2257ns
I0906 20:02:07.279448 29051 leveldb.cpp:271] Iterated through 0 keys in the db in 362ns
I0906 20:02:07.279505 29051 replica.cpp:776] Replica recovered with log positions 0 ->
0 with 1 holes and 0 unlearned
I0906 20:02:07.280604 29079 recover.cpp:451] Starting replica recovery
I0906 20:02:07.281153 29079 recover.cpp:477] Replica is in EMPTY status
I0906 20:02:07.282649 29071 replica.cpp:673] Replica in EMPTY status received a broadcasted
recover request from __req_res__(6365)@172.17.0.3:60366
I0906 20:02:07.283185 29076 recover.cpp:197] Received a recover response from a replica in
EMPTY status
I0906 20:02:07.283640 29070 recover.cpp:568] Updating replica status to STARTING
I0906 20:02:07.284180 29071 master.cpp:379] Master f6076bbd-3be2-4c01-b593-d50e2743a2c9 (0a1dc2da838b)
started on 172.17.0.3:60366
I0906 20:02:07.284554 29075 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took
654887ns
I0906 20:02:07.284205 29071 master.cpp:381] Flags at startup: --acls="" --agent_ping_timeout="15secs"
--agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="HierarchicalDRF"
--authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true"
--authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticators="crammd5"
--authorizers="local" --credentials="/tmp/WfTwZm/credentials" --framework_sorter="drf" --help="false"
--hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic"
--initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO"
--max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000"
--quiet="false" --recovery_agent_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins"
--registry_store_timeout="100secs" --registry_strict="true" --root_submissions="true" --user_sorter="drf"
--version="false" --webui_dir="/mesos/mesos-1.1.0/_inst/share/mesos/webui" --work_dir="/tmp/WfTwZm/master"
--zk_session_timeout="10secs"
I0906 20:02:07.284587 29075 replica.cpp:320] Persisted replica status to STARTING
I0906 20:02:07.284613 29071 master.cpp:431] Master only allowing authenticated frameworks
to register
I0906 20:02:07.284627 29071 master.cpp:445] Master only allowing authenticated agents to register
I0906 20:02:07.284636 29071 master.cpp:458] Master only allowing authenticated HTTP frameworks
to register
I0906 20:02:07.284644 29071 credentials.hpp:37] Loading credentials for authentication from
'/tmp/WfTwZm/credentials'
I0906 20:02:07.284814 29078 recover.cpp:477] Replica is in STARTING status
I0906 20:02:07.284943 29071 master.cpp:503] Using default 'crammd5' authenticator
I0906 20:02:07.285138 29071 http.cpp:883] Using default 'basic' HTTP authenticator for realm
'mesos-master-readonly'
I0906 20:02:07.285303 29071 http.cpp:883] Using default 'basic' HTTP authenticator for realm
'mesos-master-readwrite'
I0906 20:02:07.285500 29071 http.cpp:883] Using default 'basic' HTTP authenticator for realm
'mesos-master-scheduler'
I0906 20:02:07.285640 29071 master.cpp:583] Authorization enabled
I0906 20:02:07.285848 29072 whitelist_watcher.cpp:77] No whitelist given
I0906 20:02:07.286067 29083 hierarchical.cpp:149] Initialized hierarchical allocator process
I0906 20:02:07.286173 29073 replica.cpp:673] Replica in STARTING status received a broadcasted
recover request from __req_res__(6366)@172.17.0.3:60366
I0906 20:02:07.286520 29082 recover.cpp:197] Received a recover response from a replica in
STARTING status
I0906 20:02:07.287076 29073 recover.cpp:568] Updating replica status to VOTING
I0906 20:02:07.287904 29084 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took
597451ns
I0906 20:02:07.287938 29084 replica.cpp:320] Persisted replica status to VOTING
I0906 20:02:07.288169 29076 recover.cpp:582] Successfully joined the Paxos group
I0906 20:02:07.288481 29076 recover.cpp:466] Recover process terminated
I0906 20:02:07.289659 29084 master.cpp:1850] Elected as the leading master!
I0906 20:02:07.289693 29084 master.cpp:1551] Recovering from registrar
I0906 20:02:07.289862 29079 registrar.cpp:332] Recovering registrar
I0906 20:02:07.290505 29075 log.cpp:553] Attempting to start the writer
I0906 20:02:07.292006 29074 replica.cpp:493] Replica received implicit promise request from
__req_res__(6367)@172.17.0.3:60366 with proposal 1
    @     0x2ac83c44fac3 mesos::internal::slave::Slave::authenticate()
I0906 20:02:07.292558 29074 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took
508694ns
I0906 20:02:07.292584 29074 replica.cpp:342] Persisted promised to 1
I0906 20:02:07.293391 29080 coordinator.cpp:238] Coordinator attempting to fill missing positions
I0906 20:02:07.294734 29073 replica.cpp:388] Replica received explicit promise request from
__req_res__(6368)@172.17.0.3:60366 for position 0 with proposal 2
I0906 20:02:07.295254 29073 leveldb.cpp:341] Persisting action (8 bytes) to leveldb took 472361ns
I0906 20:02:07.295285 29073 replica.cpp:708] Persisted action NOP at position 0
I0906 20:02:07.296751 29076 replica.cpp:537] Replica received write request for position 0
from __req_res__(6369)@172.17.0.3:60366
I0906 20:02:07.296835 29076 leveldb.cpp:436] Reading position from leveldb took 39744ns
I0906 20:02:07.297452 29076 leveldb.cpp:341] Persisting action (14 bytes) to leveldb took
554740ns
I0906 20:02:07.297485 29076 replica.cpp:708] Persisted action NOP at position 0
I0906 20:02:07.298262 29083 replica.cpp:691] Replica received learned notice for position
0 from @0.0.0.0:0
I0906 20:02:07.298765 29083 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took
460819ns
I0906 20:02:07.298796 29083 replica.cpp:708] Persisted action NOP at position 0
    @     0x2ac83c44f56b mesos::internal::slave::Slave::detected()
I0906 20:02:07.299576 29085 log.cpp:569] Writer started with ending position 0
I0906 20:02:07.300812 29071 leveldb.cpp:436] Reading position from leveldb took 31797ns
I0906 20:02:07.301996 29073 registrar.cpp:365] Successfully fetched the registry (0B) in 12.048896ms
I0906 20:02:07.302140 29073 registrar.cpp:464] Applied 1 operations in 32924ns; attempting
to update the registry
I0906 20:02:07.303042 29078 log.cpp:577] Attempting to append 168 bytes to the log
I0906 20:02:07.303190 29079 coordinator.cpp:348] Coordinator attempting to write APPEND action
at position 1
    @     0x2ac83c4a5d03 _ZZN7process8dispatchIN5mesos8internal5slave5SlaveERKNS_6FutureI6OptionINS1_10MasterInfoEEEES9_EEvRKNS_3PIDIT_EEMSD_FvT0_ET1_ENKUlPNS_11ProcessBaseEE_clESM_
I0906 20:02:07.304149 29076 replica.cpp:537] Replica received write request for position 1
from __req_res__(6370)@172.17.0.3:60366
I0906 20:02:07.304754 29076 leveldb.cpp:341] Persisting action (187 bytes) to leveldb took
546211ns
I0906 20:02:07.304786 29076 replica.cpp:708] Persisted action APPEND at position 1
I0906 20:02:07.305613 29078 replica.cpp:691] Replica received learned notice for position
1 from @0.0.0.0:0
I0906 20:02:07.306145 29078 leveldb.cpp:341] Persisting action (189 bytes) to leveldb took
490605ns
I0906 20:02:07.306182 29078 replica.cpp:708] Persisted action APPEND at position 1
I0906 20:02:07.307394 29070 registrar.cpp:509] Successfully updated the registry in 5.172736ms
I0906 20:02:07.307579 29070 registrar.cpp:395] Successfully recovered registrar
I0906 20:02:07.307659 29085 log.cpp:596] Attempting to truncate the log to 1
I0906 20:02:07.307802 29073 coordinator.cpp:348] Coordinator attempting to write TRUNCATE
action at position 2
I0906 20:02:07.308280 29072 master.cpp:1659] Recovered 0 agents from the registry (129B);
allowing 10mins for agents to re-register
I0906 20:02:07.308377 29085 hierarchical.cpp:176] Skipping recovery of hierarchical allocator:
nothing to recover
I0906 20:02:07.309029 29073 replica.cpp:537] Replica received write request for position 2
from __req_res__(6371)@172.17.0.3:60366
I0906 20:02:07.309675 29073 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took
528589ns
I0906 20:02:07.309706 29073 replica.cpp:708] Persisted action TRUNCATE at position 2
I0906 20:02:07.310412 29082 replica.cpp:691] Replica received learned notice for position
2 from @0.0.0.0:0
I0906 20:02:07.310714 29082 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took
272545ns
I0906 20:02:07.310772 29082 leveldb.cpp:399] Deleting ~1 keys from leveldb took 33082ns
I0906 20:02:07.310802 29082 replica.cpp:708] Persisted action TRUNCATE at position 2
    @     0x2ac83c4d821e _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal5slave5SlaveERKNS0_6FutureI6OptionINS5_10MasterInfoEEEESD_EEvRKNS0_3PIDIT_EEMSH_FvT0_ET1_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
    @     0x2ac83d085c43 std::function<>::operator()()
    @     0x2ac83d068bcb process::ProcessBase::visit()
    @     0x2ac83d070fe0 process::DispatchEvent::visit()
    @           0xa196b2 process::ProcessBase::serve()
    @     0x2ac83d064ec0 process::ProcessManager::resume()
    @     0x2ac83d061b2d _ZZN7process14ProcessManager12init_threadsEvENKUt_clEv
    @     0x2ac83d070788 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEE9_M_invokeIIEEEvSt12_Index_tupleIIXspT_EEE
    @     0x2ac83d0706df _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEclEv
    @     0x2ac83d070678 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
    @     0x2ac83e9c0a60 (unknown)
    @     0x2ac83f137184 start_thread
    @     0x2ac83f44737d (unknown)
make[4]: *** [check-local] Segmentation fault
{code}

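It looks like the framework disconnects (logged at 20:02:00) and the master shuts down prematurely ("Master terminating" at 20:02:07). The segfault follows immediately, with Slave::detected() and Slave::authenticate() in the stack trace.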
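Because the stack frames in the log above are interleaved with ordinary log lines, it can help to pull the frames out on their own. The following is only a minimal sketch, not part of Mesos or glog: the function name `extract_frames` and the `SAMPLE` excerpt are illustrative, and it assumes each frame line has the form `@ <hex address> <symbol>` as seen in this report.

```python
import re

# Matches stack-trace frame lines such as
# "    @     0x2ac83c44fac3 mesos::internal::slave::Slave::authenticate()".
# Assumption: every frame starts with optional whitespace, '@', and a hex address.
FRAME_RE = re.compile(r"^\s*@\s+(0x[0-9a-fA-F]+)\s+(.*)$")


def extract_frames(log_text):
    """Return (address, symbol) pairs for each stack frame line, in log order."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((match.group(1), match.group(2).strip()))
    return frames


# A short excerpt copied from the log in this report (frames mixed with log lines).
SAMPLE = """\
    @     0x2ac83e9a3c05 (unknown)
I0906 20:02:07.274950 29051 cluster.cpp:157] Creating default 'local' authorizer
    @     0x2ac83d042c98 process::operator<<()
I0906 20:02:07.292006 29074 replica.cpp:493] Replica received implicit promise request from
    @     0x2ac83c44fac3 mesos::internal::slave::Slave::authenticate()
I0906 20:02:07.298796 29083 replica.cpp:708] Persisted action NOP at position 0
    @     0x2ac83c44f56b mesos::internal::slave::Slave::detected()
"""

if __name__ == "__main__":
    for address, symbol in extract_frames(SAMPLE):
        print(address, symbol)
```

Run against the full log, this would list the frames contiguously, which makes the call path from Slave::detected() into Slave::authenticate() easier to read.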



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
