Date: Thu, 8 Sep 2016 04:39:20 +0000 (UTC)
From: "Greg Mann (JIRA)"
To: issues@mesos.apache.org
Subject: [jira] [Created] (MESOS-6137) Segfault during DiskResource/PersistentVolumeTest.IncompatibleCheckpointedResources/0

Greg Mann created MESOS-6137:
--------------------------------

Summary: Segfault during DiskResource/PersistentVolumeTest.IncompatibleCheckpointedResources/0
Key: MESOS-6137
URL: https://issues.apache.org/jira/browse/MESOS-6137
Project: Mesos
Issue Type: Bug
Affects Versions: 1.0.1
Environment: Ubuntu 14.04, non-SSL, libev
Reporter: Greg Mann
Assignee: Greg Mann

Observed in our internal CI:
{code}
I0906 20:01:45.235483 29082 master.cpp:379] Master 9fd91e5d-4257-427d-a7da-3f18d99c8ffa (0a1dc2da838b) started on 172.17.0.3:60366
I0906 20:01:45.235513 29082 master.cpp:381] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/ze1TG1/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --quiet="false" --recovery_agent_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="100secs" --registry_strict="true" --root_submissions="true" --user_sorter="drf" --version="false" --webui_dir="/mesos/mesos-1.1.0/_inst/share/mesos/webui" --work_dir="/tmp/ze1TG1/master" --zk_session_timeout="10secs"
I0906 20:01:45.236022 29082 master.cpp:431] Master only allowing authenticated frameworks to register
I0906 20:01:45.236037 29082 master.cpp:445] Master only allowing authenticated agents to register
I0906 20:01:45.236045 29082 master.cpp:458] Master only allowing authenticated HTTP frameworks to register
I0906 20:01:45.236054 29082 credentials.hpp:37] Loading credentials for authentication from '/tmp/ze1TG1/credentials'
I0906 20:01:45.236392 29082 master.cpp:503]
Using default 'crammd5' authent= icator I0906 20:01:45.236654 29079 replica.cpp:673] Replica in STARTING status rec= eived a broadcasted recover request from __req_res__(6359)@172.17.0.3:60366 I0906 20:01:45.236687 29082 http.cpp:883] Using default 'basic' HTTP authen= ticator for realm 'mesos-master-readonly' I0906 20:01:45.236927 29082 http.cpp:883] Using default 'basic' HTTP authen= ticator for realm 'mesos-master-readwrite' I0906 20:01:45.237095 29079 recover.cpp:197] Received a recover response fr= om a replica in STARTING status I0906 20:01:45.237117 29082 http.cpp:883] Using default 'basic' HTTP authen= ticator for realm 'mesos-master-scheduler' I0906 20:01:45.237340 29082 master.cpp:583] Authorization enabled I0906 20:01:45.237663 29080 whitelist_watcher.cpp:77] No whitelist given I0906 20:01:45.237685 29075 hierarchical.cpp:149] Initialized hierarchical = allocator process I0906 20:01:45.237835 29085 recover.cpp:568] Updating replica status to VOT= ING I0906 20:01:45.238531 29081 leveldb.cpp:304] Persisting metadata (8 bytes) = to leveldb took 378674ns I0906 20:01:45.238560 29081 replica.cpp:320] Persisted replica status to VO= TING I0906 20:01:45.238685 29073 recover.cpp:582] Successfully joined the Paxos = group I0906 20:01:45.238975 29073 recover.cpp:466] Recover process terminated I0906 20:01:45.240437 29078 master.cpp:1850] Elected as the leading master! 
I0906 20:01:45.240468 29078 master.cpp:1551] Recovering from registrar I0906 20:01:45.240592 29080 registrar.cpp:332] Recovering registrar I0906 20:01:45.241178 29075 log.cpp:553] Attempting to start the writer I0906 20:01:45.242928 29072 replica.cpp:493] Replica received implicit prom= ise request from __req_res__(6360)@172.17.0.3:60366 with proposal 1 I0906 20:01:45.243324 29072 leveldb.cpp:304] Persisting metadata (8 bytes) = to leveldb took 335676ns I0906 20:01:45.243350 29072 replica.cpp:342] Persisted promised to 1 I0906 20:01:45.244056 29081 coordinator.cpp:238] Coordinator attempting to = fill missing positions I0906 20:01:45.245538 29078 replica.cpp:388] Replica received explicit prom= ise request from __req_res__(6361)@172.17.0.3:60366 for position 0 with pro= posal 2 I0906 20:01:45.245995 29078 leveldb.cpp:341] Persisting action (8 bytes) to= leveldb took 412163ns I0906 20:01:45.246021 29078 replica.cpp:708] Persisted action NOP at positi= on 0 I0906 20:01:45.247329 29082 replica.cpp:537] Replica received write request= for position 0 from __req_res__(6362)@172.17.0.3:60366 I0906 20:01:45.247406 29082 leveldb.cpp:436] Reading position from leveldb = took 35845ns I0906 20:01:45.247989 29082 leveldb.cpp:341] Persisting action (14 bytes) t= o leveldb took 541972ns I0906 20:01:45.248015 29082 replica.cpp:708] Persisted action NOP at positi= on 0 I0906 20:01:45.248556 29084 replica.cpp:691] Replica received learned notic= e for position 0 from @0.0.0.0:0 I0906 20:01:45.249241 29084 leveldb.cpp:341] Persisting action (16 bytes) t= o leveldb took 647885ns I0906 20:01:45.249271 29084 replica.cpp:708] Persisted action NOP at positi= on 0 I0906 20:01:45.249914 29085 log.cpp:569] Writer started with ending positio= n 0 I0906 20:01:45.251022 29085 leveldb.cpp:436] Reading position from leveldb = took 31388ns I0906 20:01:45.252149 29082 registrar.cpp:365] Successfully fetched the reg= istry (0B) in 11.51104ms I0906 20:01:45.252271 29082 registrar.cpp:464] Applied 1 
operations in 2134= 1ns; attempting to update the registry I0906 20:01:45.253073 29078 log.cpp:577] Attempting to append 168 bytes to = the log I0906 20:01:45.253250 29081 coordinator.cpp:348] Coordinator attempting to = write APPEND action at position 1 I0906 20:01:45.254175 29070 replica.cpp:537] Replica received write request= for position 1 from __req_res__(6363)@172.17.0.3:60366 I0906 20:01:45.254654 29070 leveldb.cpp:341] Persisting action (187 bytes) = to leveldb took 435222ns I0906 20:01:45.254683 29070 replica.cpp:708] Persisted action APPEND at pos= ition 1 I0906 20:01:45.255455 29080 replica.cpp:691] Replica received learned notic= e for position 1 from @0.0.0.0:0 I0906 20:01:45.255926 29080 leveldb.cpp:341] Persisting action (189 bytes) = to leveldb took 431510ns I0906 20:01:45.255980 29080 replica.cpp:708] Persisted action APPEND at pos= ition 1 I0906 20:01:45.257114 29073 registrar.cpp:509] Successfully updated the reg= istry in 4.780032ms I0906 20:01:45.257305 29073 registrar.cpp:395] Successfully recovered regis= trar I0906 20:01:45.257380 29082 log.cpp:596] Attempting to truncate the log to = 1 I0906 20:01:45.257515 29076 coordinator.cpp:348] Coordinator attempting to = write TRUNCATE action at position 2 I0906 20:01:45.258153 29071 master.cpp:1659] Recovered 0 agents from the re= gistry (129B); allowing 10mins for agents to re-register I0906 20:01:45.258191 29077 hierarchical.cpp:176] Skipping recovery of hier= archical allocator: nothing to recover I0906 20:01:45.258608 29082 replica.cpp:537] Replica received write request= for position 2 from __req_res__(6364)@172.17.0.3:60366 I0906 20:01:45.259039 29082 leveldb.cpp:341] Persisting action (16 bytes) t= o leveldb took 388229ns I0906 20:01:45.259068 29082 replica.cpp:708] Persisted action TRUNCATE at p= osition 2 I0906 20:01:45.259778 29071 replica.cpp:691] Replica received learned notic= e for position 2 from @0.0.0.0:0 I0906 20:01:45.260226 29071 leveldb.cpp:341] Persisting action (18 bytes) t= 
o leveldb took 411069ns I0906 20:01:45.260299 29071 leveldb.cpp:399] Deleting ~1 keys from leveldb = took 40611ns I0906 20:01:45.260321 29071 replica.cpp:708] Persisted action TRUNCATE at p= osition 2 I0906 20:01:45.266494 29085 slave.cpp:205] Mesos agent started on @172.17.0= .3:60366 I0906 20:01:45.266513 29085 slave.cpp:206] Flags at startup: --acls=3D"" --= appc_simple_discovery_uri_prefix=3D"http://" --appc_store_dir=3D"/tmp/mesos= /store/appc" --authenticate_http_readonly=3D"true" --authenticate_http_read= write=3D"true" --authenticatee=3D"crammd5" --authentication_backoff_factor= =3D"1secs" --authorizer=3D"local" --cgroups_cpu_enable_pids_and_tids_count= =3D"false" --cgroups_enable_cfs=3D"false" --cgroups_hierarchy=3D"/sys/fs/cg= roup" --cgroups_limit_swap=3D"false" --cgroups_root=3D"mesos" --container_d= isk_watch_interval=3D"15secs" --containerizers=3D"mesos" --credential=3D"/t= mp/DiskResource_PersistentVolumeTest_IncompatibleCheckpointedResources_0_SR= jbqt/credential" --default_role=3D"*" --disk_watch_interval=3D"1mins" --doc= ker=3D"docker" --docker_kill_orphans=3D"true" --docker_registry=3D"https://= registry-1.docker.io" --docker_remove_delay=3D"6hrs" --docker_socket=3D"/va= r/run/docker.sock" --docker_stop_timeout=3D"0ns" --docker_store_dir=3D"/tmp= /mesos/store/docker" --docker_volume_checkpoint_dir=3D"/var/run/mesos/isola= tors/docker/volume" --enforce_container_disk_quota=3D"false" --executor_reg= istration_timeout=3D"1mins" --executor_shutdown_grace_period=3D"5secs" --fe= tcher_cache_dir=3D"/tmp/DiskResource_PersistentVolumeTest_IncompatibleCheck= pointedResources_0_SRjbqt/fetch" --fetcher_cache_size=3D"2GB" --frameworks_= home=3D"" --gc_delay=3D"1weeks" --gc_disk_headroom=3D"0.1" --hadoop_home=3D= "" --help=3D"false" --hostname_lookup=3D"true" --http_authenticators=3D"bas= ic" --http_command_executor=3D"false" --http_credentials=3D"/tmp/DiskResour= ce_PersistentVolumeTest_IncompatibleCheckpointedResources_0_SRjbqt/http_cre= dentials" 
--image_provisioner_backend=3D"copy" --initialize_driver_logging= =3D"true" --isolation=3D"posix/cpu,posix/mem" --launcher_dir=3D"/mesos/meso= s-1.1.0/_build/src" --logbufsecs=3D"0" --logging_level=3D"INFO" --oversubsc= ribed_resources_interval=3D"15secs" --perf_duration=3D"10secs" --perf_inter= val=3D"1mins" --qos_correction_interval_min=3D"0ns" --quiet=3D"false" --rec= over=3D"reconnect" --recovery_timeout=3D"15mins" --registration_backoff_fac= tor=3D"10ms" --resources=3D"[{"name":"cpus","role":"*","scalar":{"value":2.= 0},"type":"SCALAR"},{"name":"mem","role":"*","scalar":{"value":2048.0},"typ= e":"SCALAR"},{"name":"disk","role":"role1","scalar":{"value":4096.0},"type"= :"SCALAR"}]" --revocable_cpu_low_priority=3D"true" --runtime_dir=3D"/tmp/Di= skResource_PersistentVolumeTest_IncompatibleCheckpointedResources_0_SRjbqt"= --sandbox_directory=3D"/mnt/mesos/sandbox" --strict=3D"true" --switch_user= =3D"true" --systemd_enable_support=3D"true" --systemd_runtime_directory=3D"= /run/systemd/system" --version=3D"false" --work_dir=3D"/tmp/DiskResource_Pe= rsistentVolumeTest_IncompatibleCheckpointedResources_0_DFKGtZ" I0906 20:01:45.266980 29085 credentials.hpp:86] Loading credential for auth= entication from '/tmp/DiskResource_PersistentVolumeTest_IncompatibleCheckpo= intedResources_0_SRjbqt/credential' I0906 20:01:45.267125 29085 slave.cpp:343] Agent using credential for: test= -principal I0906 20:01:45.267143 29085 credentials.hpp:37] Loading credentials for aut= hentication from '/tmp/DiskResource_PersistentVolumeTest_IncompatibleCheckp= ointedResources_0_SRjbqt/http_credentials' I0906 20:01:45.267366 29085 http.cpp:883] Using default 'basic' HTTP authen= ticator for realm 'mesos-agent-readonly' I0906 20:01:45.267477 29085 http.cpp:883] Using default 'basic' HTTP authen= ticator for realm 'mesos-agent-readwrite' I0906 20:01:45.267544 29051 sched.cpp:226] Version: 1.1.0 I0906 20:01:45.268095 29074 sched.cpp:330] New master detected at master@17= 2.17.0.3:60366 I0906 
20:01:45.268167 29074 sched.cpp:396] Authenticating with master maste= r@172.17.0.3:60366 I0906 20:01:45.268182 29074 sched.cpp:403] Using default CRAM-MD5 authentic= atee I0906 20:01:45.268357 29078 authenticatee.cpp:121] Creating new client SASL= connection I0906 20:01:45.268568 29077 master.cpp:6167] Authenticating scheduler-d5531= 3b3-c4cf-4517-843c-56aa3f74d9f7@172.17.0.3:60366 I0906 20:01:45.268654 29076 authenticator.cpp:414] Starting authentication = session for crammd5-authenticatee(1048)@172.17.0.3:60366 I0906 20:01:45.268726 29085 slave.cpp:526] Agent resources: cpus(*):2; mem(= *):2048; disk(role1):4096; ports(*):[31000-32000] I0906 20:01:45.268831 29085 slave.cpp:534] Agent attributes: [ ] I0906 20:01:45.268847 29085 slave.cpp:539] Agent hostname: 0a1dc2da838b I0906 20:01:45.268853 29080 authenticator.cpp:98] Creating new server SASL = connection I0906 20:01:45.269053 29071 authenticatee.cpp:213] Received SASL authentica= tion mechanisms: CRAM-MD5 I0906 20:01:45.269075 29071 authenticatee.cpp:239] Attempting to authentica= te with mechanism 'CRAM-MD5' I0906 20:01:45.269160 29077 authenticator.cpp:204] Received SASL authentica= tion start I0906 20:01:45.269218 29077 authenticator.cpp:326] Authentication requires = more steps I0906 20:01:45.269314 29079 authenticatee.cpp:259] Received SASL authentica= tion step I0906 20:01:45.269420 29081 authenticator.cpp:232] Received SASL authentica= tion step I0906 20:01:45.269450 29081 auxprop.cpp:109] Request to lookup properties f= or user: 'test-principal' realm: '0a1dc2da838b' server FQDN: '0a1dc2da838b'= SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_= AUXPROP_AUTHZID: false=20 I0906 20:01:45.269464 29081 auxprop.cpp:181] Looking up auxiliary property = '*userPassword' I0906 20:01:45.269490 29081 auxprop.cpp:181] Looking up auxiliary property = '*cmusaslsecretCRAM-MD5' I0906 20:01:45.269506 29081 auxprop.cpp:109] Request to lookup properties f= or user: 'test-principal' realm: 
'0a1dc2da838b' server FQDN: '0a1dc2da838b'= SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_= AUXPROP_AUTHZID: true=20 I0906 20:01:45.269515 29081 auxprop.cpp:131] Skipping auxiliary property '*= userPassword' since SASL_AUXPROP_AUTHZID =3D=3D true I0906 20:01:45.269521 29081 auxprop.cpp:131] Skipping auxiliary property '*= cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID =3D=3D true I0906 20:01:45.269534 29081 authenticator.cpp:318] Authentication success I0906 20:01:45.269620 29070 authenticatee.cpp:299] Authentication success I0906 20:01:45.269661 29084 master.cpp:6197] Successfully authenticated pri= ncipal 'test-principal' at scheduler-d55313b3-c4cf-4517-843c-56aa3f74d9f7@1= 72.17.0.3:60366 I0906 20:01:45.269729 29074 authenticator.cpp:432] Authentication session c= leanup for crammd5-authenticatee(1048)@172.17.0.3:60366 I0906 20:01:45.269861 29071 sched.cpp:502] Successfully authenticated with = master master@172.17.0.3:60366 I0906 20:01:45.269877 29071 sched.cpp:820] Sending SUBSCRIBE call to master= @172.17.0.3:60366 I0906 20:01:45.269948 29071 sched.cpp:853] Will retry registration in 1.200= 847472secs if necessary I0906 20:01:45.270069 29084 master.cpp:2424] Received SUBSCRIBE call for fr= amework 'default' at scheduler-d55313b3-c4cf-4517-843c-56aa3f74d9f7@172.17.= 0.3:60366 I0906 20:01:45.270113 29084 master.cpp:1886] Authorizing framework principa= l 'test-principal' to receive offers for role 'role1' I0906 20:01:45.270314 29072 state.cpp:57] Recovering state from '/tmp/DiskR= esource_PersistentVolumeTest_IncompatibleCheckpointedResources_0_DFKGtZ/met= a' I0906 20:01:45.270467 29070 master.cpp:2500] Subscribing framework default = with checkpointing disabled and capabilities [ ] I0906 20:01:45.270505 29075 status_update_manager.cpp:203] Recovering statu= s update manager I0906 20:01:45.270777 29081 slave.cpp:4887] Finished recovery I0906 20:01:45.270908 29074 sched.cpp:743] Framework registered with 9fd91e= 
5d-4257-427d-a7da-3f18d99c8ffa-0000 I0906 20:01:45.270942 29074 sched.cpp:757] Scheduler::registered took 15584= ns I0906 20:01:45.270970 29084 hierarchical.cpp:269] Added framework 9fd91e5d-= 4257-427d-a7da-3f18d99c8ffa-0000 I0906 20:01:45.271028 29084 hierarchical.cpp:1550] No allocations performed I0906 20:01:45.271051 29084 hierarchical.cpp:1645] No inverse offers to sen= d out! I0906 20:01:45.271092 29084 hierarchical.cpp:1194] Performed allocation for= 0 agents in 102494ns I0906 20:01:45.271229 29081 slave.cpp:5059] Querying resource estimator for= oversubscribable resources I0906 20:01:45.271414 29075 status_update_manager.cpp:177] Pausing sending = status updates I0906 20:01:45.271414 29081 slave.cpp:902] New master detected at master@17= 2.17.0.3:60366 I0906 20:01:46.238718 29073 hierarchical.cpp:1550] No allocations performed I0906 20:02:00.269846 29071 master.cpp:1288] Framework 9fd91e5d-4257-427d-a= 7da-3f18d99c8ffa-0000 (default) at scheduler-d55313b3-c4cf-4517-843c-56aa3f= 74d9f7@172.17.0.3:60366 disconnected I0906 20:02:07.263937 29073 hierarchical.cpp:1645] No inverse offers to sen= d out! 
I0906 20:02:07.263902 29071 master.cpp:2725] Disconnecting framework 9fd91e5d-4257-427d-a7da-3f18d99c8ffa-0000 (default) at scheduler-d55313b3-c4cf-4517-843c-56aa3f74d9f7@172.17.0.3:60366
I0906 20:02:07.264065 29071 master.cpp:2749] Deactivating framework 9fd91e5d-4257-427d-a7da-3f18d99c8ffa-0000 (default) at scheduler-d55313b3-c4cf-4517-843c-56aa3f74d9f7@172.17.0.3:60366
I0906 20:02:07.264094 29073 hierarchical.cpp:1194] Performed allocation for 0 agents in 21.025474006secs
I0906 20:02:07.264175 29071 master.cpp:1301] Giving framework 9fd91e5d-4257-427d-a7da-3f18d99c8ffa-0000 (default) at scheduler-d55313b3-c4cf-4517-843c-56aa3f74d9f7@172.17.0.3:60366 0ns to failover
I0906 20:02:07.264307 29073 hierarchical.cpp:380] Deactivated framework 9fd91e5d-4257-427d-a7da-3f18d99c8ffa-0000
I0906 20:02:07.264336 29071 master.cpp:1096] Master terminating
*** Aborted at 1473192127 (unix time) try "date -d @1473192127" if you are using GNU date ***
PC: @ 0x2ac83e9ac40b (unknown)
*** SIGSEGV (@0x2ac880049000) received by PID 29051 (TID 0x2ac848bc4700) from PID 18446744071562366976; stack trace: ***
    @ 0x2ac8947d62c7 (unknown)
    @ 0x2ac8947da5a9 (unknown)
I0906 20:02:07.269142 29085 hierarchical.cpp:331] Removed framework 9fd91e5d-4257-427d-a7da-3f18d99c8ffa-0000
    @ 0x2ac83f13f330 (unknown)
    @ 0x2ac83e9ac40b (unknown)
    @ 0x2ac83e9a3c05 (unknown)
I0906 20:02:07.274950 29051 cluster.cpp:157] Creating default 'local' authorizer
    @ 0x2ac83d042c98 process::operator<<()
I0906 20:02:07.277822 29051 leveldb.cpp:174] Opened db in 2.422111ms
I0906 20:02:07.279304 29051 leveldb.cpp:181] Compacted db in 1.434065ms
I0906 20:02:07.279400 29051 leveldb.cpp:196] Created db iterator in 26692ns
I0906 20:02:07.279427 29051 leveldb.cpp:202] Seeked to beginning of db in 2257ns
I0906 20:02:07.279448 29051 leveldb.cpp:271] Iterated through 0 keys in the db in 362ns
I0906 20:02:07.279505 29051 replica.cpp:776] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
I0906 20:02:07.280604 29079 recover.cpp:451] Starting replica recovery I0906 20:02:07.281153 29079 recover.cpp:477] Replica is in EMPTY status I0906 20:02:07.282649 29071 replica.cpp:673] Replica in EMPTY status receiv= ed a broadcasted recover request from __req_res__(6365)@172.17.0.3:60366 I0906 20:02:07.283185 29076 recover.cpp:197] Received a recover response fr= om a replica in EMPTY status I0906 20:02:07.283640 29070 recover.cpp:568] Updating replica status to STA= RTING I0906 20:02:07.284180 29071 master.cpp:379] Master f6076bbd-3be2-4c01-b593-= d50e2743a2c9 (0a1dc2da838b) started on 172.17.0.3:60366 I0906 20:02:07.284554 29075 leveldb.cpp:304] Persisting metadata (8 bytes) = to leveldb took 654887ns I0906 20:02:07.284205 29071 master.cpp:381] Flags at startup: --acls=3D"" -= -agent_ping_timeout=3D"15secs" --agent_reregister_timeout=3D"10mins" --allo= cation_interval=3D"1secs" --allocator=3D"HierarchicalDRF" --authenticate_ag= ents=3D"true" --authenticate_frameworks=3D"true" --authenticate_http_framew= orks=3D"true" --authenticate_http_readonly=3D"true" --authenticate_http_rea= dwrite=3D"true" --authenticators=3D"crammd5" --authorizers=3D"local" --cred= entials=3D"/tmp/WfTwZm/credentials" --framework_sorter=3D"drf" --help=3D"fa= lse" --hostname_lookup=3D"true" --http_authenticators=3D"basic" --http_fram= ework_authenticators=3D"basic" --initialize_driver_logging=3D"true" --log_a= uto_initialize=3D"true" --logbufsecs=3D"0" --logging_level=3D"INFO" --max_a= gent_ping_timeouts=3D"5" --max_completed_frameworks=3D"50" --max_completed_= tasks_per_framework=3D"1000" --quiet=3D"false" --recovery_agent_removal_lim= it=3D"100%" --registry=3D"replicated_log" --registry_fetch_timeout=3D"1mins= " --registry_store_timeout=3D"100secs" --registry_strict=3D"true" --root_su= bmissions=3D"true" --user_sorter=3D"drf" --version=3D"false" --webui_dir=3D= "/mesos/mesos-1.1.0/_inst/share/mesos/webui" --work_dir=3D"/tmp/WfTwZm/mast= er" --zk_session_timeout=3D"10secs" I0906 
20:02:07.284587 29075 replica.cpp:320] Persisted replica status to ST= ARTING I0906 20:02:07.284613 29071 master.cpp:431] Master only allowing authentica= ted frameworks to register I0906 20:02:07.284627 29071 master.cpp:445] Master only allowing authentica= ted agents to register I0906 20:02:07.284636 29071 master.cpp:458] Master only allowing authentica= ted HTTP frameworks to register I0906 20:02:07.284644 29071 credentials.hpp:37] Loading credentials for aut= hentication from '/tmp/WfTwZm/credentials' I0906 20:02:07.284814 29078 recover.cpp:477] Replica is in STARTING status I0906 20:02:07.284943 29071 master.cpp:503] Using default 'crammd5' authent= icator I0906 20:02:07.285138 29071 http.cpp:883] Using default 'basic' HTTP authen= ticator for realm 'mesos-master-readonly' I0906 20:02:07.285303 29071 http.cpp:883] Using default 'basic' HTTP authen= ticator for realm 'mesos-master-readwrite' I0906 20:02:07.285500 29071 http.cpp:883] Using default 'basic' HTTP authen= ticator for realm 'mesos-master-scheduler' I0906 20:02:07.285640 29071 master.cpp:583] Authorization enabled I0906 20:02:07.285848 29072 whitelist_watcher.cpp:77] No whitelist given I0906 20:02:07.286067 29083 hierarchical.cpp:149] Initialized hierarchical = allocator process I0906 20:02:07.286173 29073 replica.cpp:673] Replica in STARTING status rec= eived a broadcasted recover request from __req_res__(6366)@172.17.0.3:60366 I0906 20:02:07.286520 29082 recover.cpp:197] Received a recover response fr= om a replica in STARTING status I0906 20:02:07.287076 29073 recover.cpp:568] Updating replica status to VOT= ING I0906 20:02:07.287904 29084 leveldb.cpp:304] Persisting metadata (8 bytes) = to leveldb took 597451ns I0906 20:02:07.287938 29084 replica.cpp:320] Persisted replica status to VO= TING I0906 20:02:07.288169 29076 recover.cpp:582] Successfully joined the Paxos = group I0906 20:02:07.288481 29076 recover.cpp:466] Recover process terminated I0906 20:02:07.289659 29084 master.cpp:1850] Elected 
as the leading master! I0906 20:02:07.289693 29084 master.cpp:1551] Recovering from registrar I0906 20:02:07.289862 29079 registrar.cpp:332] Recovering registrar I0906 20:02:07.290505 29075 log.cpp:553] Attempting to start the writer I0906 20:02:07.292006 29074 replica.cpp:493] Replica received implicit prom= ise request from __req_res__(6367)@172.17.0.3:60366 with proposal 1 @ 0x2ac83c44fac3 mesos::internal::slave::Slave::authenticate() I0906 20:02:07.292558 29074 leveldb.cpp:304] Persisting metadata (8 bytes) = to leveldb took 508694ns I0906 20:02:07.292584 29074 replica.cpp:342] Persisted promised to 1 I0906 20:02:07.293391 29080 coordinator.cpp:238] Coordinator attempting to = fill missing positions I0906 20:02:07.294734 29073 replica.cpp:388] Replica received explicit prom= ise request from __req_res__(6368)@172.17.0.3:60366 for position 0 with pro= posal 2 I0906 20:02:07.295254 29073 leveldb.cpp:341] Persisting action (8 bytes) to= leveldb took 472361ns I0906 20:02:07.295285 29073 replica.cpp:708] Persisted action NOP at positi= on 0 I0906 20:02:07.296751 29076 replica.cpp:537] Replica received write request= for position 0 from __req_res__(6369)@172.17.0.3:60366 I0906 20:02:07.296835 29076 leveldb.cpp:436] Reading position from leveldb = took 39744ns I0906 20:02:07.297452 29076 leveldb.cpp:341] Persisting action (14 bytes) t= o leveldb took 554740ns I0906 20:02:07.297485 29076 replica.cpp:708] Persisted action NOP at positi= on 0 I0906 20:02:07.298262 29083 replica.cpp:691] Replica received learned notic= e for position 0 from @0.0.0.0:0 I0906 20:02:07.298765 29083 leveldb.cpp:341] Persisting action (16 bytes) t= o leveldb took 460819ns I0906 20:02:07.298796 29083 replica.cpp:708] Persisted action NOP at positi= on 0 @ 0x2ac83c44f56b mesos::internal::slave::Slave::detected() I0906 20:02:07.299576 29085 log.cpp:569] Writer started with ending positio= n 0 I0906 20:02:07.300812 29071 leveldb.cpp:436] Reading position from leveldb = took 31797ns I0906 
20:02:07.301996 29073 registrar.cpp:365] Successfully fetched the reg= istry (0B) in 12.048896ms I0906 20:02:07.302140 29073 registrar.cpp:464] Applied 1 operations in 3292= 4ns; attempting to update the registry I0906 20:02:07.303042 29078 log.cpp:577] Attempting to append 168 bytes to = the log I0906 20:02:07.303190 29079 coordinator.cpp:348] Coordinator attempting to = write APPEND action at position 1 @ 0x2ac83c4a5d03 _ZZN7process8dispatchIN5mesos8internal5slave5Slave= ERKNS_6FutureI6OptionINS1_10MasterInfoEEEES9_EEvRKNS_3PIDIT_EEMSD_FvT0_ET1_= ENKUlPNS_11ProcessBaseEE_clESM_ I0906 20:02:07.304149 29076 replica.cpp:537] Replica received write request= for position 1 from __req_res__(6370)@172.17.0.3:60366 I0906 20:02:07.304754 29076 leveldb.cpp:341] Persisting action (187 bytes) = to leveldb took 546211ns I0906 20:02:07.304786 29076 replica.cpp:708] Persisted action APPEND at pos= ition 1 I0906 20:02:07.305613 29078 replica.cpp:691] Replica received learned notic= e for position 1 from @0.0.0.0:0 I0906 20:02:07.306145 29078 leveldb.cpp:341] Persisting action (189 bytes) = to leveldb took 490605ns I0906 20:02:07.306182 29078 replica.cpp:708] Persisted action APPEND at pos= ition 1 I0906 20:02:07.307394 29070 registrar.cpp:509] Successfully updated the reg= istry in 5.172736ms I0906 20:02:07.307579 29070 registrar.cpp:395] Successfully recovered regis= trar I0906 20:02:07.307659 29085 log.cpp:596] Attempting to truncate the log to = 1 I0906 20:02:07.307802 29073 coordinator.cpp:348] Coordinator attempting to = write TRUNCATE action at position 2 I0906 20:02:07.308280 29072 master.cpp:1659] Recovered 0 agents from the re= gistry (129B); allowing 10mins for agents to re-register I0906 20:02:07.308377 29085 hierarchical.cpp:176] Skipping recovery of hier= archical allocator: nothing to recover I0906 20:02:07.309029 29073 replica.cpp:537] Replica received write request= for position 2 from __req_res__(6371)@172.17.0.3:60366 I0906 20:02:07.309675 29073 
leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 528589ns
I0906 20:02:07.309706 29073 replica.cpp:708] Persisted action TRUNCATE at position 2
I0906 20:02:07.310412 29082 replica.cpp:691] Replica received learned notice for position 2 from @0.0.0.0:0
I0906 20:02:07.310714 29082 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 272545ns
I0906 20:02:07.310772 29082 leveldb.cpp:399] Deleting ~1 keys from leveldb took 33082ns
I0906 20:02:07.310802 29082 replica.cpp:708] Persisted action TRUNCATE at position 2
    @ 0x2ac83c4d821e _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal5slave5SlaveERKNS0_6FutureI6OptionINS5_10MasterInfoEEEESD_EEvRKNS0_3PIDIT_EEMSH_FvT0_ET1_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
    @ 0x2ac83d085c43 std::function<>::operator()()
    @ 0x2ac83d068bcb process::ProcessBase::visit()
    @ 0x2ac83d070fe0 process::DispatchEvent::visit()
    @ 0xa196b2 process::ProcessBase::serve()
    @ 0x2ac83d064ec0 process::ProcessManager::resume()
    @ 0x2ac83d061b2d _ZZN7process14ProcessManager12init_threadsEvENKUt_clEv
    @ 0x2ac83d070788 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEE9_M_invokeIIEEEvSt12_Index_tupleIIXspT_EEE
    @ 0x2ac83d0706df _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEclEv
    @ 0x2ac83d070678 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
    @ 0x2ac83e9c0a60 (unknown)
    @ 0x2ac83f137184 start_thread
    @ 0x2ac83f44737d (unknown)
make[4]: *** [check-local] Segmentation fault
{code}

It looks like the framework disconnects and the master shuts down prematurely.
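
The timing behind that observation can be checked directly from the log timestamps. The following is a minimal sketch, not part of the original report: it assumes the usual glog prefix (severity letter, MMDD, HH:MM:SS.microseconds, thread id), takes the year 2016 from the mail's Date header because glog omits it, and uses hypothetical helper names (parse_timestamps, largest_gap). It scans the whole text with a regex rather than line by line, so it also works on a flattened archive copy, and it sorts the timestamps because entries from different threads are interleaved.

```python
import re
from datetime import datetime

# Assumed glog prefix: severity letter, MMDD, HH:MM:SS.microseconds, thread id.
GLOG_PREFIX = re.compile(r"[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6}) +(\d+) ")


def parse_timestamps(log_text):
    """Return every glog timestamp found in log_text, in order of appearance."""
    stamps = []
    for match in GLOG_PREFIX.finditer(log_text):
        mmdd, clock, _thread_id = match.groups()
        # glog omits the year; 2016 is an assumption taken from the mail date.
        stamps.append(
            datetime.strptime("2016" + mmdd + " " + clock, "%Y%m%d %H:%M:%S.%f")
        )
    return stamps


def largest_gap(stamps):
    """Return (gap_seconds, before, after) for the widest pause between entries."""
    if len(stamps) < 2:
        raise ValueError("need at least two timestamps")
    # Sort first: log entries from different threads can appear out of order.
    ordered = sorted(stamps)
    before, after = max(zip(ordered, ordered[1:]), key=lambda p: p[1] - p[0])
    return (after - before).total_seconds(), before, after


# Four entries copied from the log above (quoted-printable breaks removed).
SAMPLE = """
I0906 20:01:45.271414 29081 slave.cpp:902] New master detected at master@172.17.0.3:60366
I0906 20:01:46.238718 29073 hierarchical.cpp:1550] No allocations performed
I0906 20:02:00.269846 29071 master.cpp:1288] Framework 9fd91e5d-4257-427d-a7da-3f18d99c8ffa-0000 (default) at scheduler-d55313b3-c4cf-4517-843c-56aa3f74d9f7@172.17.0.3:60366 disconnected
I0906 20:02:07.263937 29073 hierarchical.cpp:1645] No inverse offers to send out!
"""

gap, before, after = largest_gap(parse_timestamps(SAMPLE))
print(f"largest gap: {gap:.6f}s between {before.time()} and {after.time()}")
```

On this excerpt the widest pause sits between the 20:01:46 allocator entry and the 20:02:00 "disconnected" entry, which is where the test appears to stall before the master terminates.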