Date: Sun, 1 Nov 2015 17:40:33 +0800
Subject: Re: Can't start docker container when SSL_ENABLED is on.
From: haosdent
To: user

@Xiaodong I created a ticket to track this, https://issues.apache.org/jira/browse/MESOS-3815, and posted a patch in it. Feel free to review and test it. Thank you!

On Sun, Nov 1, 2015 at 4:54 PM, haosdent wrote:

> Hi @Xiaodong, I could reproduce your problem in my testing today. A quick workaround is to add environment variables when you launch the slave.
>
> ```
> ./bin/mesos-slave.sh xxxx --containerizers=docker,mesos --executor_environment_variables='{"SSL_KEY_FILE": "/tmp/server.key", "SSL_CERT_FILE": "/tmp/ssl.chain.crt", "SSL_ENABLED": "true"}'
> ```
>
> As you can see above, this passes the SSL environment variables to the docker executor by specifying --executor_environment_variables when starting the slave. So far it works well for me. I will submit a patch later to fix how environment variables are passed to the docker executor. After that, you can launch the slave without the executor_environment_variables flag.
>
> On Sat, Oct 31, 2015 at 2:56 PM, Tim Chen wrote:
>
>> Hi Xiaodong,
>>
>> If you follow the reviewboard you'll see that the fix is not correct. I believe Jojy will be posting a new patch.
>>
>> Tim
>>
>> On Fri, Oct 30, 2015 at 6:58 PM, Xiaodong Zhang wrote:
>>
>>> It is still not working!
>>>
>>> It works only if I remove SSL_ENABLED from the environment before I start the slave.
>>>
>>> I applied the patch to version 0.24.1 and rebuilt it with `--enable-libevent --enable-ssl`.
>>>
>>> From: Xiaodong Zhang
>>> Date: Saturday, October 31, 2015, 7:45 AM
>>> To: "user@mesos.apache.org"
>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>
>>> Thanks Jojy.
>>>
>>> I will apply this patch to version 0.24.1 and rebuild it. I will let you know whether it works after I finish testing.
>>>
>>> From: Jojy Varghese
>>> Reply-To: "user@mesos.apache.org"
>>> Date: Saturday, October 31, 2015, 12:45 AM
>>> To: "user@mesos.apache.org"
>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>
>>> Thanks Xiaodong.
>>>
>>> Based on the hypothesis that the container process launched with SSL_ENABLED in its environment is the problem, I have created a patch: https://reviews.apache.org/r/39818/. This might be a quick and dirty way to test the hypothesis. Would it be possible for you to test again after applying the patch?
>>>
>>> -Jojy
>>>
>>> On Oct 30, 2015, at 8:29 AM, Xiaodong Zhang wrote:
>>>
>>> Thanks @Jojy
>>>
>>> Flags at startup: --appc_store_dir="/tmp/mesos/store/appc"
>>> --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false"
>>> --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup"
>>> --cgroups_limit_swap="false" --cgroups_root="mesos"
>>> --container_disk_watch_interval="15secs" --containerizers="docker,mesos"
>>> --credential="/etc/mesos-slave-auth" --default_role="*"
>>> --disk_watch_interval="1mins" --docker="/usr/bin/docker"
>>> --docker_kill_orphans="true" --docker_remove_delay="6hrs"
>>> --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns"
>>> --enforce_container_disk_quota="false"
>>> --executor_registration_timeout="1hrs"
>>> --executor_shutdown_grace_period="5secs"
>>> --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB"
>>> --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1"
>>> --hadoop_home="" --help="false" --initialize_driver_logging="true"
>>> --isolation="posix/cpu,posix/mem" --launcher_dir="/usr/libexec/mesos"
>>> --log_dir="/var/log/mesos" --logbufsecs="0" --logging_level="INFO"
>>> --master="zk://172.31.43.77:2181,172.31.44.2:2181,172.31.36.91:2181/mesos"
>>> --oversubscribed_resources_interval="15secs" --perf_duration="10secs"
>>> --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns"
>>> --quiet="false" --recover="reconnect" --recovery_timeout="15mins"
>>> --registration_backoff_factor="1secs"
>>> --resource_monitoring_interval="1secs" --revocable_cpu_low_priority="true"
>>> --sandbox_directory="/mnt/mesos/sandbox" --strict="true"
>>> --switch_user="true" --version="false" --work_dir="/tmp/mesos"
>>>
>>> From: Jojy Varghese
>>> Reply-To: "user@mesos.apache.org"
>>> Date: Friday, October 30, 2015, 11:17 PM
>>> To: "user@mesos.apache.org"
>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>
>>> Hi Xiaodong,
>>> This might be because the executor inherits the SSL environment variables of the slave and thus expects an SSL key password to launch. Could you please add the part of the slave logs that says "Flags at startup" so that we can have more information?
>>>
>>> Thanks,
>>> Jojy
>>>
>>> On Oct 29, 2015, at 8:55 PM, Xiaodong Zhang wrote:
>>>
>>> Thanks a lot! @haosdent
>>>
>>> From: haosdent
>>> Reply-To: "user@mesos.apache.org"
>>> Date: Friday, October 30, 2015, 11:45 AM
>>> To: user
>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>
>>> Hi @Xiaodong, I am interested in your problem, but I have not had enough time recently to try to reproduce it. I think I can dig into it this Sunday and give you feedback.
>>>
>>> On Fri, Oct 30, 2015 at 11:30 AM, Xiaodong Zhang wrote:
>>>
>>>> Does anybody know about this?
>>>>
>>>> From: Xiaodong Zhang
>>>> Reply-To: "user@mesos.apache.org"
>>>> Date: Thursday, October 29, 2015, 7:38 PM
>>>> To: "user@mesos.apache.org"
>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>
>>>> I think it is easy to reproduce this error.
>>>>
>>>> Start the master with these environment variables:
>>>>
>>>> SSL_SUPPORT_DOWNGRADE
>>>> SSL_ENABLED
>>>> SSL_KEY_FILE
>>>> SSL_CERT_FILE
>>>>
>>>> Start the slave with these environment variables:
>>>>
>>>> SSL_ENABLED
>>>> SSL_KEY_FILE
>>>> SSL_CERT_FILE
>>>> LIBPROCESS_ADVERTISE_IP
>>>>
>>>> Then run a docker task via Marathon.
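
Both the inheritance hypothesis and the reproduction steps above turn on which environment variables a process started by the slave can see. The following is only a rough shell sketch of that general mechanism, with a plain `sh` child standing in for an executor. No Mesos code is involved, and the helper name `child_view` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical stand-in for an executor: report whether SSL_ENABLED is
# visible in the environment of a freshly started child process.
child_view() {
    sh -c 'echo "SSL_ENABLED=${SSL_ENABLED:-unset}"'
}

# Slave-side setting taken from the reproduction steps.
export SSL_ENABLED=true

# 1. A child started normally inherits the exported variable.
child_view

# 2. A child started with an emptied environment does not see it.
env -i /bin/sh -c 'echo "SSL_ENABLED=${SSL_ENABLED:-unset}"'

# 3. Passing the variable explicitly makes it visible to that child only.
env -i SSL_ENABLED="$SSL_ENABLED" /bin/sh -c 'echo "SSL_ENABLED=${SSL_ENABLED:-unset}"'
```

Whether the docker executor in Mesos 0.24.1 behaves like case 1 or case 2 is exactly what this thread is trying to establish, so the sketch illustrates the question rather than answering it.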
>>>>
>>>> From: Xiaodong Zhang
>>>> Date: Thursday, October 29, 2015, 3:09 PM
>>>> To: "user@mesos.apache.org"
>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>
>>>> So now, Mesos tasks work well but docker tasks don't.
>>>>
>>>> From: Xiaodong Zhang
>>>> Reply-To: "user@mesos.apache.org"
>>>> Date: Thursday, October 29, 2015, 2:08 PM
>>>> To: "user@mesos.apache.org"
>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>
>>>> I ran a task via Marathon:
>>>>
>>>> {
>>>>   "id": "basic-0",
>>>>   "cmd": "while [ true ] ; do echo 'Hello Marathon' ; sleep 5 ; done",
>>>>   "cpus": 0.1,
>>>>   "mem": 10.0,
>>>>   "instances": 1
>>>> }
>>>>
>>>> It works well.
>>>>
>>>> <742629F2-78E8-43F2-9015-F3D22720826B.png>
>>>>
>>>> The docker task can pull the image but can't run, as I mentioned.
>>>>
>>>> My docker version is 1.5.0.
>>>>
>>>> From: Tim Chen
>>>> Reply-To: "user@mesos.apache.org"
>>>> Date: Thursday, October 29, 2015, 1:48 PM
>>>> To: "user@mesos.apache.org"
>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>
>>>> Does running a task without a docker container (Mesos containerizer) work with SSL in your environment?
>>>>
>>>> Tim
>>>>
>>>> On Wed, Oct 28, 2015 at 10:19 PM, Xiaodong Zhang wrote:
>>>>
>>>>> Thanks a lot. I found the log files on the slave.
>>>>>
>>>>> For one of the tasks:
>>>>>
>>>>> Stdout:
>>>>>
>>>>> --container="mesos-20151029-043755-3549436724-5050-5674-S0.e2c2580f-8082-4f17-b0cc-4e32e040d444"
>>>>> --docker="/home/ubuntu/luna/bin/docker" --help="false"
>>>>> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO"
>>>>> --mapped_directory="/mnt/mesos/sandbox" --quiet="false"
>>>>> --sandbox_directory="/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444"
>>>>> --stop_timeout="0ns"
>>>>> --container="mesos-20151029-043755-3549436724-5050-5674-S0.e2c2580f-8082-4f17-b0cc-4e32e040d444"
>>>>> --docker="/home/ubuntu/luna/bin/docker" --help="false"
>>>>> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO"
>>>>> --mapped_directory="/mnt/mesos/sandbox" --quiet="false"
>>>>> --sandbox_directory="/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444"
>>>>> --stop_timeout="0ns"
>>>>> Shutting down
>>>>>
>>>>> Stderr:
>>>>>
>>>>> I1029 05:14:06.529364 27862 fetcher.cpp:414] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/20151029-043755-3549436724-5050-5674-S0","items":[{"action":"BYPASS_CACHE","uri":{"extract":false,"value":"file:\/\/\/etc\/.dockercfg"}}],"sandbox_directory":"\/tmp\/mesos\/slaves\/20151029-043755-3549436724-5050-5674-S0\/frameworks\/20151029-043755-3549436724-5050-5674-0000\/executors\/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f\/runs\/e2c2580f-8082-4f17-b0cc-4e32e040d444"}
>>>>> I1029 05:14:06.530562 27862 fetcher.cpp:369] Fetching URI 'file:///etc/.dockercfg'
>>>>> I1029 05:14:06.530580 27862 fetcher.cpp:243] Fetching directly into the sandbox directory
>>>>> I1029 05:14:06.530594 27862 fetcher.cpp:180] Fetching URI 'file:///etc/.dockercfg'
>>>>> I1029 05:14:06.530609 27862 fetcher.cpp:160] Copying resource with command:cp '/etc/.dockercfg' '/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444/.dockercfg'
>>>>> I1029 05:14:06.532165 27862 fetcher.cpp:446] Fetched 'file:///etc/.dockercfg' to '/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444/.dockercfg'
>>>>> I1029 05:14:07.782054 27955 exec.cpp:133] Version: 0.24.1
>>>>> I1029 05:14:07.785039 27963 exec.cpp:462] Slave exited ... shutting down
>>>>> E1029 05:14:07.785158 27964 socket.hpp:174] Shutdown failed on fd=7: Transport endpoint is not connected [107]
>>>>>
>>>>> From: haosdent
>>>>> Reply-To: "user@mesos.apache.org"
>>>>> Date: Thursday, October 29, 2015, 1:13 PM
>>>>> To: user
>>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>>
>>>>> <5185_02_04.png>
>>>>> <5185_02_07.png>
>>>>>
>>>>> I captured how I find task logs in my local web UI. Could you find the stderr and stdout for your tasks by following the screenshots above?
>>>>>
>>>>> On Thu, Oct 29, 2015 at 1:07 PM, Xiaodong Zhang wrote:
>>>>>
>>>>>> I didn't see any useful info.
>>>>>>
>>>>>> In the Mesos slave log, there is this line:
>>>>>> I1029 03:29:53.160143 9292 slave.cpp:3399] Executor '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713' of framework 20151029-031549-1294671788-5050-4937-0000 terminated with signal Killed
>>>>>>
>>>>>> I checked a log from a normal run; it shows:
>>>>>>
>>>>>> I1014 15:22:21.276007 23163 slave.cpp:3326] Executor 'ffc08dce-997f-41f7-9b03-57c1b4bc1f85.47ed02aa-7285-11e5-80d7-000d3a8033de' of framework 20150814-115157-1677721866-5050-6185-0000 exited with status 0
>>>>>>
>>>>>> Is this helpful?
>>>>>>
>>>>>> From: Xiaodong Zhang
>>>>>> Reply-To: "user@mesos.apache.org"
>>>>>> Date: Thursday, October 29, 2015, 12:59 PM
>>>>>> To: "user@mesos.apache.org"
>>>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>>>
>>>>>> <9D46724C-457C-4BE1-B0E4-F57B147F6DC8.png>
>>>>>>
>>>>>> The web UI has a LOG link. Clicking it shows this:
>>>>>>
>>>>>> I1029 04:44:32.293445 5697 http.cpp:321] HTTP GET for /master/state.json from 114.113.20.135:55682 with User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36'
>>>>>> I1029 04:44:34.533504 5704 master.cpp:4613] Sending 1 offers to framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
>>>>>> I1029 04:44:34.539579 5702 master.cpp:2739] Processing ACCEPT call for offers: [ 20151029-043755-3549436724-5050-5674-O2 ] on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com) for framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
>>>>>> I1029 04:44:34.539710 5702 hierarchical.hpp:814] Recovered cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000] (total: cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: ) on slave 20151029-043755-3549436724-5050-5674-S0 from framework 20151029-043755-3549436724-5050-5674-0000
>>>>>> I1029 04:44:37.360901 5703 master.cpp:4294] Performing implicit task state reconciliation for framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
>>>>>> I1029 04:44:40.539989 5704 master.cpp:4613] Sending 1 offers to framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
>>>>>> I1029 04:44:40.610321 5702 master.cpp:2739] Processing ACCEPT call for offers: [ 20151029-043755-3549436724-5050-5674-O3 ] on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com) for framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
>>>>>> I1029 04:44:40.610846 5702 master.hpp:170] Adding task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] on slave 20151029-043755-3549436724-5050-5674-S0 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com)
>>>>>> I1029 04:44:40.610911 5702 master.cpp:3069] Launching task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373 with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com)
>>>>>> I1029 04:44:40.611095 5702 hierarchical.hpp:814] Recovered cpus(*):0.9375; mem(*):743; disk(*):3962; ports(*):[31000-31863, 31865-32000] (total: cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: cpus(*):0.0625; mem(*):256; ports(*):[31864-31864]) on slave 20151029-043755-3549436724-5050-5674-S0 from framework 20151029-043755-3549436724-5050-5674-0000
>>>>>> I1029 04:44:43.324970 5698 http.cpp:321] HTTP GET for /master/state.json from 114.113.20.135:55682 with User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36'
>>>>>> I1029 04:44:46.546671 5703 master.cpp:4613] Sending 1 offers to framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
>>>>>> I1029 04:44:46.557266 5699 master.cpp:2739] Processing ACCEPT call for offers: [ 20151029-043755-3549436724-5050-5674-O4 ] on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com) for framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
>>>>>> I1029 04:44:46.557394 5699 hierarchical.hpp:814] Recovered cpus(*):0.9375; mem(*):743; disk(*):3962; ports(*):[31000-31863, 31865-32000] (total: cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: cpus(*):0.0625; mem(*):256; ports(*):[31864-31864]) on slave 20151029-043755-3549436724-5050-5674-S0 from framework 20151029-043755-3549436724-5050-5674-0000
>>>>>> I1029 04:44:47.267562 5700 master.cpp:4069] Status update TASK_FAILED (UUID: 0ea607fc-bf24-4bda-b107-55a54aba31cf) for task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000 from slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com)
>>>>>> I1029 04:44:47.267645 5700 master.cpp:4108] Forwarding status update TASK_FAILED (UUID: 0ea607fc-bf24-4bda-b107-55a54aba31cf) for task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000
>>>>>> I1029 04:44:47.267774 5700 master.cpp:5576] Updating the latest state of task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000 to TASK_FAILED
>>>>>> I1029 04:44:47.267907 5700 hierarchical.hpp:814] Recovered cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] (total: cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: ) on slave 20151029-043755-3549436724-5050-5674-S0 from framework 20151029-043755-3549436724-5050-5674-0000
>>>>>> I1029 04:44:47.289356 5698 master.cpp:5644] Removing task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] of framework 20151029-043755-3549436724-5050-5674-0000 on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com)
>>>>>> I1029 04:44:47.289459 5698 master.cpp:3398] Processing ACKNOWLEDGE call 0ea607fc-bf24-4bda-b107-55a54aba31cf for task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373 on slave 20151029-043755-3549436724-5050-5674-S0
>>>>>>
>>>>>> From: haosdent
>>>>>> Reply-To: "user@mesos.apache.org"
>>>>>> Date: Thursday, October 29, 2015, 12:02 PM
>>>>>> To: user
>>>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>>>
>>>>>> Oh, I mean your task logs. You can get them from the Mesos web UI.
>>>>>>
>>>>>> On Thu, Oct 29, 2015 at 11:52 AM, Xiaodong Zhang wrote:
>>>>>>
>>>>>>> Thanks for your reply.
>>>>>>>
>>>>>>> Yes, I built Mesos with `--enable-libevent --enable-ssl`. If I don't provide the key and pem when starting the slave, it fails to register. (That means SSL works, right?)
>>>>>>>
>>>>>>> As I said, the odd thing is that the container never runs (`docker ps -a` shows nothing), so it can't have any stdout or stderr.
>>>>>>>
>>>>>>> From: haosdent
>>>>>>> Reply-To: "user@mesos.apache.org"
>>>>>>> Date: Thursday, October 29, 2015, 11:47 AM
>>>>>>> To: user
>>>>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>>>>
>>>>>>> Did you compile Mesos with SSL support? The default build doesn't include SSL. And does the docker container have stdout and stderr?
>>>>>>>
>>>>>>> On Thu, Oct 29, 2015 at 11:41 AM, Xiaodong Zhang wrote:
>>>>>>>
>>>>>>>> My scenario is as described in my previous email: the masters and slaves are in different IaaS providers. Now the slaves can register with the masters with SSL_ENABLED on.
>>>>>>>>
>>>>>>>> But I have run into another problem.
>>>>>>>> Slaves can't run containers. (The odd thing is that they can pull the image successfully but cannot run the container; `docker ps -a` lists nothing.)
>>>>>>>>
>>>>>>>> The logs look like this:
>>>>>>>>
>>>>>>>> I1029 03:29:45.967741 9288 docker.cpp:758] Starting container 'd4f4e236-0d0a-492c-86df-eef48a414e23' for task '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713' (and executor '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713') of framework '20151029-031549-1294671788-5050-4937-0000'
>>>>>>>> I1029 03:29:48.044148 9292 docker.cpp:382] Checkpointing pid 12062 to '/tmp/mesos/meta/slaves/20151029-031549-1294671788-5050-4937-S0/frameworks/20151029-031549-1294671788-5050-4937-0000/executors/279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713/runs/d4f4e236-0d0a-492c-86df-eef48a414e23/pids/forked.pid'
>>>>>>>> I1029 03:29:53.159361 9292 docker.cpp:1576] Executor for container 'd4f4e236-0d0a-492c-86df-eef48a414e23' has exited
>>>>>>>> I1029 03:29:53.159572 9292 docker.cpp:1374] Destroying container 'd4f4e236-0d0a-492c-86df-eef48a414e23'
>>>>>>>> I1029 03:29:53.159822 9292 docker.cpp:1478] Running docker stop on container 'd4f4e236-0d0a-492c-86df-eef48a414e23'
>>>>>>>> I1029 03:29:53.160143 9292 slave.cpp:3399] Executor '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713' of framework 20151029-031549-1294671788-5050-4937-0000 terminated with signal Killed
>>>>>>>> I1029 03:29:53.160884 9292 slave.cpp:2696] Handling status update TASK_FAILED (UUID: 27a2080a-8807-449e-9077-837ec45b4c51) for task 279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713 of framework 20151029-031549-1294671788-5050-4937-0000 from @0.0.0.0:0
>>>>>>>> W1029 03:29:53.161247 9288 docker.cpp:986] Ignoring updating unknown container: d4f4e236-0d0a-492c-86df-eef48a414e23
>>>>>>>> I1029 03:29:53.161548 9293 status_update_manager.cpp:322] Received status update TASK_FAILED (UUID: 27a2080a-8807-449e-9077-837ec45b4c51) for task 279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713 of framework 20151029-031549-1294671788-5050-4937-0000
>>>>>>>>
>>>>>>>> I run the master node with these environment variables:
>>>>>>>>
>>>>>>>> SSL_SUPPORT_DOWNGRADE=true
>>>>>>>> SSL_ENABLED=true
>>>>>>>> SSL_KEY_FILE=/home/ubuntu/xx.key
>>>>>>>> SSL_CERT_FILE=/home/ubuntu/xx.pem
>>>>>>>>
>>>>>>>> And the slave node with these:
>>>>>>>>
>>>>>>>> SSL_ENABLED=true
>>>>>>>> SSL_KEY_FILE=/home/ubuntu/xx.key
>>>>>>>> SSL_CERT_FILE=/home/ubuntu/xx.pem
>>>>>>>> LIBPROCESS_ADVERTISE_IP=xxx.xxx.xxx.xxx
>>>>>>>>
>>>>>>>> When I remove all the SSL environment variables, the slaves work well.
>>>>>>>>
>>>>>>>> Did I miss something?
>>>>>>>>
>>>>>>>> Versions:
>>>>>>>>
>>>>>>>> Mesos 0.24.1
>>>>>>>> Marathon 0.9.2
>>>>>>>>
>>>>>>>> OS:
>>>>>>>> Ubuntu 14.04
>>>>>>>>
>>>>>>>> From: Anindya Sinha
>>>>>>>> Reply-To: "user@mesos.apache.org"
>>>>>>>> Date: Wednesday, October 28, 2015, 2:32 PM
>>>>>>>> To: "user@mesos.apache.org"
>>>>>>>> Subject: Re: How to tell master which ip to connect.
>>>>>>>>
>>>>>>>> On Tue, Oct 27, 2015 at 7:43 PM, Xiaodong Zhang wrote:
>>>>>>>>
>>>>>>>>> It works! Thanks a lot.
>>>>>>>>
>>>>>>>> OK. So should we expose advertise_ip and advertise_port as command-line options for mesos-slave as well (instead of using the environment variables)? Opened https://issues.apache.org/jira/browse/MESOS-3809.
>>>>>>>>
>>>>>>>>> Another question: do masters and slaves communicate with each other in a secure way? Is the data encrypted?
>>>>>>>>> I want to make sure that deploying masters and slaves in different IaaS providers is production-ready.
>>>>>>>>>
>>>>>>>>> From: haosdent
>>>>>>>>> Reply-To: "user@mesos.apache.org"
>>>>>>>>> Date: Wednesday, October 28, 2015, 10:23 AM
>>>>>>>>> To: user
>>>>>>>>> Subject: Re: How to tell master which ip to connect.
>>>>>>>>>
>>>>>>>>> Did you try `export LIBPROCESS_ADVERTISE_IP=xxx` and `LIBPROCESS_ADVERTISE_PORT` when starting the slave?
>>>>>>>>>
>>>>>>>>> On Wed, Oct 28, 2015 at 10:16 AM, Xiaodong Zhang <xdzhang@alauda.io> wrote:
>>>>>>>>>
>>>>>>>>>> Hi team,
>>>>>>>>>>
>>>>>>>>>> My scenario is like this:
>>>>>>>>>>
>>>>>>>>>> My master nodes are deployed in AWS and my slaves are in Azure, so they communicate via public IPs.
>>>>>>>>>> I ran into trouble when the slaves try to register with the master.
>>>>>>>>>> The slaves can get the master's public IP address and can send a register request, but they can only send their private IP to the master. (They don't know their public IP, so they can't bind to a public IP via the --ip flag.) As a result, the masters can't connect to the slaves. How can a slave tell the master which IP the master should connect to? (I can't find any flag like --advertise_ip in the master.)

-- 
Best Regards,
Haosdent Huang
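
A minimal shell sketch of the workaround from the top of this thread, assuming the slave's SSL settings are already held in shell variables. The helper name `executor_env_json` is hypothetical and not part of Mesos; only `--containerizers` and `--executor_environment_variables` come from the command quoted above, and `xxxx` is kept as the same placeholder for the remaining flags. The sketch prints the command instead of running it:

```shell
#!/bin/sh
# Hypothetical helper (not part of Mesos): print the SSL_* variables of the
# current shell as a JSON object for --executor_environment_variables.
# Assumes the values contain no double quotes or backslashes.
executor_env_json() {
    printf '{"SSL_KEY_FILE": "%s", "SSL_CERT_FILE": "%s", "SSL_ENABLED": "%s"}\n' \
        "${SSL_KEY_FILE}" "${SSL_CERT_FILE}" "${SSL_ENABLED}"
}

# Example values taken from the workaround command in this thread.
SSL_ENABLED=true
SSL_KEY_FILE=/tmp/server.key
SSL_CERT_FILE=/tmp/ssl.chain.crt

# Print the slave command line rather than executing it.
echo "./bin/mesos-slave.sh xxxx --containerizers=docker,mesos --executor_environment_variables='$(executor_env_json)'"
```

Generating the JSON from the same variables the slave uses keeps the two sets of paths from drifting apart, which matters only until the fix tracked in MESOS-3815 makes the flag unnecessary.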
@Xiaodong I create a ticket to trace this https://issues.apache.org/jira= /browse/MESOS-3815 and post a patch in it. Feel free to review and test= it together. Thank you!

On Sun, Nov 1, 2015 at 4:54 PM, haosdent <= ;haosdent@gmail.com= > wrote:
H= i, @Xiaodong I could reproduce your problem in my testing today. A quickly = workaround is adding environment variables when you launch slave.

<= /div>
```
./bin/mesos-slave.sh xxxx --containerizers=3Ddocker= ,mesos --executor_environment_variables=3D'{"SSL_KEY_FILE": &= quot;/tmp/server.key", "SSL_CERT_FILE": "/tmp/ssl.chain= .crt", "SSL_ENABLED": "true"}''
<= div>```

As shown above, pass the SSL environment variables to the docker executor by specifying --executor_environment_variables when starting the slave. So far it works well for me. I will submit a patch later to fix how the environment variables are passed to the docker executor. After that, you can launch the slave without the executor_environment_variables flag.
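The JSON value of that flag is easy to get wrong when quoting by hand, so here is a minimal sketch that generates it. The helper name `executor_env_flag` is hypothetical; only the flag name and the three variables come from the workaround, and the file paths are the example paths used there.

```python
import json
import shlex


def executor_env_flag(key_file, cert_file, ssl_enabled=True):
    """Build the --executor_environment_variables argument for mesos-slave.

    Hypothetical helper: it only formats the flag, it does not start anything.
    """
    env = {
        "SSL_KEY_FILE": key_file,
        "SSL_CERT_FILE": cert_file,
        "SSL_ENABLED": "true" if ssl_enabled else "false",
    }
    # Serialize to JSON, then shell-quote so the value stays one argument.
    value = json.dumps(env, sort_keys=True)
    return "--executor_environment_variables=" + shlex.quote(value)


flag = executor_env_flag("/tmp/server.key", "/tmp/ssl.chain.crt")
print(flag)
```

The printed string can be appended to the slave command line. Going through `shlex.quote` avoids unbalanced quotes around the JSON.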

On Sat, Oct 31, 2015 at 2:56 PM, Tim Chen <tim@mesosphere.io> wrote:
Hi Xiaodong,

If you follow the Review Board discussion you'll see that the fix is not correct. I believe Jojy will be posting a new patch.

Tim

On Fri, Oct 30, 2015 at 6:58 PM, Xiaodong Zhang <xdzhang@alauda.io> wrote:
It is still not working!

It only works well if I remove SSL_ENABLED from the envs before I start the slave.

I applied the patch to version 0.24.1 and rebuilt it with `--enable-libevent --enable-ssl`.

From: Xiaodong Zhang <xdzhang@alauda.io>
Date: Saturday, October 31, 2015, 7:45 AM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

Thanks Jojy.

I will apply this patch to version 0.24.1 and rebuild it. I will let you know whether it works after I finish testing.

From: Jojy Varghese <jojy@mesosphere.io>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Saturday, October 31, 2015, 12:45 AM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

Thanks Xiaodong.

Based on the hypothesis that the container process launched with SSL_ENABLED in its environment is the problem, I have created a patch: https://reviews.apache.org/r/39818/. This might be a quick and dirty way to test the hypothesis. Would it be possible for you to test again after applying the patch?

-Jojy



On Oct 30, 2015, at 8:29 AM, Xiaodong Zhang <xdzhang@alauda.io> wrote:

Thanks @Jojy



Flags at startup: --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="docker,mesos" --credential="/etc/mesos-slave-auth" --default_role="*" --disk_watch_interval="1mins" --docker="/usr/bin/docker" --docker_kill_orphans="true" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --enforce_container_disk_quota="false" --executor_registration_timeout="1hrs" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher_dir="/usr/libexec/mesos" --log_dir="/var/log/mesos" --logbufsecs="0" --logging_level="INFO" --master="zk://172.31.43.77:2181,172.31.44.2:2181,172.31.36.91:2181/mesos" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --resource_monitoring_interval="1secs" --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --version="false" --work_dir="/tmp/mesos"

From: Jojy Varghese <jojy@mesosphere.io>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Friday, October 30, 2015, 11:17 PM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

Hi Xiaodong,
This might be because the executor inherits the SSL environment variables of the slave and thus expects the SSL key password in order to launch. Could you please add the part of the slave logs that says "Flags at startup" so that we can have more information?
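To make the inheritance hypothesis concrete, here is a small sketch of ordinary process behavior, not of Mesos internals: a child process sees whatever environment its parent passes on, so SSL_* variables set for the slave reach anything it launches unless they are filtered out first. The helper names are hypothetical, and the Python interpreter stands in for the executor.

```python
import os
import subprocess
import sys


def without_ssl_vars(env):
    """Return a copy of env with every SSL_* variable removed."""
    return {k: v for k, v in env.items() if not k.startswith("SSL_")}


def child_sees_ssl(env):
    """Start a child process with env and report whether it sees SSL_ENABLED."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print('SSL_ENABLED' in os.environ)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


# Stand-in for the slave's environment with SSL switched on.
parent_env = dict(os.environ, SSL_ENABLED="true", SSL_KEY_FILE="/tmp/server.key")

inherited = child_sees_ssl(parent_env)                   # variable is passed on
stripped = child_sees_ssl(without_ssl_vars(parent_env))  # variable filtered out
```

If the hypothesis holds, the second case corresponds to an executor started without the slave's SSL settings.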

thanks
Jojy


On Oct 29, 2015, at 8:55 PM, Xiaodong Zhang <xdzhang@alauda.io> wrote:

Thanks a lot! @haosdent

From: haosdent <haosdent@gmail.com>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Friday, October 30, 2015, 11:45 AM
To: user <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

Hi, @Xiaodong I am interested in your problem, but in recent days I haven't had enough time to try to reproduce it. I think I can dig into it this Sunday and give you feedback.

On Fri, Oct 30, 2015 at 11:30 AM, Xiaodong Zhang <xdzhang@alauda.io> wrote:
Does anybody know about this?

From: Xiaodong Zhang <xdzhang@alauda.io>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Thursday, October 29, 2015, 7:38 PM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

I think it is easy to reproduce this error.

Start master with env:

SSL_SUPPORT_DOWNGRADE
SSL_ENABLED
SSL_KEY_FILE
SSL_CERT_FILE

Start slave with env:

SSL_ENABLED
SSL_KEY_FILE
SSL_CERT_FILE
LIBPROCESS_ADVERTISE_IP


Then run a docker task via marathon.

From: Xiaodong Zhang <xdzhang@alauda.io>
Date: Thursday, October 29, 2015, 3:09 PM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

So now, Mesos tasks work well but Docker tasks don't.

From: Xiaodong Zhang <xdzhang@alauda.io>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Thursday, October 29, 2015, 2:08 PM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

I ran a task via Marathon:

{
    "id": "basic-0",
    "cmd": "while [ true ] ; do echo 'Hello Marathon' ; sleep 5 ; done",
    "cpus": 0.1,
    "mem": 10.0,
    "instances": 1
}

It works well.
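For comparison, here is a minimal sketch of how the working shell app above would differ from a Docker app definition. The `container` block follows Marathon's app JSON layout as I understand it. The image name `busybox`, the app id `basic-docker-0`, and both helper names are assumptions for illustration, not values from this thread.

```python
import json


def shell_app(app_id, cmd, cpus=0.1, mem=10.0, instances=1):
    """App definition run by the Mesos containerizer (no container block)."""
    return {"id": app_id, "cmd": cmd, "cpus": cpus, "mem": mem, "instances": instances}


def docker_app(app_id, cmd, image, **kwargs):
    """Same app, but asking Marathon to run it inside a Docker image."""
    app = shell_app(app_id, cmd, **kwargs)
    app["container"] = {"type": "DOCKER", "docker": {"image": image}}
    return app


CMD = "while [ true ] ; do echo 'Hello Marathon' ; sleep 5 ; done"

basic = shell_app("basic-0", CMD)
dockerized = docker_app("basic-docker-0", CMD, image="busybox")  # hypothetical image

print(json.dumps(dockerized, indent=4))
```

Submitting both variants with identical resources makes it easier to see that only the Docker path fails.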

<742629F2-78E8-43F2-9015-F3D22720826B.png>

The Docker task can pull the image but can't run, as I mentioned.

My Docker version is 1.5.0.

From: Tim Chen <tim@mesosphere.io>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Thursday, October 29, 2015, 1:48 PM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

Does running a task without a Docker container (Mesos containerizer) work with SSL in your environment?

Tim

On Wed, Oct 28, 2015 at 10:19 PM, Xiaodong Zhang <xdzhang@alauda.io> wrote:
Thanks a lot. I found the log file on the slave.

One of the task:

Stdout:

--container="mesos-20151029-043755-3549436724-5050-5674-S0.e2c2580f-8082-4f17-b0cc-4e32e040d444" --docker="/home/ubuntu/luna/bin/docker" --help="false" --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444" --stop_timeout="0ns"
--container="mesos-20151029-043755-3549436724-5050-5674-S0.e2c2580f-8082-4f17-b0cc-4e32e040d444" --docker="/home/ubuntu/luna/bin/docker" --help="false" --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444" --stop_timeout="0ns"
Shutting down

Stderr:

I1029 05:14:06.529364 27862 fetcher.cpp:414] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/20151029-043755-3549436724-5050-5674-S0","items":[{"action":"BYPASS_CACHE","uri":{"extract":false,"value":"file:\/\/\/etc\/.dockercfg"}}],"sandbox_directory":"\/tmp\/mesos\/slaves\/20151029-043755-3549436724-5050-5674-S0\/frameworks\/20151029-043755-3549436724-5050-5674-0000\/executors\/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f\/runs\/e2c2580f-8082-4f17-b0cc-4e32e040d444"}
I1029 05:14:06.530562 27862 fetcher.cpp:369] Fetching URI 'file:///etc/.dockercfg'
I1029 05:14:06.530580 27862 fetcher.cpp:243] Fetching directly into the sandbox directory
I1029 05:14:06.530594 27862 fetcher.cpp:180] Fetching URI 'file:///etc/.dockercfg'
I1029 05:14:06.530609 27862 fetcher.cpp:160] Copying resource with command:cp '/etc/.dockercfg' '/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444/.dockercfg'
I1029 05:14:06.532165 27862 fetcher.cpp:446] Fetched 'file:///etc/.dockercfg' to '/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444/.dockercfg'
I1029 05:14:07.782054 27955 exec.cpp:133] Version: 0.24.1
I1029 05:14:07.785039 27963 exec.cpp:462] Slave exited ... shutting down
E1029 05:14:07.785158 27964 socket.hpp:174] Shutdown failed on fd=7: Transport endpoint is not connected [107]

From: haosdent <haosdent@gmail.com>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Thursday, October 29, 2015, 1:13 PM
To: user <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

<5185_02_04.png>
<5185_02_07.png>
I captured how I find task logs in my local web UI. Could you find the stderr and stdout for your tasks by following the screenshots above?

On Thu, Oct 29, 2015 at 1:07 PM, Xiaodong Zhang <xdzhang@alauda.io> wrote:
I didn't see any useful info.

In mesos slave log, there is a line :
I1029 03:29:53.160143  9292 slave.cpp:3399] Executor '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713' of framework 20151029-031549-1294671788-5050-4937-0000 terminated with signal Killed

I checked a log from a normal run, which shows:

I1014 15:22:21.276007 23163 slave.cpp:3326] Executor 'ffc08dce-997f-41f7-9b03-57c1b4bc1f85.47ed02aa-7285-11e5-80d7-000d3a8033de' of framework 20150814-115157-1677721866-5050-6185-0000 exited with status 0

Is this helpful?

From: Xiaodong Zhang <xdzhang@alauda.io>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Thursday, October 29, 2015, 12:59 PM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

<9D46724C-457C-4BE1-B0E4-F57B147F6DC8.png>

The web UI has a LOG link. Clicking it shows this:

I1029 04:44:32.293445  5697 http.cpp:321] HTTP GET for /master/state.json from 114.113.20.135:55682 with User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36'
I1029 04:44:34.533504  5704 master.cpp:4613] Sending 1 offers to framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
I1029 04:44:34.539579  5702 master.cpp:2739] Processing ACCEPT call for offers: [ 20151029-043755-3549436724-5050-5674-O2 ] on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com) for framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
I1029 04:44:34.539710  5702 hierarchical.hpp:814] Recovered cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000] (total: cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: ) on slave 20151029-043755-3549436724-5050-5674-S0 from framework 20151029-043755-3549436724-5050-5674-0000
I1029 04:44:37.360901  5703 master.cpp:4294] Performing implicit task state reconciliation for framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
I1029 04:44:40.539989  5704 master.cpp:4613] Sending 1 offers to framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
I1029 04:44:40.610321  5702 master.cpp:2739] Processing ACCEPT call for offers: [ 20151029-043755-3549436724-5050-5674-O3 ] on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com) for framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
I1029 04:44:40.610846  5702 master.hpp:170] Adding task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] on slave 20151029-043755-3549436724-5050-5674-S0 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com)
I1029 04:44:40.610911  5702 master.cpp:3069] Launching task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373 with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com)
I1029 04:44:40.611095  5702 hierarchical.hpp:814] Recovered cpus(*):0.9375; mem(*):743; disk(*):3962; ports(*):[31000-31863, 31865-32000] (total: cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: cpus(*):0.0625; mem(*):256; ports(*):[31864-31864]) on slave 20151029-043755-3549436724-5050-5674-S0 from framework 20151029-043755-3549436724-5050-5674-0000
I1029 04:44:43.324970  5698 http.cpp:321] HTTP GET for /master/state.json from 114.113.20.135:55682 with User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36'
I1029 04:44:46.546671  5703 master.cpp:4613] Sending 1 offers to framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
I1029 04:44:46.557266  5699 master.cpp:2739] Processing ACCEPT call for offers: [ 20151029-043755-3549436724-5050-5674-O4 ] on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com) for framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
I1029 04:44:46.557394  5699 hierarchical.hpp:814] Recovered cpus(*):0.9375; mem(*):743; disk(*):3962; ports(*):[31000-31863, 31865-32000] (total: cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: cpus(*):0.0625; mem(*):256; ports(*):[31864-31864]) on slave 20151029-043755-3549436724-5050-5674-S0 from framework 20151029-043755-3549436724-5050-5674-0000
I1029 04:44:47.267562  5700 master.cpp:4069] Status update TASK_FAILED (UUID: 0ea607fc-bf24-4bda-b107-55a54aba31cf) for task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000 from slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com)
I1029 04:44:47.267645  5700 master.cpp:4108] Forwarding status update TASK_FAILED (UUID: 0ea607fc-bf24-4bda-b107-55a54aba31cf) for task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000
I1029 04:44:47.267774  5700 master.cpp:5576] Updating the latest state of task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000 to TASK_FAILED
I1029 04:44:47.267907  5700 hierarchical.hpp:814] Recovered cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] (total: cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: ) on slave 20151029-043755-3549436724-5050-5674-S0 from framework 20151029-043755-3549436724-5050-5674-0000
I1029 04:44:47.289356  5698 master.cpp:5644] Removing task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] of framework 20151029-043755-3549436724-5050-5674-0000 on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com)
I1029 04:44:47.289459  5698 master.cpp:3398] Processing ACKNOWLEDGE call 0ea607fc-bf24-4bda-b107-55a54aba31cf for task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373 on slave 20151029-043755-3549436724-5050-5674-S0



From: haosdent <haosdent@gmail.com>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Thursday, October 29, 2015, 12:02 PM
To: user <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

Oh, I mean your task logs. You can get them from the Mesos web UI.

On Thu, Oct 29, 2015 at 11:52 AM, Xiaodong Zhang <xdzhang@alauda.io> wrote:
Thanks for your reply.

Yes, I built Mesos with `--enable-libevent --enable-ssl`. If I don't provide the key and pem when starting the slave, it fails to register. (That means SSL works well, right?)

As I said, the odd thing is that the container never runs (`docker ps -a` shows nothing), so it can't have any stdout or stderr.

From: haosdent <haosdent@gmail.com>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Thursday, October 29, 2015, 11:47 AM
To: user <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

Did you compile Mesos with SSL support? The default build doesn't include SSL. And does the Docker container have stdout and stderr?

On Thu, Oct 29, 2015 at 11:41 AM, Xiaodong Zhang <xdzhang@alauda.io> wrote:
My scenario is as described in my previous email: masters and slaves are in different IaaS. Now the slaves can register with the masters with SSL_ENABLED on.

But I hit another problem. Slaves can't run containers (the odd thing is that they can pull the image successfully but just cannot run the container; `docker ps -a` lists nothing).

The logs look like this:

I1029 03:29:45.967741  9288 docker.cpp:758] Starting container 'd4f4e236-0d0a-492c-86df-eef48a414e23' for task '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713' (and executor '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713') of framework '20151029-031549-1294671788-5050-4937-0000'
I1029 03:29:48.044148  9292 docker.cpp:382] Checkpointing pid 12062 to '/tmp/mesos/meta/slaves/20151029-031549-1294671788-5050-4937-S0/frameworks/20151029-031549-1294671788-5050-4937-0000/executors/279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713/runs/d4f4e236-0d0a-492c-86df-eef48a414e23/pids/forked.pid'
I1029 03:29:53.159361  9292 docker.cpp:1576] Executor for container 'd4f4e236-0d0a-492c-86df-eef48a414e23' has exited
I1029 03:29:53.159572  9292 docker.cpp:1374] Destroying container 'd4f4e236-0d0a-492c-86df-eef48a414e23'
I1029 03:29:53.159822  9292 docker.cpp:1478] Running docker stop on container 'd4f4e236-0d0a-492c-86df-eef48a414e23'
I1029 03:29:53.160143  9292 slave.cpp:3399] Executor '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713' of framework 20151029-031549-1294671788-5050-4937-0000 terminated with signal Killed
I1029 03:29:53.160884  9292 slave.cpp:2696] Handling status update TASK_FAILED (UUID: 27a2080a-8807-449e-9077-837ec45b4c51) for task 279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713 of framework 20151029-031549-1294671788-5050-4937-0000 from @0.0.0.0:0
W1029 03:29:53.161247  9288 docker.cpp:986] Ignoring updating unknown container: d4f4e236-0d0a-492c-86df-eef48a414e23
I1029 03:29:53.161548  9293 status_update_manager.cpp:322] Received status update TASK_FAILED (UUID: 27a2080a-8807-449e-9077-837ec45b4c51) for task 279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713 of framework 20151029-031549-1294671788-5050-4937-0000
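Logs like these are long, so it can help to filter them mechanically. Below is a minimal sketch of a parser for the glog-style prefix (severity letter, MMDD date, time, thread id, `file:line]`) that keeps only warnings, errors, and lines mentioning a failed task or a killed executor. The function names and the filter keywords are my own choices. The sample lines are copied from the slave log above.

```python
import re

# Severity, MMDD, HH:MM:SS.micros, thread id, source file, line number, message.
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_glog(line):
    """Return the fields of one glog line as a dict, or None if it does not match."""
    match = GLOG_LINE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["line"] = int(record["line"])
    return record


def suspicious(lines):
    """Keep warnings/errors and lines about failed tasks or killed executors."""
    keep = []
    for record in filter(None, map(parse_glog, lines)):
        msg = record["msg"]
        if record["level"] in "WEF" or "TASK_FAILED" in msg or "terminated with signal" in msg:
            keep.append(record)
    return keep


SAMPLE = [
    "I1029 03:29:53.159361  9292 docker.cpp:1576] Executor for container 'd4f4e236-0d0a-492c-86df-eef48a414e23' has exited",
    "I1029 03:29:53.160143  9292 slave.cpp:3399] Executor '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713' of framework 20151029-031549-1294671788-5050-4937-0000 terminated with signal Killed",
    "W1029 03:29:53.161247  9288 docker.cpp:986] Ignoring updating unknown container: d4f4e236-0d0a-492c-86df-eef48a414e23",
]

for record in suspicious(SAMPLE):
    print(record["level"], record["file"], record["line"], record["msg"])
```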

I run the master node with env:

SSL_SUPPORT_DOWNGRADE=true
SSL_ENABLED=true
SSL_KEY_FILE=/home/ubuntu/xx.key
SSL_CERT_FILE=/home/ubuntu/xx.pem

Slave node with env:

SSL_ENABLED=true
SSL_KEY_FILE=/home/ubuntu/xx.key
SSL_CERT_FILE=/home/ubuntu/xx.pem
LIBPROCESS_ADVERTISE_IP=xxx.xxx.xxx.xxx

When I remove all SSL envs, the slaves work well.

Did I miss something?

Version:

Mesos 0.24.1
Marathon 0.9.2

OS
ubuntu 14.04



From: Anindya Sinha <anindya.sinha@gmail.com>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Wednesday, October 28, 2015, 2:32 PM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: How to tell master which ip to connect.



On Tue, Oct 27, 2015 at 7:43 PM, Xiaodong Zhang <xdzhang@alauda.io> wrote:
It works! Thanks a lot.

Ok. So we should expose advertise_ip and advertise_port as command line options for mesos-slave as well (instead of using the environment variables)? Opened https://issues.apache.org/jira/browse/MESOS-3809.


Another question: do masters and slaves communicate with each other in a secure way? Is the data encrypted? I want to make sure deploying masters and slaves into different IaaS is PROD-READY.

From: haosdent <haosdent@gmail.com>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Wednesday, October 28, 2015, 10:23 AM
To: user <user@mesos.apache.org>
Subject: Re: How to tell master which ip to connect.

Did you try `export LIBPROCESS_ADVERTISE_IP=xxx` and `LIBPROCESS_ADVERTISE_PORT` when starting the slave?
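As a rough sketch of that suggestion, the environment for the slave process can be prepared before launching it. The two variable names are the ones mentioned in this thread. The helper name, the address `203.0.113.10` (a documentation-only IP), and the port value are placeholders chosen for illustration.

```python
import os


def slave_environment(advertise_ip, advertise_port=None, base=None):
    """Copy the base environment and add the libprocess advertise settings."""
    env = dict(os.environ if base is None else base)
    env["LIBPROCESS_ADVERTISE_IP"] = advertise_ip
    if advertise_port is not None:
        # Environment values must be strings.
        env["LIBPROCESS_ADVERTISE_PORT"] = str(advertise_port)
    return env


# Placeholder public address; replace with the slave's real public IP.
env = slave_environment("203.0.113.10", 5051, base={})
print(env)
```

The resulting dictionary can be passed as `env=` to `subprocess.Popen` when starting the slave script, so that the master is told the public address rather than the private one the slave binds to.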

On Wed, Oct 28, 2015 at 10:16 AM, Xiaodong Zhang <xdzhang@alauda.io> wrote:
Hi team:

My scenario is like this:

My master nodes are deployed in AWS. My slaves are in Azure, so they communicate via public IP.
I ran into trouble when the slaves try to register with the master.
The slaves can get the master's public IP address and can send the register request, but they can only send their private IP to the master (because they don't know their public IP, they can't bind a public IP via the --ip flag), so the masters can't connect to the slaves. How can a slave tell the master which IP it should connect to? (I can't find any flag like --advertise_ip in master.)



--
Best Regards,
Haosdent Huang




--
Best Regards,
Haosdent Huang



--
Best Regards,
Haosdent Huang



--
Best Regards,
Haosdent Huang




--
Best Regards,
Haosdent Huang
<5185_02_07.png><9D46724C-457C-4BE1-B0E4-F57B147F6DC8.png><742629F2-78E8-43F2-9015-F3D22720826B.png><5185_02_04.png>






--
Best Regards,
Haosdent Huang



--
Best Regards,
Haosdent Huang
--047d7ba97b3ec638110523777341--