mesos-user mailing list archives

From Whitney Sorenson <wsoren...@hubspot.com>
Subject Re: Jenkins mesos plugin failing
Date Thu, 07 Nov 2013 20:31:29 GMT
I should also point out the scheduler didn't seem to survive a reboot of
Jenkins - I had to delete the mesos cloud and reenter the parameters.
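
For reference, the -jnlpCredentials option mentioned in the quoted message
below goes on the slave.jar command line alongside -jnlpUrl. Here is a
minimal sketch in Python of what such a launch command could look like. The
helper name agent_launch_command, the agent name, and the user:pass value are
hypothetical placeholders, and this is not necessarily how the mesos plugin
builds its own command.

```python
import shlex


def agent_launch_command(jenkins_url, agent_name, credentials=None, jar="slave.jar"):
    """Build the argv for starting a JNLP agent with slave.jar.

    jenkins_url, agent_name and credentials are illustrative inputs;
    credentials is a "user:pass" string, or None for anonymous access.
    """
    # The agent fetches its launch description from this URL on the master.
    jnlp_url = f"{jenkins_url.rstrip('/')}/computer/{agent_name}/slave-agent.jnlp"
    argv = ["java", "-jar", jar, "-jnlpUrl", jnlp_url]
    if credentials is not None:
        # Needed when anonymous users lack the Slave/Connect permission.
        argv += ["-jnlpCredentials", credentials]
    return argv


if __name__ == "__main__":
    cmd = agent_launch_command(
        "http://localhost:8080/jenkins", "mesos-jenkins-example", "user:pass"
    )
    # Print a copy-pasteable shell command line.
    print(shlex.join(cmd))
```

Leaving credentials as None corresponds to the other option: granting
anonymous users permission to connect in Jenkins.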


On Thu, Nov 7, 2013 at 3:26 PM, Whitney Sorenson <wsorenson@hubspot.com> wrote:

> Looks like we're using authentication on our slaves. So you either need to
> pass
>
> -jnlpCredentials user:pass
>
> on the command line, or change around the permissions in Jenkins to allow
> anonymous users to connect/run jobs.
>
> I'm not sure if it would make sense or not to add the user/pass in the
> Jenkins plugin configuration screen or if it should be fetched another way.
>
>
>
>
> On Thu, Nov 7, 2013 at 2:52 PM, Vinod Kone <vinodkone@gmail.com> wrote:
>
>> Great. Let us know once you figure it out. Maybe I can add a FAQ to the
>> plugin's README to help others (or you can contribute too :)).
>>
>>
>> On Thu, Nov 7, 2013 at 11:40 AM, Whitney Sorenson <wsorenson@hubspot.com> wrote:
>>
>>> I added the jenkins user on the slave - this was the missing piece. I'll
>>> add this to my PR for the readme. Got much further now; now I'm getting a
>>> 403 on the fetch:
>>>
>>> /jenkins/computer/mesos-jenkins-6f4719c8-1c61-4b28-b5ab-ba298e846840/slave-agent.jnlp:
>>> 403 Forbidden at
>>> hudson.remoting.Launcher.parseJnlpArguments(Launcher.java:261) at
>>> hudson.remoting.Launcher.run(Launcher.java:215)
>>>
>>> and corresponding log on jenkins master:
>>>
>>> Nov 7, 2013 2:38:39 PM winstone.Logger logInternal INFO: While serving
>>> http://localhost:8080/jenkins/computer/mesos-jenkins-6f4719c8-1c61-4b28-b5ab-ba298e846840/slave-agent.jnlp:
>>> hudson.security.AccessDeniedException2: anonymous is missing the
>>> Slave/Connect permission
>>>
>>> Going to look into what this means.
>>>
>>>
>>>
>>> On Thu, Nov 7, 2013 at 2:21 PM, Vinod Kone <vinodkone@gmail.com> wrote:
>>>
>>>> I looked at the code and it looks like there are a few places the executor
>>>> might fail before it fetches the URI. Most of them have to do with
>>>> incorrect permissions. The code was written to have any errors reported
>>>> either in slave log or console or executor logs (there might be a bug here
>>>> if we are in fact swallowing errors). IIUC, the executor log directory is
>>>> empty in your case which suggests the executor died before it could even
>>>> create "stdout" or "stderr" files in its sandbox (Is this true?).
>>>>
>>>> Couple of questions:
>>>>
>>>> What user is Jenkins master running as? Is that user known to the host
>>>> on which mesos slave is running?
>>>>
>>>> How are you starting the mesos slave (e.g., cmd line flags)?
>>>>
>>>>
>>>>
>>>> On Thu, Nov 7, 2013 at 11:00 AM, Whitney Sorenson <
>>>> wsorenson@hubspot.com> wrote:
>>>>
>>>>> The gist was compiled from that log. Here is the complete log from
>>>>> toggling the jenkins plugin on / off (you see the ping statements
>>>>> in between):
>>>>>
>>>>> https://gist.github.com/wsorenson/8bf64e44fd42da354fa0
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Nov 7, 2013 at 1:57 PM, Vinod Kone <vinodkone@gmail.com> wrote:
>>>>>
>>>>>> What does mesos-slave.err say?
>>>>>>
>>>>>>
>>>>>> On Thu, Nov 7, 2013 at 10:49 AM, Whitney Sorenson <
>>>>>> wsorenson@hubspot.com> wrote:
>>>>>>
>>>>>>> Hi Vinod,
>>>>>>>
>>>>>>> It's 0.14.0-rc4 in both.
>>>>>>>
>>>>>>> I believe we have logging working:
>>>>>>>
>>>>>>> -rw-r--r-- 1 root root         0 Oct 22 23:48 mesos-slave.out
>>>>>>> lrwxrwxrwx 1 root root        63 Oct 22 23:48 mesos-slave.INFO -> mesos-slave.carousel.invalid-user.log.INFO.20131022-234823.5797
>>>>>>> lrwxrwxrwx 1 root root        66 Oct 22 23:49 mesos-slave.WARNING -> mesos-slave.carousel.invalid-user.log.WARNING.20131022-234954.5797
>>>>>>> drwxr-xr-x 2 root root      4096 Oct 22 23:49 .
>>>>>>> -rw-rw-r-- 1 root root      4827 Nov  1 20:34 mesos-slave.carousel.invalid-user.log.WARNING.20131022-234954.5797
>>>>>>> -rw-rw-r-- 1 root root  10408140 Nov  7 18:44 mesos-slave.carousel.invalid-user.log.INFO.20131022-234823.5797
>>>>>>> -rw-r--r-- 1 root root  53759705 Nov  7 18:45 mesos-slave.err
>>>>>>>
>>>>>>> Is there something else to check? Is it possible the executor is
>>>>>>> failing before it even attempts to fetch URIs?
>>>>>>>
>>>>>>> Ray - Thanks - yeah I found the jenkins logs. I was able to wget the
>>>>>>> slave.jar, and even run it. The mesos-jenkins slaves are dead now, so I
>>>>>>> can't connect to their slave-agent - but the jar does run. Not sure if the
>>>>>>> window for trying to connect to one of the mesos launched slaves is long
>>>>>>> enough to try before it is terminated due to failures. Interestingly, when
>>>>>>> I try to connect to one of the existing slaves I get a 403.
>>>>>>>
>>>>>>> -Whitney
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Nov 7, 2013 at 1:34 PM, Vinod Kone <vinodkone@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hey Whitney,
>>>>>>>>
>>>>>>>> What version of mesos are you using (both in the cluster and the
>>>>>>>> plugin)?
>>>>>>>>
>>>>>>>> The slave should print stuff to console when it is launching
>>>>>>>> executor (e.g., "Fetching resources..."). I don't see that in the gist you
>>>>>>>> pasted. Are you capturing stdout/stderr of the slave?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Nov 7, 2013 at 10:30 AM, Whitney Sorenson <
>>>>>>>> wsorenson@hubspot.com> wrote:
>>>>>>>>
>>>>>>>>> Thanks Ray.
>>>>>>>>>
>>>>>>>>> I have a very similar issue (empty executor directories) - but don't
>>>>>>>>> have any issues curling the slave.jar URI - and I don't have any existing
>>>>>>>>> JNLP process running. I don't have a jenkins user - is that the only setup
>>>>>>>>> you did on the slave?
>>>>>>>>>
>>>>>>>>> -Whitney
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Nov 7, 2013 at 1:16 PM, Ray Rodriguez <
>>>>>>>>> rayrod2030@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Whitney, I would have a look at this github issue where I work
>>>>>>>>>> through some of my jenkins mesos-plugin issues with Vinod.  Might be some
>>>>>>>>>> of the same issues you are seeing.
>>>>>>>>>> https://github.com/jenkinsci/mesos-plugin/issues/2
>>>>>>>>>>
>>>>>>>>>> Ray
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Nov 7, 2013 at 1:07 PM, Whitney Sorenson <
>>>>>>>>>> wsorenson@hubspot.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all!
>>>>>>>>>>>
>>>>>>>>>>> I am trying to get the Jenkins Mesos plugin functioning. I was
>>>>>>>>>>> able to get it installed on our Jenkins master.
>>>>>>>>>>>
>>>>>>>>>>> However, it's unclear if there are any required steps for
>>>>>>>>>>> setting up the slaves. When a framework task is launched, it fails
>>>>>>>>>>> instantly and there are no logs in the runs folder.
>>>>>>>>>>>
>>>>>>>>>>> Here's a gist with relevant logs from the slave:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> https://gist.github.com/wsorenson/b3562c3e4a8992f9a46f/raw/ea5821c442d826456291330452208d8d7ac8418f/failing+jenkins+logs
>>>>>>>>>>>
>>>>>>>>>>> Any help on how to debug? At first, I thought maybe we needed
>>>>>>>>>>> slave.jar or something but it looks like it's trying to fetch that from the
>>>>>>>>>>> master using the URIs. To clarify, I have done no special jenkins related
>>>>>>>>>>> setup (as per readme.md) on any of the slaves.
>>>>>>>>>>>
>>>>>>>>>>> -Whitney
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
