spark-dev mailing list archives

From shane knapp <skn...@berkeley.edu>
Subject Re: amplab jenkins is down
Date Fri, 05 Sep 2014 05:02:55 GMT
yep.  that's exactly the behavior i saw earlier, and will be figuring out
first thing tomorrow morning.  i bet it's an environment issue on the
slaves.


On Thu, Sep 4, 2014 at 7:10 PM, Nicholas Chammas <nicholas.chammas@gmail.com>
wrote:

> Looks like during the last build
> <https://amplab.cs.berkeley.edu/jenkins/view/Pull%20Request%20Builders/job/SparkPullRequestBuilder/19797/console>
> Jenkins was unable to execute a git fetch?
>
>
> On Thu, Sep 4, 2014 at 7:58 PM, shane knapp <sknapp@berkeley.edu> wrote:
>
>> i'm going to restart jenkins and see if that fixes things.
>>
>>
>> On Thu, Sep 4, 2014 at 4:56 PM, shane knapp <sknapp@berkeley.edu> wrote:
>>
>>> looking
>>>
>>>
>>> On Thu, Sep 4, 2014 at 4:21 PM, Nicholas Chammas <nicholas.chammas@gmail.com>
>>> wrote:
>>>
>>>> It appears that our main man is having trouble
>>>> <https://amplab.cs.berkeley.edu/jenkins/view/Pull%20Request%20Builders/job/SparkPullRequestBuilder/>
>>>>  hearing new requests
>>>> <https://github.com/apache/spark/pull/2277#issuecomment-54549106>.
>>>>
>>>> Do we need some smelling salts?
>>>>
>>>>
>>>> On Thu, Sep 4, 2014 at 5:49 PM, shane knapp <sknapp@berkeley.edu>
>>>> wrote:
>>>>
>>>>> i'd ping the Jenkinmensch...  the master was completely offline, so any new
>>>>> jobs wouldn't have reached it.  any jobs that were queued when power was
>>>>> lost probably started up, but jobs that were running would fail.
>>>>>
>>>>>
>>>>> On Thu, Sep 4, 2014 at 2:45 PM, Nicholas Chammas <nicholas.chammas@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Woohoo! Thanks Shane.
>>>>> >
>>>>> > Do you know if queued PR builds will automatically be picked up? Or do we
>>>>> > have to ping the Jenkinmensch manually from each PR?
>>>>> >
>>>>> > Nick
>>>>> >
>>>>> >
>>>>> > On Thu, Sep 4, 2014 at 5:37 PM, shane knapp <sknapp@berkeley.edu> wrote:
>>>>> >
>>>>> >> AND WE'RE UP!
>>>>> >>
>>>>> >> sorry that this took so long...  i'll send out a more detailed explanation
>>>>> >> of what happened soon.
>>>>> >>
>>>>> >> now, off to back up jenkins.
>>>>> >>
>>>>> >> shane
>>>>> >>
>>>>> >>
>>>>> >> On Thu, Sep 4, 2014 at 1:27 PM, shane knapp <sknapp@berkeley.edu> wrote:
>>>>> >>
>>>>> >> > it's a faulty power switch on the firewall, which has been swapped out.
>>>>> >> > we're about to reboot and be good to go.
>>>>> >> >
>>>>> >> >
>>>>> >> > On Thu, Sep 4, 2014 at 1:19 PM, shane knapp <sknapp@berkeley.edu> wrote:
>>>>> >> >
>>>>> >> >> looks like some hardware failed, and we're swapping in a replacement.  i
>>>>> >> >> don't have more specific information yet -- including *what* failed, as our
>>>>> >> >> sysadmin is super busy ATM.  the root cause was an incorrect circuit being
>>>>> >> >> switched off during building maintenance.
>>>>> >> >>
>>>>> >> >> on a side note, this incident will be accelerating our plan to move the
>>>>> >> >> entire jenkins infrastructure into a managed datacenter environment.  this
>>>>> >> >> will be our major push over the next couple of weeks.  more details about
>>>>> >> >> this, also, as soon as i get them.
>>>>> >> >>
>>>>> >> >> i'm very sorry about the downtime, we'll get everything up and running
>>>>> >> >> ASAP.
>>>>> >> >>
>>>>> >> >>
>>>>> >> >> On Thu, Sep 4, 2014 at 12:27 PM, shane knapp <sknapp@berkeley.edu> wrote:
>>>>> >> >>
>>>>> >> >>> looks like a power outage in soda hall.  more updates as they happen.
>>>>> >> >>>
>>>>> >> >>>
>>>>> >> >>> On Thu, Sep 4, 2014 at 12:25 PM, shane knapp <sknapp@berkeley.edu> wrote:
>>>>> >> >>>
>>>>> >> >>>> i am trying to get things up and running, but it looks like either the
>>>>> >> >>>> firewall gateway or jenkins server itself is down.  i'll update as soon as
>>>>> >> >>>> i know more.
>>>>> >> >>>>
>>>>> >> >>>
>>>>> >> >>>
>>>>> >> >>
>>>>> >> >
>>>>> >>
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>
