Mailing-List: contact dev-help@spark.apache.org; run by ezmlm
Precedence: bulk
Received-SPF: unknown (athena.apache.org: error in processing during lookup of
 sknapp@berkeley.edu)
MIME-Version: 1.0
In-Reply-To: 
 <CAOhmDzfR0iiH=HhcfB0wdYf=HLEM1QQDfDMW0JtRZ51vXbb3-g@mail.gmail.com>
References: 
 <CAAOnQ7tG2eKneh3-dEBf7YxQNmhwWU0LmAtzT6gT-+BJe0as7g@mail.gmail.com>
 <CABPQxsvw_hLEgbSwnqTb2fk=nvfv_62Gb2OgHO9rA7LbWjjD8A@mail.gmail.com>
 <3EEE3E33-B165-40ED-891B-13139D544DD7@hortonworks.com>
 <CAOhmDzfR0iiH=HhcfB0wdYf=HLEM1QQDfDMW0JtRZ51vXbb3-g@mail.gmail.com>
From: shane knapp <sknapp@berkeley.edu>
Date: Thu, 2 Apr 2015 08:59:37 -0700
Message-ID: 
 <CACdU-dQGm2RowsrWbfh0F9K1ZYfHE_NnVhJoDLkbJ0VxCHRMLw@mail.gmail.com>
Subject: Re: Unit test logs in Jenkins?
To: Nicholas Chammas <nicholas.chammas@gmail.com>
Cc: Steve Loughran <stevel@hortonworks.com>,
 Apache Spark Dev <dev@spark.apache.org>
Content-Type: multipart/alternative; boundary=089e0158c4586414960512bfece2

--089e0158c4586414960512bfece2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

i agree with all of this.  but can we please break up the tests and make
them shorter?  :)

On Thu, Apr 2, 2015 at 8:54 AM, Nicholas Chammas <nicholas.chammas@gmail.com>
wrote:

> This is secondary to Marcelo's question, but I wanted to comment on this:
>
> Its main limitation is more cultural than technical: you need to get people
> to care about intermittent test runs, otherwise you can end up with
> failures that nobody keeps on top of
>
> This is a problem that plagues Spark as well, but there *is* a technical
> solution.
>
> The solution is simple: *All* the builds that we care about run for *every*
> proposed change. If *any* build fails, the change doesn't make it into the
> repository.
>
> Spark already has a pull request builder that tests and reports back on
> PRs. Committers don't merge in PRs when this builder reports that it failed
> some tests. That's a good thing.
>
> The problem is that there are several other builds that we run on a fixed
> interval, independent of the pull request builder. These builds test
> different configurations, dependency versions, and environments than what
> the PR builder covers. If one of those builds fails, it fails on its own
> little island, with no-one to hear it scream. The build failure is detached
> from the PR that caused it to fail.
>
> What should happen is that the whole matrix of stuff we care to test gets
> run for every PR. No PR goes in if any build we care about fails for that
> PR, and every build we care about runs for every commit of every PR.
>
> Really, this is just an extension of the basic idea of the PR builder. It
> doesn't make much sense to test stuff *after* it has been committed and
> potentially broken things. And it becomes exponentially more difficult to
> find and fix a problem the longer it has been festering in the repo. It's
> best to keep such problems out in the first place.
>
> With some more work on our CI infrastructure, I think this can be done.
> Maybe even later this year.
>
> Nick
>
> On Thu, Apr 2, 2015 at 6:02 AM Steve Loughran <stevel@hortonworks.com> wrote:
>
>
> > > On 2 Apr 2015, at 06:31, Patrick Wendell <pwendell@gmail.com> wrote:
> > >
> > > Hey Marcelo,
> > >
> > > Great question. Right now, some of the more active developers have an
> > > account that allows them to log into this cluster to inspect logs (we
> > > copy the logs from each run to a node on that cluster). The
> > > infrastructure is maintained by the AMPLab.
> > >
> > > I will put you in touch with someone there who can get you an account.
> > >
> > > This is a short term solution. The longer term solution is to have
> > > these scp'd regularly to an S3 bucket or somewhere people can get
> > > access to them, but that's not ready yet.
> > >
> > > - Patrick
> > >
> > >>
> >
> >
> > ASF Jenkins is always there to play with; committers/PMC members should
> > just need to file a BUILD JIRA to get access.
> >
> > Its main limitation is more cultural than technical: you need to get
> > people to care about intermittent test runs, otherwise you can end up with
> > failures that nobody keeps on top of
> > https://builds.apache.org/view/H-L/view/Hadoop/
> >
> > Someone really needs to own the "keep the builds working" problem -and
> > have the ability to somehow kick others into fixing things. The latter is
> > pretty hard cross-organisation
> >
> >
> > >> That would be really helpful to debug build failures. The scalatest
> > >> output isn't all that helpful.
> > >>
> >
> > Potentially an issue with the test runner, rather than the tests
> > themselves.
> >
> >
>

--089e0158c4586414960512bfece2--