hbase-dev mailing list archives

From Sean Busbey <bus...@apache.org>
Subject Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)
Date Tue, 09 Aug 2016 04:43:13 GMT
I updated all of our jobs to use the updated JDK versions from infra.
These have spaces in the names, and those names end up in our
workspace path, so keep an eye out for failures caused by spaces in paths.
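
To illustrate why spaces in a tool name matter, here is a minimal Python sketch. The directory name below is hypothetical (the real names come from infra and are not listed in this thread). It shows how a command assembled as one string falls apart on whitespace, and two ways to keep the path intact.

```python
import shlex

# Hypothetical JDK location containing spaces; not taken from the thread.
java_home = "/tools/java/JDK 1.7 (latest)"

# Naive string assembly: splitting on whitespace cuts the path apart.
naive = (java_home + "/bin/java -version").split()

# Safer: keep each argument as its own list element
# (this is the form subprocess.run() accepts without a shell).
argv = [java_home + "/bin/java", "-version"]

# Or quote the path before embedding it in a shell command line.
shell_line = shlex.quote(java_home + "/bin/java") + " -version"

print(len(naive))                        # number of fragments after naive split
print(len(argv))                         # number of real arguments
print(shlex.split(shell_line) == argv)   # quoting round-trips to the same argv
```

Any script in a job that builds paths by plain string concatenation is a candidate for this kind of breakage.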



On Mon, Aug 8, 2016 at 10:42 AM, Sean Busbey <busbey@cloudera.com> wrote:
> running in docker is the default now. relying on the default docker
> image that comes with Yetus means that our protoc checks are
> failing[1].
>
>
> [1]: https://issues.apache.org/jira/browse/HBASE-16373
>
> On Sat, Aug 6, 2016 at 5:03 PM, Sean Busbey <busbey@apache.org> wrote:
>> Hi folks!
>>
>> This morning I merged the patch that updates us to Yetus 0.3.0[1] and updated the
>> precommit job appropriately. I also changed it to use one of the Java versions made
>> available after the puppet changes to the ASF build infrastructure.
>>
>> The last three builds look normal (#2975 - #2977). I'm gonna try running things in
>> docker next. I'll email again when I make it the default.
>>
>> [1]: https://issues.apache.org/jira/browse/HBASE-15882
>>
>> On 2016-06-16 10:43 (-0500), Sean Busbey <busbey@apache.org> wrote:
>>> FYI, today our precommit jobs started failing because our chosen jdk
>>> (1.7.0.79) disappeared (mentioned on HBASE-16032).
>>>
>>> Initially we were doing something wrong, namely directly referencing
>>> the jenkins build tools area without telling jenkins to give us an env
>>> variable that stated where the jdk is located. However, after
>>> attempting to switch to the appropriate tooling variable for jdk
>>> 1.7.0.79, I found that it didn't point to a place that worked.
>>>
>>> I've now updated the job to rely on the latest 1.7 jdk, which is
>>> currently 1.7.0.80. I don't know how often "latest" updates.
>>>
>>> Personally, I think this is a sign that we need to prioritize
>>> HBASE-15882 so that we can switch back to using Docker. I won't have
>>> time this week, so if anyone else does please pick up the ticket.
>>>
>>> On Thu, Mar 17, 2016 at 5:19 PM, Stack <stack@duboce.net> wrote:
>>> > Thanks Sean.
>>> > St.Ack
>>> >
>>> > On Wed, Mar 16, 2016 at 12:04 PM, Sean Busbey <busbey@cloudera.com> wrote:
>>> >
>>> >> FYI, I updated the precommit job today to specify that only compile time
>>> >> checks should be done against jdks other than the primary jdk7 instance.
>>> >>
>>> >> On Mon, Mar 7, 2016 at 8:43 PM, Sean Busbey <busbey@cloudera.com> wrote:
>>> >>
>>> >> > I tested things out, and while YETUS-297[1] is present the default runs
>>> >> > all plugins that can do multiple jdks against those available (jdk7 and
>>> >> > jdk8 in our case).
>>> >> >
>>> >> > We can configure things to only do a single run of unit tests. They'll be
>>> >> > against jdk7, since that is our default jdk. That fine by everyone? It'll
>>> >> > save ~1.5 hours on any build that hits hbase-server.
>>> >> >
>>> >> > On Mon, Mar 7, 2016 at 1:22 PM, Stack <stack@duboce.net> wrote:
>>> >> >
>>> >> >> Hurray!
>>> >> >>
>>> >> >> It looks like YETUS-96 is in there and we are only running on jdk build
>>> >> >> now, the default (but testing compile against both).... Will keep an eye.
>>> >> >>
>>> >> >> St.Ack
>>> >> >>
>>> >> >>
>>> >> >> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey <busbey@cloudera.com> wrote:
>>> >> >>
>>> >> >> > FYI, I've just updated our precommit jobs to use the 0.2.0 release of Yetus
>>> >> >> > that came out today.
>>> >> >> >
>>> >> >> > After keeping an eye out for strangeness today I'll turn docker mode back
>>> >> >> > on by default tonight.
>>> >> >> >
>>> >> >> > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey <busbey@apache.org> wrote:
>>> >> >> >
>>> >> >> > > FYI, I added a new parameter to the precommit job:
>>> >> >> > >
>>> >> >> > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the apache/yetus
>>> >> >> > > repo rather than our chosen release
>>> >> >> > >
>>> >> >> > > It defaults to inactive, but can be used in manually-triggered runs to
>>> >> >> > > test a solution to a problem in the yetus library. At the moment, I'm
>>> >> >> > > using it to test a solution to default module ordering as seen in
>>> >> >> > > HBASE-15075.
>>> >> >> > >
>>> >> >> > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey <busbey@cloudera.com> wrote:
>>> >> >> > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit tests)
>>> >> >> > > > and updated our jenkins precommit build to use it.
>>> >> >> > > >
>>> >> >> > > > Jenkins job has some explanation:
>>> >> >> > > >
>>> >> >> > > > https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
>>> >> >> > > >
>>> >> >> > > > Release note from HBASE-13525 does as well.
>>> >> >> > > >
>>> >> >> > > > The old job will stick around here for a couple of weeks, in case we need
>>> >> >> > > > to refer back to it:
>>> >> >> > > >
>>> >> >> > > > https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
>>> >> >> > > >
>>> >> >> > > > If something looks awry, please drop a note on HBASE-13525 while it remains
>>> >> >> > > > open (and make a new issue after).
>>> >> >> > > >
>>> >> >> > > >
>>> >> >> > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack <stack@duboce.net> wrote:
>>> >> >> > > >
>>> >> >> > > >> As part of my continuing advocacy of builds.apache.org and that their
>>> >> >> > > >> results are now worthy of our trust and nurture, here are some highlights
>>> >> >> > > >> from the last few days of builds:
>>> >> >> > > >>
>>> >> >> > > >> + hadoopqa is now finding zombies before the patch is committed.
>>> >> >> > > >> HBASE-14888 showed "-1 core tests. The patch failed these unit tests:" but
>>> >> >> > > >> didn't have any failed tests listed (I'm trying to see if I can do anything
>>> >> >> > > >> about this...). Running our little ./dev-tools/findHangingTests.py against
>>> >> >> > > >> the consoleText, it showed a hanging test. Running locally, I see same
>>> >> >> > > >> hang. This is before the patch landed.
>>> >> >> > > >> + Our branch runs are now near totally zombie and flakey free -- still some
>>> >> >> > > >> work to do -- but a recent patch that seemed harmless was causing a
>>> >> >> > > >> reliable flake fail in the backport to branch-1* confirmed by local runs.
>>> >> >> > > >> The flakeyness was plain to see up in builds.apache.org.
>>> >> >> > > >> + In the last few days I've committed a patch that included javadoc
>>> >> >> > > >> warnings even though hadoopqa said the patch introduced javadoc issues (I
>>> >> >> > > >> missed it). This messed up life for folks subsequently as their patches now
>>> >> >> > > >> reported javadoc issues....
>>> >> >> > > >>
>>> >> >> > > >> In short, I suggest that builds.apache.org is worth keeping an eye on,
>>> >> >> > > >> make sure you get a clean build out of hadoopqa before committing anything,
>>> >> >> > > >> and let's all work together to try and keep our builds blue: it'll save us
>>> >> >> > > >> all work in the long run.
>>> >> >> > > >>
>>> >> >> > > >> St.Ack
>>> >> >> > > >>
>>> >> >> > > >>
>>> >> >> > > >> On Tue, Nov 4, 2014 at 9:38 AM, Stack <stack@duboce.net> wrote:
>>> >> >> > > >>
>>> >> >> > > >> > Branch-1 and master have stabilized and now run mostly blue (give or take
>>> >> >> > > >> > the odd failure) [1][2]. Having a mostly blue branch-1 has helped us
>>> >> >> > > >> > identify at least one destabilizing commit in the last few days, maybe two;
>>> >> >> > > >> > this is as it should be (smile).
>>> >> >> > > >> >
>>> >> >> > > >> > Let's keep our builds blue. If you commit a patch, make sure subsequent
>>> >> >> > > >> > builds stay blue. You can subscribe to builds@hbase.apache.org to get
>>> >> >> > > >> > notice of failures if not already subscribed.
>>> >> >> > > >> >
>>> >> >> > > >> > Thanks,
>>> >> >> > > >> > St.Ack
>>> >> >> > > >> >
>>> >> >> > > >> > 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
>>> >> >> > > >> > 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
>>> >> >> > > >> >
>>> >> >> > > >> >
>>> >> >> > > >> > On Mon, Oct 13, 2014 at 4:41 PM, Stack <stack@duboce.net> wrote:
>>> >> >> > > >> >
>>> >> >> > > >> >> A few notes on testing.
>>> >> >> > > >> >>
>>> >> >> > > >> >> Too long to read: infra is more capable now and after some work, we are
>>> >> >> > > >> >> seeing branch-1 and trunk mostly running blue. Let's try and keep it this
>>> >> >> > > >> >> way going forward.
>>> >> >> > > >> >>
>>> >> >> > > >> >> Apache Infra has new, more capable hardware.
>>> >> >> > > >> >>
>>> >> >> > > >> >> A recent spurt of test fixing combined with more capable hardware seems
>>> >> >> > > >> >> to have gotten us to a new place; tests are mostly passing now on branch-1
>>> >> >> > > >> >> and master.  Let's try and keep it this way and start to trust our test runs
>>> >> >> > > >> >> again.  Just a few flakies remain.  Let's try and nail them.
>>> >> >> > > >> >>
>>> >> >> > > >> >> Our tests now run in parallel with other test suites where previously we
>>> >> >> > > >> >> ran alone. You can see this sometimes when our zombie detector reports
>>> >> >> > > >> >> tests from another project altogether as lingerers (To be fixed). Some of
>>> >> >> > > >> >> our tests are failing because a concurrent hbase run is undoing classes and
>>> >> >> > > >> >> data from under it. Also, let's fix.
>>> >> >> > > >> >>
>>> >> >> > > >> >> Our tests are brittle. It takes 75 minutes for them to complete. Many are
>>> >> >> > > >> >> heavy-duty integration tests starting up multiple clusters and mapreduce
>>> >> >> > > >> >> all in the one JVM. It is a miracle they pass at all.  Usually integration
>>> >> >> > > >> >> tests have been cast as unit tests because there was nowhere else for them
>>> >> >> > > >> >> to get an airing.  We have the hbase-it suite now which would be a more apt
>>> >> >> > > >> >> place but until these are run on a regular basis in public for all to see,
>>> >> >> > > >> >> the fat integration tests disguised as unit tests will remain. A review of
>>> >> >> > > >> >> our current unit tests weeding the old cruft and the no longer relevant or
>>> >> >> > > >> >> duplicates would be a nice undertaking if someone is looking to contribute.
>>> >> >> > > >> >>
>>> >> >> > > >> >> Alex Newman has been working on making our tests work up on travis and
>>> >> >> > > >> >> circle-ci.  That'll be sweet when it goes end-to-end.  He also added in
>>> >> >> > > >> >> some "type" categorizations -- client, filter, mapreduce -- alongside our
>>> >> >> > > >> >> old "sizing" categorizations of small/medium/large.  His thinking is that
>>> >> >> > > >> >> we can run these categorizations in parallel so we could run the total
>>> >> >> > > >> >> suite in about the time of the longest test, say 20-30 minutes? We could
>>> >> >> > > >> >> even change Apache to run them this way.
>>> >> >> > > >> >>
>>> >> >> > > >> >> FYI,
>>> >> >> > > >> >> St.Ack
>>> >> >> > > >> >>
>>> >> >> > > >> >>
>>> >> >> > > >> >>
>>> >> >> > > >> >>
>>> >> >> > > >> >>
>>> >> >> > > >> >>
>>> >> >> > > >> >>
>>> >> >> > > >> >
>>> >> >> > > >>
>>> >> >> > > >
>>> >> >> > > >
>>> >> >> > > >
>>> >> >> > > > --
>>> >> >> > > > Sean
>>> >> >> > >
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> > --
>>> >> >> > busbey
>>> >> >> >
>>> >> >>
>>> >> >
>>> >> >
>>> >> >
>>> >> > --
>>> >> > busbey
>>> >> >
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> busbey
>>> >>
>>>
>
>
>
> --
> busbey
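
For readers who want a feel for the hanging-test check mentioned in the Dec 2, 2015 message quoted above, here is a rough Python sketch. It is not the project's ./dev-tools/findHangingTests.py. It assumes Surefire-style console lines ("Running <class>" when a test class starts, "Tests run: ... - in <class>" when it reports), and the class names in the sample are made up.

```python
import re

# Assumed Surefire-style console lines (an assumption, not from the thread):
#   "Running org.example.TestFoo"
#   "Tests run: 3, Failures: 0, ... - in org.example.TestFoo"
STARTED = re.compile(r"^Running (\S+)$")
FINISHED = re.compile(r"^Tests run:.*- in (\S+)$")


def find_hanging_tests(console_text):
    """Return test classes that started but never reported a result."""
    started, finished = [], set()
    for line in console_text.splitlines():
        line = line.strip()
        match = STARTED.match(line)
        if match:
            started.append(match.group(1))
            continue
        match = FINISHED.match(line)
        if match:
            finished.add(match.group(1))
    # Preserve start order so the report reads like the log.
    return [name for name in started if name not in finished]


# Made-up sample log: TestBar starts but never reports.
sample = """\
Running org.example.TestFoo
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.2 sec - in org.example.TestFoo
Running org.example.TestBar
"""
print(find_hanging_tests(sample))
```

A class that shows up as started but never finished is only a candidate; as the thread notes, confirming the hang with a local run is the next step.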
