hbase-dev mailing list archives

From Mikhail Antonov <olorinb...@gmail.com>
Subject Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)
Date Tue, 13 Sep 2016 00:01:11 GMT
Great work indeed!

Agreed, occasional failed runs may not be that bad, but fairly regular
failed runs ruin the idea of CI, especially for released or otherwise
supposedly stable branches.

-Mikhail

On Mon, Sep 12, 2016 at 4:53 PM, Sean Busbey <busbey@cloudera.com> wrote:

> awesome work Appy!
>
> That's certainly good news to hear.
>
> On Mon, Sep 12, 2016 at 2:14 PM, Apekshit Sharma <appy@cloudera.com>
> wrote:
> > On a separate note:
> > Trunk had 8 green runs in the last 3 days! (
> > https://builds.apache.org/job/HBase-Trunk_matrix/)
> > This was due to fixing just the mass failures on trunk, with no change in
> > flaky infra, which led me to conclude two things:
> > 1. Flaky infra works.
> > 2. It relies heavily on the post-commit build's stability (which every
> > project should anyways strive for). If the build fails catastrophically
> > once in a while, we can just exclude that one run using a flag and
> > everything will work, but if it happens frequently, then it won't work
> > right.
> >
> > I have re-enabled the Flaky tests job (
> > https://builds.apache.org/view/H-L/view/HBase/job/HBASE-Flaky-Tests/),
> > which was disabled for almost a month due to trunk being on fire.
> > I will keep an eye on how things are going.
> >
> >
> > On Mon, Sep 12, 2016 at 2:02 PM, Apekshit Sharma <appy@cloudera.com>
> wrote:
> >
> >> @Sean, Mikhail: I found an alternate solution: using a user-defined axis,
> >> tool environment, and env variable injection.
> >> See the latest diff to the
> >> https://builds.apache.org/job/HBase-Trunk_matrix/ job for reference.
> >>
> >>
> >> On Tue, Aug 30, 2016 at 7:39 PM, Mikhail Antonov <olorinbant@gmail.com>
> >> wrote:
> >>
> >>> FYI, I did the same for branch-1.3 builds.  I've disabled hbase-1.3 and
> >>> hbase-1.3-IT jobs and instead created
> >>>
> >>> https://builds.apache.org/job/HBase-1.3-JDK8 and
> >>> https://builds.apache.org/job/HBase-1.3-JDK7
> >>>
> >>> This should work for now until we figure out how to move forward.
> >>>
> >>> Thanks,
> >>> Mikhail
> >>>
> >>> On Wed, Aug 17, 2016 at 1:41 PM, Sean Busbey <busbey@cloudera.com>
> wrote:
> >>>
> >>> > /me smacks forehead
> >>> >
> >>> > these replacement jobs, of course, also have special characters in
> >>> > their names which then show up in the working path.
> >>> >
> >>> > renaming them to skip spaces and parens.
> >>> >
> >>> > On Wed, Aug 17, 2016 at 1:34 PM, Sean Busbey <sean.busbey@gmail.com>
> >>> > wrote:
> >>> > > FYI, it looks like essentially our entire CI suite is red, probably due
> >>> > > to parts of our codebase not tolerating spaces or other special
> >>> > > characters in the working directory.
> >>> > >
> >>> > > I've made a stop-gap non-multi-configuration set of jobs for running
> >>> > > unit tests for the 1.2 branch against JDK 7 and JDK 8:
> >>> > >
> >>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%201.2%20(JDK%201.7)/
> >>> > >
> >>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%201.2%20(JDK%201.8)/
> >>> > >
> >>> > > Due to the lack of response from infra@ I suspect our only options for
> >>> > > continuing on ASF infra are to fix whatever part of our build doesn't
> >>> > > tolerate the new paths, or stop using multiconfiguration deployments. I
> >>> > > am obviously less than thrilled at the idea of having several multiples
> >>> > > of current jobs.
> >>> > >
> >>> > >
> >>> > > On Wed, Aug 10, 2016 at 6:28 PM, Sean Busbey <busbey@cloudera.com>
> >>> > wrote:
> >>> > >
> >>> > >> Ugh.
> >>> > >>
> >>> > >> I sent a reply to Gav on builds@ about maybe getting names that don't
> >>> > >> have spaces in them:
> >>> > >>
> >>> > >> https://lists.apache.org/thread.html/8ac03dc62f9d6862d4f3d5eb37119c9c73b4059aaa3ebba52fc63bb6@%3Cbuilds.apache.org%3E
> >>> > >>
> >>> > >> In the meantime, is this an issue we need to file with Hadoop or
> >>> > >> something we need to fix in our own code?
> >>> > >>
> >>> > >> On Wed, Aug 10, 2016 at 6:04 PM, Matteo Bertozzi
> >>> > >> <theo.bertozzi@gmail.com> wrote:
> >>> > >> > There are a bunch of builds that have most of the tests failing.
> >>> > >> >
> >>> > >> > Example:
> >>> > >> > https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=JDK%201.7%20(latest),label=yahoo-not-h2/testReport/junit/org.apache.hadoop.hbase/TestLocalHBaseCluster/testLocalHBaseCluster/
> >>> > >> >
> >>> > >> > From the stack trace, it looks like the problem is with the JDK name,
> >>> > >> > which has spaces: the Hadoop FsVolumeImpl calls
> >>> > >> > setNameFormat(... + fileName.toString() + ...) and this seems not to be
> >>> > >> > escaped, so we end up with JDK%25201.7%2520(latest) in the format string
> >>> > >> > and we get an IllegalFormatPrecisionException: 7
> >>> > >> >
> >>> > >> > 2016-08-10 22:07:46,108 WARN  [DataNode:
> >>> > >> > [[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data1/,
> >>> > >> > [DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data2/]]
> >>> > >> > heartbeating to localhost/127.0.0.1:34629]
> >>> > >> > datanode.BPServiceActor(831): Unexpected exception in block pool Block
> >>> > >> > pool <registering> (Datanode Uuid unassigned) service to
> >>> > >> > localhost/127.0.0.1:34629
> >>> > >> > java.util.IllegalFormatPrecisionException: 7
> >>> > >> >         at java.util.Formatter$FormatSpecifier.checkText(Formatter.java:2984)
> >>> > >> >         at java.util.Formatter$FormatSpecifier.<init>(Formatter.java:2688)
> >>> > >> >         at java.util.Formatter.parse(Formatter.java:2528)
> >>> > >> >         at java.util.Formatter.format(Formatter.java:2469)
> >>> > >> >         at java.util.Formatter.format(Formatter.java:2423)
> >>> > >> >         at java.lang.String.format(String.java:2792)
> >>> > >> >         at com.google.common.util.concurrent.ThreadFactoryBuilder.setNameFormat(ThreadFactoryBuilder.java:68)
> >>> > >> >         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.initializeCacheExecutor(FsVolumeImpl.java:140)
> >>> > >> >
> >>> > >> >
> >>> > >> >
> >>> > >> > Matteo
> >>> > >> >
> >>> > >> >
> >>> > >> > On Tue, Aug 9, 2016 at 9:55 AM, Stack <stack@duboce.net> wrote:
> >>> > >> >
> >>> > >> >> Good on you Sean.
> >>> > >> >> S
> >>> > >> >>
> >>> > >> >> On Mon, Aug 8, 2016 at 9:43 PM, Sean Busbey <busbey@apache.org
> >
> >>> > wrote:
> >>> > >> >>
> >>> > >> >> > I updated all of our jobs to use the updated JDK versions from infra.
> >>> > >> >> > These have spaces in the names, and those names end up in our
> >>> > >> >> > workspace path, so try to keep an eye out.
> >>> > >> >> >
> >>> > >> >> >
> >>> > >> >> >
> >>> > >> >> > On Mon, Aug 8, 2016 at 10:42 AM, Sean Busbey <
> >>> busbey@cloudera.com>
> >>> > >> >> wrote:
> >>> > >> >> > > Running in docker is the default now. Relying on the default docker
> >>> > >> >> > > image that comes with Yetus means that our protoc checks are
> >>> > >> >> > > failing[1].
> >>> > >> >> > >
> >>> > >> >> > >
> >>> > >> >> > > [1]: https://issues.apache.org/jira/browse/HBASE-16373
> >>> > >> >> > >
> >>> > >> >> > > On Sat, Aug 6, 2016 at 5:03 PM, Sean Busbey <
> >>> busbey@apache.org>
> >>> > >> wrote:
> >>> > >> >> > >> Hi folks!
> >>> > >> >> > >>
> >>> > >> >> > >> This morning I merged the patch that updates us to Yetus 0.3.0[1] and
> >>> > >> >> > >> updated the precommit job appropriately. I also changed it to use one of
> >>> > >> >> > >> the Java versions post the puppet changes to asf build.
> >>> > >> >> > >>
> >>> > >> >> > >> The last three builds look normal (#2975 - #2977). I'm gonna try running
> >>> > >> >> > >> things in docker next. I'll email again when I make it the default.
> >>> > >> >> > >>
> >>> > >> >> > >> [1]: https://issues.apache.org/jira/browse/HBASE-15882
> >>> > >> >> > >>
> >>> > >> >> > >> On 2016-06-16 10:43 (-0500), Sean Busbey <
> busbey@apache.org>
> >>> > >> wrote:
> >>> > >> >> > >>> FYI, today our precommit jobs started failing because our chosen jdk
> >>> > >> >> > >>> (1.7.0.79) disappeared (mentioned on HBASE-16032).
> >>> > >> >> > >>>
> >>> > >> >> > >>> Initially we were doing something wrong, namely directly referencing
> >>> > >> >> > >>> the jenkins build tools area without telling jenkins to give us an env
> >>> > >> >> > >>> variable that stated where the jdk is located. However, after
> >>> > >> >> > >>> attempting to switch to the appropriate tooling variable for jdk
> >>> > >> >> > >>> 1.7.0.79, I found that it didn't point to a place that worked.
> >>> > >> >> > >>>
> >>> > >> >> > >>> I've now updated the job to rely on the latest 1.7 jdk, which is
> >>> > >> >> > >>> currently 1.7.0.80. I don't know how often "latest" updates.
> >>> > >> >> > >>>
> >>> > >> >> > >>> Personally, I think this is a sign that we need to prioritize
> >>> > >> >> > >>> HBASE-15882 so that we can switch back to using Docker. I won't have
> >>> > >> >> > >>> time this week, so if anyone else does please pick up the ticket.
> >>> > >> >> > >>>
> >>> > >> >> > >>> On Thu, Mar 17, 2016 at 5:19 PM, Stack <stack@duboce.net
> >
> >>> > wrote:
> >>> > >> >> > >>> > Thanks Sean.
> >>> > >> >> > >>> > St.Ack
> >>> > >> >> > >>> >
> >>> > >> >> > >>> > On Wed, Mar 16, 2016 at 12:04 PM, Sean Busbey <
> >>> > >> busbey@cloudera.com
> >>> > >> >> >
> >>> > >> >> > wrote:
> >>> > >> >> > >>> >
> >>> > >> >> > >>> >> FYI, I updated the precommit job today to specify that only compile
> >>> > >> >> > >>> >> time checks should be done against jdks other than the primary jdk7
> >>> > >> >> > >>> >> instance.
> >>> > >> >> > >>> >>
> >>> > >> >> > >>> >> On Mon, Mar 7, 2016 at 8:43 PM, Sean Busbey <
> >>> > >> busbey@cloudera.com>
> >>> > >> >> > wrote:
> >>> > >> >> > >>> >>
> >>> > >> >> > >>> >> > I tested things out, and while YETUS-297[1] is present the default runs
> >>> > >> >> > >>> >> > all plugins that can do multiple jdks against those available (jdk7 and
> >>> > >> >> > >>> >> > jdk8 in our case).
> >>> > >> >> > >>> >> >
> >>> > >> >> > >>> >> > We can configure things to only do a single run of unit tests. They'll be
> >>> > >> >> > >>> >> > against jdk7, since that is our default jdk. That fine by everyone? It'll
> >>> > >> >> > >>> >> > save ~1.5 hours on any build that hits hbase-server.
> >>> > >> >> > >>> >> >
> >>> > >> >> > >>> >> > On Mon, Mar 7, 2016 at 1:22 PM, Stack <
> >>> stack@duboce.net>
> >>> > >> wrote:
> >>> > >> >> > >>> >> >
> >>> > >> >> > >>> >> >> Hurray!
> >>> > >> >> > >>> >> >>
> >>> > >> >> > >>> >> >> It looks like YETUS-96 is in there and we are only running on jdk build
> >>> > >> >> > >>> >> >> now, the default (but testing compile against both).... Will keep an
> >>> > >> >> > >>> >> >> eye.
> >>> > >> >> > >>> >> >>
> >>> > >> >> > >>> >> >> St.Ack
> >>> > >> >> > >>> >> >>
> >>> > >> >> > >>> >> >>
> >>> > >> >> > >>> >> >> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey <
> >>> > >> >> > busbey@cloudera.com>
> >>> > >> >> > >>> >> wrote:
> >>> > >> >> > >>> >> >>
> >>> > >> >> > >>> >> >> > FYI, I've just updated our precommit jobs to use the 0.2.0 release of
> >>> > >> >> > >>> >> >> > Yetus that came out today.
> >>> > >> >> > >>> >> >> >
> >>> > >> >> > >>> >> >> > After keeping an eye out for strangeness today I'll turn docker mode
> >>> > >> >> > >>> >> >> > back on by default tonight.
> >>> > >> >> > >>> >> >> >
> >>> > >> >> > >>> >> >> > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey <
> >>> > >> >> > busbey@apache.org>
> >>> > >> >> > >>> >> >> wrote:
> >>> > >> >> > >>> >> >> >
> >>> > >> >> > >>> >> >> > > FYI, I added a new parameter to the precommit job:
> >>> > >> >> > >>> >> >> > >
> >>> > >> >> > >>> >> >> > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the apache/yetus
> >>> > >> >> > >>> >> >> > > repo rather than our chosen release
> >>> > >> >> > >>> >> >> > >
> >>> > >> >> > >>> >> >> > > It defaults to inactive, but can be used in manually-triggered runs to
> >>> > >> >> > >>> >> >> > > test a solution to a problem in the yetus library. At the moment, I'm
> >>> > >> >> > >>> >> >> > > using it to test a solution to default module ordering as seen in
> >>> > >> >> > >>> >> >> > > HBASE-15075.
> >>> > >> >> > >>> >> >> > >
> >>> > >> >> > >>> >> >> > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey <
> >>> > >> >> > busbey@cloudera.com>
> >>> > >> >> > >>> >> >> wrote:
> >>> > >> >> > >>> >> >> > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit
> >>> > >> >> > >>> >> >> > > > tests) and updated our jenkins precommit build to use it.
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > Jenkins job has some explanation:
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > Release note from HBASE-13525 does as well.
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > The old job will stick around here for a couple of weeks, in case we
> >>> > >> >> > >>> >> >> > > > need to refer back to it:
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > If something looks awry, please drop a note on HBASE-13525 while it
> >>> > >> >> > >>> >> >> > > > remains open (and make a new issue after).
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack <
> >>> > >> stack@duboce.net>
> >>> > >> >> > wrote:
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > >> As part of my continuing advocacy of builds.apache.org and that their
> >>> > >> >> > >>> >> >> > > >> results are now worthy of our trust and nurture, here are some
> >>> > >> >> > >>> >> >> > > >> highlights from the last few days of builds:
> >>> > >> >> > >>> >> >> > > >>
> >>> > >> >> > >>> >> >> > > >> + hadoopqa is now finding zombies before the patch is committed.
> >>> > >> >> > >>> >> >> > > >> HBASE-14888 showed "-1 core tests. The patch failed these unit tests:"
> >>> > >> >> > >>> >> >> > > >> but didn't have any failed tests listed (I'm trying to see if I can do
> >>> > >> >> > >>> >> >> > > >> anything about this...). Running our little
> >>> > >> >> > >>> >> >> > > >> ./dev-tools/findHangingTests.py against the consoleText, it showed a
> >>> > >> >> > >>> >> >> > > >> hanging test. Running locally, I see the same hang. This is before the
> >>> > >> >> > >>> >> >> > > >> patch landed.
> >>> > >> >> > >>> >> >> > > >> + Our branch runs are now near totally zombie and flakey free -- still
> >>> > >> >> > >>> >> >> > > >> some work to do -- but a recent patch that seemed harmless was causing a
> >>> > >> >> > >>> >> >> > > >> reliable flake fail in the backport to branch-1* confirmed by local
> >>> > >> >> > >>> >> >> > > >> runs. The flakeyness was plain to see up in builds.apache.org.
> >>> > >> >> > >>> >> >> > > >> + In the last few days I've committed a patch that included javadoc
> >>> > >> >> > >>> >> >> > > >> warnings even though hadoopqa said the patch introduced javadoc issues
> >>> > >> >> > >>> >> >> > > >> (I missed it). This messed up life for folks subsequently as their
> >>> > >> >> > >>> >> >> > > >> patches now reported javadoc issues....
> >>> > >> >> > >>> >> >> > > >>
> >>> > >> >> > >>> >> >> > > >> In short, I suggest that builds.apache.org is worth keeping an eye on,
> >>> > >> >> > >>> >> >> > > >> make sure you get a clean build out of hadoopqa before committing
> >>> > >> >> > >>> >> >> > > >> anything, and let's all work together to try and keep our builds blue:
> >>> > >> >> > >>> >> >> > > >> it'll save us all work in the long run.
> >>> > >> >> > >>> >> >> > > >>
> >>> > >> >> > >>> >> >> > > >> St.Ack
> >>> > >> >> > >>> >> >> > > >>
> >>> > >> >> > >>> >> >> > > >>
> >>> > >> >> > >>> >> >> > > >> On Tue, Nov 4, 2014 at 9:38 AM, Stack <
> >>> > >> stack@duboce.net
> >>> > >> >> >
> >>> > >> >> > wrote:
> >>> > >> >> > >>> >> >> > > >>
> >>> > >> >> > >>> >> >> > > >> > Branch-1 and master have stabilized and now run mostly blue (give or
> >>> > >> >> > >>> >> >> > > >> > take the odd failure) [1][2]. Having a mostly blue branch-1 has helped
> >>> > >> >> > >>> >> >> > > >> > us identify at least one destabilizing commit in the last few days,
> >>> > >> >> > >>> >> >> > > >> > maybe two; this is as it should be (smile).
> >>> > >> >> > >>> >> >> > > >> >
> >>> > >> >> > >>> >> >> > > >> > Let's keep our builds blue. If you commit a patch, make sure subsequent
> >>> > >> >> > >>> >> >> > > >> > builds stay blue. You can subscribe to builds@hbase.apache.org to get
> >>> > >> >> > >>> >> >> > > >> > notice of failures if not already subscribed.
> >>> > >> >> > >>> >> >> > > >> >
> >>> > >> >> > >>> >> >> > > >> > Thanks,
> >>> > >> >> > >>> >> >> > > >> > St.Ack
> >>> > >> >> > >>> >> >> > > >> >
> >>> > >> >> > >>> >> >> > > >> > 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
> >>> > >> >> > >>> >> >> > > >> > 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
> >>> > >> >> > >>> >> >> > > >> >
> >>> > >> >> > >>> >> >> > > >> >
> >>> > >> >> > >>> >> >> > > >> > On Mon, Oct 13, 2014 at 4:41 PM, Stack <
> >>> > >> >> > stack@duboce.net>
> >>> > >> >> > >>> >> wrote:
> >>> > >> >> > >>> >> >> > > >> >
> >>> > >> >> > >>> >> >> > > >> >> A few notes on testing.
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >> Too long to read, infra is more capable now and after some work, we are
> >>> > >> >> > >>> >> >> > > >> >> seeing branch-1 and trunk mostly running blue. Let's try and keep it
> >>> > >> >> > >>> >> >> > > >> >> this way going forward.
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >> Apache Infra has new, more capable hardware.
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >> A recent spurt of test fixing combined with more capable hardware seems
> >>> > >> >> > >>> >> >> > > >> >> to have gotten us to a new place; tests are mostly passing now on
> >>> > >> >> > >>> >> >> > > >> >> branch-1 and master.  Let's try and keep it this way and start to trust
> >>> > >> >> > >>> >> >> > > >> >> our test runs again.  Just a few flakies remain.  Let's try and nail
> >>> > >> >> > >>> >> >> > > >> >> them.
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >> Our tests now run in parallel with other test suites where previously we
> >>> > >> >> > >>> >> >> > > >> >> ran alone. You can see this sometimes when our zombie detector reports
> >>> > >> >> > >>> >> >> > > >> >> tests from another project altogether as lingerers (To be fixed). Some
> >>> > >> >> > >>> >> >> > > >> >> of our tests are failing because a concurrent hbase run is undoing
> >>> > >> >> > >>> >> >> > > >> >> classes and data from under it. Also, let's fix.
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >> Our tests are brittle. It takes 75 minutes for them to complete. Many
> >>> > >> >> > >>> >> >> > > >> >> are heavy-duty integration tests starting up multiple clusters and
> >>> > >> >> > >>> >> >> > > >> >> mapreduce all in the one JVM. It is a miracle they pass at all. Usually
> >>> > >> >> > >>> >> >> > > >> >> integration tests have been cast as unit tests because there was
> >>> > >> >> > >>> >> >> > > >> >> nowhere else for them to get an airing.  We have the hbase-it suite now
> >>> > >> >> > >>> >> >> > > >> >> which would be a more apt place but until these are run on a regular
> >>> > >> >> > >>> >> >> > > >> >> basis in public for all to see, the fat integration tests disguised as
> >>> > >> >> > >>> >> >> > > >> >> unit tests will remain. A review of our current unit tests weeding the
> >>> > >> >> > >>> >> >> > > >> >> old cruft and the no longer relevant or duplicates would be a nice
> >>> > >> >> > >>> >> >> > > >> >> undertaking if someone is looking to contribute.
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >> Alex Newman has been working on making our tests work up on travis and
> >>> > >> >> > >>> >> >> > > >> >> circle-ci.  That'll be sweet when it goes end-to-end. He also added in
> >>> > >> >> > >>> >> >> > > >> >> some "type" categorizations -- client, filter, mapreduce -- alongside
> >>> > >> >> > >>> >> >> > > >> >> our old "sizing" categorizations of small/medium/large. His thinking is
> >>> > >> >> > >>> >> >> > > >> >> that we can run these categorizations in parallel so we could run the
> >>> > >> >> > >>> >> >> > > >> >> total suite in about the time of the longest test, say 20-30 minutes?
> >>> > >> >> > >>> >> >> > > >> >> We could even change Apache to run them this way.
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >> FYI,
> >>> > >> >> > >>> >> >> > > >> >> St.Ack
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >
> >>> > >> >> > >>> >> >> > > >>
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > --
> >>> > >> >> > >>> >> >> > > > Sean
> >>> > >> >> > >>> >> >> > >
> >>> > >> >> > >>> >> >> >
> >>> > >> >> > >>> >> >> >
> >>> > >> >> > >>> >> >> >
> >>> > >> >> > >>> >> >> > --
> >>> > >> >> > >>> >> >> > busbey
> >>> > >> >> > >>> >> >> >
> >>> > >> >> > >>> >> >>
> >>> > >> >> > >>> >> >
> >>> > >> >> > >>> >> >
> >>> > >> >> > >>> >> >
> >>> > >> >> > >>> >> > --
> >>> > >> >> > >>> >> > busbey
> >>> > >> >> > >>> >> >
> >>> > >> >> > >>> >>
> >>> > >> >> > >>> >>
> >>> > >> >> > >>> >>
> >>> > >> >> > >>> >> --
> >>> > >> >> > >>> >> busbey
> >>> > >> >> > >>> >>
> >>> > >> >> > >>>
> >>> > >> >> > >
> >>> > >> >> > >
> >>> > >> >> > >
> >>> > >> >> > > --
> >>> > >> >> > > busbey
> >>> > >> >> >
> >>> > >> >>
> >>> > >>
> >>> > >>
> >>> > >>
> >>> > >> --
> >>> > >> busbey
> >>> > >>
> >>> > >
> >>> > >
> >>> > >
> >>> > > --
> >>> > > Sean
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > busbey
> >>> >
> >>>
> >>>
> >>>
> >>> --
> >>> Thanks,
> >>> Michael Antonov
> >>>
> >>
> >>
> >>
> >> --
> >>
> >> -- Appy
> >>
> >
> >
> >
> > --
> >
> > -- Appy
>
>
>
> --
> busbey
>



-- 
Thanks,
Michael Antonov
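
For reference, the IllegalFormatPrecisionException: 7 in the stack trace
quoted above can be reproduced without Hadoop. The trace shows
ThreadFactoryBuilder.setNameFormat calling String.format on the thread name
pattern, so a workspace path containing "%25201.7%" is parsed as a format
specifier with width 25201 and precision 7 on the "%" conversion, which does
not accept a precision. The sketch below is only an illustration: the
"worker-" prefix, the class name, and the escapeForFormat helper are
hypothetical and are not taken from Hadoop or Guava. Doubling each "%" is
one way a caller could make such a path safe to embed in a format string.

```java
import java.util.IllegalFormatPrecisionException;

public class FormatEscapeDemo {

    // Hypothetical helper (not Hadoop or Guava code): doubles every '%' so
    // the text is treated literally when embedded in a String.format pattern.
    static String escapeForFormat(String text) {
        return text.replace("%", "%%");
    }

    // Returns the precision reported by IllegalFormatPrecisionException,
    // or -1 if the pattern formats without that error.
    static int precisionErrorOf(String pattern) {
        try {
            String.format(pattern, 0);
            return -1;
        } catch (IllegalFormatPrecisionException e) {
            return e.getPrecision();
        }
    }

    public static void main(String[] args) {
        // Directory name as it appears in the Jenkins workspace path.
        String dir = "JDK%25201.7%2520(latest)";

        // Unescaped: "%25201.7%" is read as a format specifier and rejected.
        System.out.println(precisionErrorOf("worker-" + dir + "-%d"));

        // Escaped: the directory name survives and only "%d" is substituted.
        System.out.println(
            String.format("worker-" + escapeForFormat(dir) + "-%d", 0));
    }
}
```

Whether the right place for such escaping is Hadoop's FsVolumeImpl or the
job naming on the Jenkins side is the open question raised in the thread;
the sketch only shows why the unescaped name fails.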
