hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)
Date Wed, 29 Nov 2017 16:06:15 GMT
Note that I have disabled the HBase-1.2-JDK7, HBase-1.2-JDK8,
HBase-1.3-JDK7, and HBase-1.3-JDK8 jobs. They have been broken for a good
while now. In their place, refer to Sean's ongoing "Nightly" project, an
effort he has been at for a while. It does more checking and produces
pretty reports that will help us gauge general stability over time. See
under https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/
for the nightly builds for 1.2 and 1.3. They still have some teething
issues but are almost there. See the 1.2 build from last night. In recent
days, the 1.2 branch went from trash-can fire to stable. Note how all tests
passed in the last build but then we failed generating the src bundle at
the end (this is what I mean by a 'teething' issue). I will work on fixing
this last step and moving over 1.4, etc., in the next few days.

FYI,
St.Ack


On Tue, Nov 7, 2017 at 7:45 AM, Stack <stack@duboce.net> wrote:

> On Tue, Nov 7, 2017 at 6:10 AM, Sean Busbey <busbey@apache.org> wrote:
>
>> > Should I be able to see the machine dir when I look at nightlies output?
>> > (Was trying to see what else is running).
>>
>> Ah. We don't have the same machine sampling on nightly as we do in
>> precommit. I am 80% done with a patch for HBASE-19189 (run test ad-hoc
>> repeatedly) that includes pulling that information gathering into a
>> place where we could also use it in nightly.
>>
>>
> Sweet.
>
>
>
>> Did we ever figure out how many cores we expect our tests to need? It
>> looks like the Hadoop nodes have 8 cores. (with 2 executors that means
>> 4 is our fair share)
>>
>>
> At the end of that inquiry thread I suggested that we don't use enough
> cores, and that we could up our fork counts so tests would complete in
> less time. I wanted to experiment some w/ high fork counts -- 16 or so --
> to see if running concurrently brought on more failures.
>
> St.Ack
>
>
>
>
>> On Tue, Nov 7, 2017 at 8:05 AM, Sean Busbey <busbey@apache.org> wrote:
>> > surefire results get zipped up (we were filling the jenkins hosts with
>> > old test logs previously) and stored in a file called "test_logs.zip"
>> > for each jvm run. So if that happened in the jdk7 run for branch-1.2,
>> > it'd be in artifacts -> output-jdk7 -> test_logs.zip.
>> >
>> > I don't know if the archival process grabs things from surefire that
>> > aren't the surefire XML files, but we can update it to do so if it
>> > doesn't.
>> >
>> > On Mon, Nov 6, 2017 at 11:39 PM, Stack <stack@duboce.net> wrote:
>> >> I see this in the 1.2 nightly just when it gives up the ghost....
>> >>
>> >> [WARNING] Corrupted STDOUT by directly writing to native stream in
>> >> forked JVM 2. See FAQ web page and the dump file
>> >> /testptch/hbase/hbase-server/target/surefire-reports/2017-11-06T20-11-30_219-jvmRun2.dumpstream
>> >>
>> >> .. but the pointed-to dumpstream doesn't seem to be around post build.
>> >> Am I looking in the wrong place?
>> >>
>> >>
>> >> Thanks,
>> >>
>> >> S
>> >>
>> >>
>> >> On Mon, Nov 6, 2017 at 8:20 PM, Stack <stack@duboce.net> wrote:
>> >>
>> >>> On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey <sean.busbey@gmail.com> wrote:
>> >>>
>> >>>> Given that all of the old post-commit tests have been posting that
>> >>>> they're failing to JIRAs for what looks like a month, is there any
>> >>>> reason not to switch to the new tests that also say they're failing?
>> >>>>
>> >>>>
>> >>> No reason.
>> >>>
>> >>>
>> >>>
>> >>>> The reason HBASE-18467 has been sitting on hold this whole time has
>> >>>> been because the new nightly branch tests keep complaining about
>> >>>> failures.
>> >>>>
>> >>>>
>> >>> Looking just now, it looks like killed-off test runs.
>> >>>
>> >>> +1 on move to nightlies.
>> >>>
>> >>> Can I help?
>> >>>
>> >>> Should I be able to see the machine dir when I look at nightlies output?
>> >>> (Was trying to see what else is running).
>> >>>
>> >>> Thanks Sean,
>> >>> St.Ack
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>> On Mon, Nov 6, 2017 at 10:21 AM, Sean Busbey <sean.busbey@gmail.com> wrote:
>> >>>> > It looks like old tests branch-1.2 and branch-1.3 are failing with
>> >>>> > some maven enforcer problem that we thought we had fixed a few times
>> >>>> > before. It's probably fixable by changing the version of maven they
>> >>>> > use, but I'd much rather any test effort go into the last mile of
>> >>>> > getting our new nightly tests working.
>> >>>> >
>> >>>> > I'll start picking this up as soon as I close out HBASE-18784.
>> >>>> >
>> >>>> > Please consider branch-1.2 release blocked. :(
>> >>>> >
>> >>>> > On Mon, Nov 6, 2017 at 10:19 AM, Stack <stack@duboce.net> wrote:
>> >>>> >> Our builds seem pretty sick up on builds.apache.org even after the miracle
>> >>>> >> work by Allen W containing errant hadoop processes. Looking at 1.2 and 1.3,
>> >>>> >> we don't even get off the ground. Anyone been taking a look?
>> >>>> >>
>> >>>> >> When I try to run the branch-1.2 and branch-1.3 unit tests locally, about
>> >>>> >> ten tests or so time out. Have others tried branch-1 test runs recently?
>> >>>> >>
>> >>>> >> Thanks,
>> >>>> >> S
>> >>>> >>
>> >>>> >>
>> >>>> >> On Mon, Aug 21, 2017 at 1:54 PM, Stack <stack@duboce.net> wrote:
>> >>>> >>
>> >>>> >>> Loads of tests timing out in test runs -- then they all pass. Anyone have
>> >>>> >>> any input? I'm trying to take a look as a background task...
>> >>>> >>>
>> >>>> >>> S
>> >>>> >>>
>> >>>> >>> On Tue, Jul 11, 2017 at 7:05 PM, Stack <stack@duboce.net> wrote:
>> >>>> >>>
>> >>>> >>>> Thanks Appy.
>> >>>> >>>>
>> >>>> >>>> Anyone looking at the 'ERROR ExecutionException Java heap space...'
>> >>>> >>>> errors on patch builds or failed forking? Seems common enough. Here are
>> >>>> >>>> complaints that remote JVM went away:
>> >>>> >>>>
>> >>>> >>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7617/artifact/patchprocess/patch-unit-hbase-server.txt
>> >>>> >>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7616/artifact/patchprocess/patch-unit-hbase-server.txt
>> >>>> >>>>
>> >>>> >>>> Then this succeeds....
>> >>>> >>>>
>> >>>> >>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7614/artifact/patchprocess/patch-unit-hbase-server.txt
>> >>>> >>>>
>> >>>> >>>> And we are good for a while.
>> >>>> >>>>
>> >>>> >>>> Then heap issues:
>> >>>> >>>>
>> >>>> >>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7607/artifact/patchprocess/patch-unit-hbase-server.txt
>> >>>> >>>>
>> >>>> >>>> Are the zombies back?
>> >>>> >>>>
>> >>>> >>>> St.Ack
>> >>>> >>>>
>> >>>> >>>> On Tue, Jul 11, 2017 at 12:33 AM, Apekshit Sharma <appy@cloudera.com> wrote:
>> >>>> >>>>
>> >>>> >>>>> Fixed 'trends' in flaky dashboard. Since I changed the test names in the last
>> >>>> >>>>> fix, the dots in the names were messing up the CSS selectors. :)
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>> >>>>> On Mon, Jul 10, 2017 at 11:34 AM, Apekshit Sharma <appy@cloudera.com> wrote:
>> >>>> >>>>>
>> >>>> >>>>> > Quick update on flaky dashboard:
>> >>>> >>>>> > Flaky dashboard wasn't working earlier because our trunk build was broken.
>> >>>> >>>>> > After trunk was fixed, the format of log lines in consoleText was not the
>> >>>> >>>>> > same, so findHangingTests.py was not able to parse it correctly for
>> >>>> >>>>> > broken/hanging/timeout tests. That's been fixed now in HBASE-18341
>> >>>> >>>>> > <https://issues.apache.org/jira/browse/HBASE-18341>.
>> >>>> >>>>> > Drob brought up in the other thread that 'trends' isn't working. It's probably
>> >>>> >>>>> > because I changed test names (which are used as keys in python dicts) from
>> >>>> >>>>> > just class name to package name+classname (without the common
>> >>>> >>>>> > org.apache.hadoop.hbase prefix). I had to do it because we have some tests
>> >>>> >>>>> > with the same class name but in different packages.
>> >>>> >>>>> >
>> >>>> >>>>> > I'll take a look at it sometime this week (unless someone wants to take it
>> >>>> >>>>> > up and work on this beautiful piece of infra ;) )
>> >>>> >>>>> >
>> >>>> >>>>> >
>> >>>> >>>>> > On Thu, Jul 6, 2017 at 11:25 PM, Stack <stack@duboce.net> wrote:
>> >>>> >>>>> >
>> >>>> >>>>> >> On Thu, Jul 6, 2017 at 3:45 PM, Sean Busbey <busbey@apache.org> wrote:
>> >>>> >>>>> >>
>> >>>> >>>>> >> > that sounds like our project structure is broken. Please make sure there's
>> >>>> >>>>> >> > a jira that tracks it and I'll take a look later.
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >
>> >>>> >>>>> >>
>> >>>> >>>>> >> Filed HBASE-18331 for now.
>> >>>> >>>>> >>
>> >>>> >>>>> >> I can take a look too later.
>> >>>> >>>>> >>
>> >>>> >>>>> >> St.Ack
>> >>>> >>>>> >>
>> >>>> >>>>> >>
>> >>>> >>>>> >>
>> >>>> >>>>> >> > On Thu, Jul 6, 2017 at 6:15 PM, Stack <stack@duboce.net> wrote:
>> >>>> >>>>> >> >
>> >>>> >>>>> >> > > I tried publishing hbase-3.0.0-SNAPSHOT... so hbase-checkstyle was up in
>> >>>> >>>>> >> > > repo (presuming it relied on an aged-out snapshot). Seems to have 'fixed'
>> >>>> >>>>> >> > > it for now....
>> >>>> >>>>> >> > >
>> >>>> >>>>> >> > > St.Ack
>> >>>> >>>>> >> > >
>> >>>> >>>>> >> > > On Thu, Jul 6, 2017 at 12:50 PM, Stack <stack@duboce.net> wrote:
>> >>>> >>>>> >> > >
>> >>>> >>>>> >> > > > The 3.0.0-SNAPSHOT looks suspicious ... the hbase version....
>> >>>> >>>>> >> > > > St.Ack
>> >>>> >>>>> >> > > >
>> >>>> >>>>> >> > > > On Thu, Jul 6, 2017 at 12:49 PM, Stack <stack@duboce.net> wrote:
>> >>>> >>>>> >> > > >
>> >>>> >>>>> >> > > >> On Thu, Jul 6, 2017 at 12:48 PM, Stack <stack@duboce.net> wrote:
>> >>>> >>>>> >> > > >>
>> >>>> >>>>> >> > > >>> Checkstyle is currently broke on our builds... looking.
>> >>>> >>>>> >> > > >>> St.Ack
>> >>>> >>>>> >> > > >>>
>> >>>> >>>>> >> > > >>>
>> >>>> >>>>> >> > > >> Works if I run it locally (of course)
>> >>>> >>>>> >> > > >> St.Ack
>> >>>> >>>>> >> > > >>
>> >>>> >>>>> >> > > >>
>> >>>> >>>>> >> > > >>
>> >>>> >>>>> >> > > >>
>> >>>> >>>>> >> > > >>>
>> >>>> >>>>> >> > > >>>
>> >>>> >>>>> >> > > >>> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle (default-cli) on project hbase: Execution default-cli of goal org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle failed: Plugin org.apache.maven.plugins:maven-checkstyle-plugin:2.17 or one of its dependencies could not be resolved: Could not find artifact org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus (http://repository.apache.org/snapshots) -> [Help 1]
>> >>>> >>>>> >> > > >>> [ERROR]
>> >>>> >>>>> >> > > >>> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
>> >>>> >>>>> >> > > >>> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>> >>>> >>>>> >> > > >>> [ERROR]
>> >>>> >>>>> >> > > >>> [ERROR] For more information about the errors and possible solutions, please read the following articles:
>> >>>> >>>>> >> > > >>> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginResolutionException
>> >>>> >>>>> >> > > >>> Build step 'Invoke top-level Maven targets' marked build as failure
>> >>>> >>>>> >> > > >>> Performing Post build task...
>> >>>> >>>>> >> > > >>> Match found for :.* : True
>> >>>> >>>>> >> > > >>> Logical operation result is TRUE
>> >>>> >>>>> >> > > >>> Running script  : # Run zombie detector script
>> >>>> >>>>> >> > > >>> ./dev-support/zombie-detector.sh --jenkins ${BUILD_ID}
>> >>>> >>>>> >> > > >>> [a3159d73] $ /bin/bash -xe /tmp/hudson1697041977582083402.sh
>> >>>> >>>>> >> > > >>> + ./dev-support/zombie-detector.sh --jenkins 3320
>> >>>> >>>>> >> > > >>> Thu Jul  6 01:37:09 UTC 2017 We're ok: there is no zombie test
>> >>>> >>>>> >> > > >>>
>> >>>> >>>>> >> > > >>>
>> >>>> >>>>> >> > > >>>
>> >>>> >>>>> >> > > >>>
>> >>>> >>>>> >> > > >>> On Fri, Jun 30, 2017 at 2:43 PM, Sean Busbey <busbey@apache.org> wrote:
>> >>>> >>>>> >> > > >>>
>> >>>> >>>>> >> > > >>>> jacoco was added ages ago. I'd guess that something changed on the
>> >>>> >>>>> >> > > >>>> machines we use to cause it to stop working.
>> >>>> >>>>> >> > > >>>>
>> >>>> >>>>> >> > > >>>> On Thu, Jun 29, 2017 at 12:02 PM, Stack <stack@duboce.net> wrote:
>> >>>> >>>>> >> > > >>>>
>> >>>> >>>>> >> > > >>>> > On Wed, Jun 28, 2017 at 8:43 AM, Josh Elser <elserj@apache.org> wrote:
>> >>>> >>>>> >> > > >>>> >
>> >>>> >>>>> >> > > >>>> > >
>> >>>> >>>>> >> > > >>>> > >
>> >>>> >>>>> >> > > >>>> > > On 6/27/17 7:20 PM, Stack wrote:
>> >>>> >>>>> >> > > >>>> > >
>> >>>> >>>>> >> > > >>>> > >> * test-patch's whitespace plugin can be configured to ignore some files (but
>> >>>> >>>>> >> > > >>>> > >>> I can't think of any we'd care to so whitelist)
>> >>>> >>>>> >> > > >>>> > >>>
>> >>>> >>>>> >> > > >>>> > >>> Generated files.
>> >>>> >>>>> >> > > >>>> > >>
>> >>>> >>>>> >> > > >>>> > >
>> >>>> >>>>> >> > > >>>> > > Oh my goodness, yes, please. This has been such a pain in the rear for me
>> >>>> >>>>> >> > > >>>> > > as I've been rebasing space quota patches. Sometimes, the spaces in
>> >>>> >>>>> >> > > >>>> > > pb-gen'ed code are removed by folks before commit, other times they aren't.
>> >>>> >>>>> >> > > >>>> > >
>> >>>> >>>>> >> > > >>>> >
>> >>>> >>>>> >> > > >>>> > Agree sir. It's a distraction at least.
>> >>>> >>>>> >> > > >>>> >
>> >>>> >>>>> >> > > >>>> > I see Jacoco report here now:
>> >>>> >>>>> >> > > >>>> > https://builds.apache.org/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=Hadoop/3277/
>> >>>> >>>>> >> > > >>>> >
>> >>>> >>>>> >> > > >>>> > Maybe it has been there always and I just haven't noticed.
>> >>>> >>>>> >> > > >>>> >
>> >>>> >>>>> >> > > >>>> > It's all 0%. We need to turn on stuff?
>> >>>> >>>>> >> > > >>>> >
>> >>>> >>>>> >> > > >>>> > St.Ack
>> >>>> >>>>> >> > > >>>> >
>> >>>> >>>>> >> > > >>>>
>> >>>> >>>>> >> > > >>>
>> >>>> >>>>> >> > > >>>
>> >>>> >>>>> >> > > >>
>> >>>> >>>>> >> > > >
>> >>>> >>>>> >> > >
>> >>>> >>>>> >> >
>> >>>> >>>>> >>
>> >>>> >>>>> >
>> >>>> >>>>> >
>> >>>> >>>>> >
>> >>>> >>>>> > --
>> >>>> >>>>> >
>> >>>> >>>>> > -- Appy
>> >>>> >>>>> >
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>> >>>>> --
>> >>>> >>>>>
>> >>>> >>>>> -- Appy
>> >>>> >>>>>
>> >>>> >>>>
>> >>>> >>>>
>> >>>> >>>
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> > --
>> >>>> > Sean
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Sean
>> >>>>
>> >>>
>> >>>
>>
>
>
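
For readers following the flaky-dashboard part of the quoted thread: the
key change Appy describes (class name only, versus package name+classname
without the common org.apache.hadoop.hbase prefix) can be sketched roughly
as below. This is only an illustration under stated assumptions. The
function names, the TestExample class names, and the dot-replacement helper
are made up here and are not taken from findHangingTests.py or from the
actual dashboard fix.

```python
# Hypothetical sketch; none of these names come from findHangingTests.py.
COMMON_PREFIX = "org.apache.hadoop.hbase."


def class_only_key(fully_qualified_name):
    """Old-style key: class name only, which collides across packages."""
    return fully_qualified_name.rsplit(".", 1)[-1]


def package_key(fully_qualified_name):
    """New-style key: package + class name, minus the common prefix."""
    if fully_qualified_name.startswith(COMMON_PREFIX):
        return fully_qualified_name[len(COMMON_PREFIX):]
    return fully_qualified_name


def css_safe_id(key):
    """One possible way to keep dots out of CSS selectors (an assumption,
    not necessarily what the real fix did)."""
    return key.replace(".", "_")


# Made-up test names: same class name in two different packages.
tests = [
    "org.apache.hadoop.hbase.client.TestExample",
    "org.apache.hadoop.hbase.regionserver.TestExample",
]

by_class = {class_only_key(t): t for t in tests}
by_package = {package_key(t): t for t in tests}

print(len(by_class))       # the two tests collapse into one dict entry
print(sorted(by_package))  # both tests keep distinct keys
print([css_safe_id(k) for k in sorted(by_package)])
```

With class-only keys the second test silently overwrites the first, which
is the collision the quoted message mentions. The package-qualified keys
stay distinct but now contain dots, which is why anything that builds CSS
selectors from them needs some form of escaping or substitution.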
