hbase-dev mailing list archives

From Sean Busbey <bus...@apache.org>
Subject Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)
Date Tue, 07 Nov 2017 14:10:00 GMT
> Should I be able to see the machine dir when I look at nightlies output?
> (Was trying to see what else is running).

Ah, we don't have the same machine sampling on nightly as we do in
precommit. I am about 80% done with a patch for HBASE-19189 (run test
ad-hoc repeatedly) that includes pulling that information gathering into a
place where we could also use it in nightly.

Did we ever figure out how many cores we expect our tests to need? It
looks like the Hadoop nodes have 8 cores (with 2 executors per node,
that means 4 is our fair share).
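As a rough sketch of that arithmetic (the function name is made up for
illustration, and it assumes the executors on a node split its cores
evenly, which nothing on the Jenkins side actually enforces):

```python
def fair_share_cores(total_cores: int, executors: int) -> int:
    """Cores each executor gets if a node's cores are split evenly.

    Hypothetical helper for illustration only; integer division rounds
    down when the cores do not divide evenly.
    """
    if executors <= 0:
        raise ValueError("executors must be a positive integer")
    return total_cores // executors


if __name__ == "__main__":
    # 8 cores shared by 2 executors.
    print(fair_share_cores(8, 2))  # prints 4
```

If our tests turn out to want more than that, we would be competing with
whatever the other executor is running.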

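On the missing dumpstream question quoted below: one way to check
whether test_logs.zip kept anything besides the surefire XML is to list
its entries. This is only a sketch; the function name is made up, and it
assumes you have already downloaded test_logs.zip from the build
artifacts to a local path.

```python
import zipfile


def find_entries(zip_path, suffix=".dumpstream"):
    """Return the names of archive members whose name ends with `suffix`.

    Hypothetical helper for illustration only; `zip_path` is a local copy
    of a test_logs.zip artifact.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return [name for name in archive.namelist() if name.endswith(suffix)]
```

Calling find_entries("test_logs.zip") and getting an empty list back
would suggest the archival step did not pick up the dumpstream files.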
On Tue, Nov 7, 2017 at 8:05 AM, Sean Busbey <busbey@apache.org> wrote:
> surefire results get zipped up (we were filling the jenkins hosts with
> old test logs previously) and stored in a file called "test_logs.zip"
> for each jvm run. So if that happened in the jdk7 run for branch-1.2,
> it'd be in artifacts -> output-jdk7 -> test_logs.zip.
>
> I don't know if the archival process grabs things from surefire that
> aren't the surefire XML files, but we can update it to do so if it
> doesn't.
>
> On Mon, Nov 6, 2017 at 11:39 PM, Stack <stack@duboce.net> wrote:
>> I see this in the 1.2 nightly just when it gives up the ghost....
>>
>> [WARNING] Corrupted STDOUT by directly writing to native stream in
>> forked JVM 2. See FAQ web page and the dump file
>> /testptch/hbase/hbase-server/target/surefire-reports/2017-11-06T20-11-30_219-jvmRun2.dumpstream
>>
>> ... but the pointed-to dumpstream doesn't seem to be around post build.
>> Am I looking in the wrong place?
>>
>>
>> Thanks,
>>
>> S
>>
>>
>> On Mon, Nov 6, 2017 at 8:20 PM, Stack <stack@duboce.net> wrote:
>>
>>> On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey <sean.busbey@gmail.com> wrote:
>>>
>>>> Given that all of the old post-commit tests have been posting that
>>>> they're failing to JIRAs for what looks like a month, is there any
>>>> reason not to switch to the new tests that also say they're failing?
>>>>
>>>>
>>> No reason.
>>>
>>>
>>>
>>>> The reason HBASE-18467 has been sitting on hold this whole time has
>>>> been because the new nightly branch tests keep complaining about
>>>> failures.
>>>>
>>>>
>>> Looking just now, it looks like killed-off test runs.
>>>
>>> +1 on move to nightlies.
>>>
>>> Can I help?
>>>
>>> Should I be able to see the machine dir when I look at nightlies output?
>>> (Was trying to see what else is running).
>>>
>>> Thanks Sean,
>>> St.Ack
>>>
>>>
>>>
>>>
>>>
>>>
>>>> On Mon, Nov 6, 2017 at 10:21 AM, Sean Busbey <sean.busbey@gmail.com>
>>>> wrote:
>>>> > It looks like old tests branch-1.2 and branch-1.3 are failing with
>>>> > some maven enforcer problem that we thought we had fixed a few times
>>>> > before. It's probably fixable by changing the version of maven they
>>>> > use, but I'd much rather any test effort go into the last mile of
>>>> > getting our new nightly tests working.
>>>> >
>>>> > I'll start picking this up as soon as I close out HBASE-18784.
>>>> >
>>>> > Please consider branch-1.2 release blocked. :(
>>>> >
>>>> > On Mon, Nov 6, 2017 at 10:19 AM, Stack <stack@duboce.net> wrote:
>>>> >> Our builds seem pretty sick up on builds.apache.org even after the miracle
>>>> >> work by Allen W containing errant hadoop processes. Looking at 1.2 and 1.3,
>>>> >> we don't even get off the ground. Anyone been taking a look?
>>>> >>
>>>> >> When I try to run the branch-1.2 and branch-1.3 unit tests locally, about
>>>> >> ten tests or so timeout. Have others tried branch-1 test runs recently?
>>>> >>
>>>> >> Thanks,
>>>> >> S
>>>> >>
>>>> >>
>>>> >> On Mon, Aug 21, 2017 at 1:54 PM, Stack <stack@duboce.net> wrote:
>>>> >>
>>>> >>> Loads of tests timing out in test runs -- then they all pass. Anyone have
>>>> >>> an input? I'm trying to take a look as background task...
>>>> >>>
>>>> >>> S
>>>> >>>
>>>> >>> On Tue, Jul 11, 2017 at 7:05 PM, Stack <stack@duboce.net> wrote:
>>>> >>>
>>>> >>>> Thanks Appy.
>>>> >>>>
>>>> >>>> Any one looking at the 'ERROR ExecutionException Java heap space...'
>>>> >>>> errors on patch builds or failed forking? Seems common enough. Here are
>>>> >>>> complaints that remote JVM went away:
>>>> >>>>
>>>> >>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7617/artifact/patchprocess/patch-unit-hbase-server.txt
>>>> >>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7616/artifact/patchprocess/patch-unit-hbase-server.txt
>>>> >>>>
>>>> >>>> Then this succeeds....
>>>> >>>>
>>>> >>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7614/artifact/patchprocess/patch-unit-hbase-server.txt
>>>> >>>>
>>>> >>>> And we are good for a while.
>>>> >>>>
>>>> >>>> Then heap issues:
>>>> >>>>
>>>> >>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7607/artifact/patchprocess/patch-unit-hbase-server.txt
>>>> >>>>
>>>> >>>> Are the zombies back?
>>>> >>>>
>>>> >>>> St.Ack
>>>> >>>>
>>>> >>>> On Tue, Jul 11, 2017 at 12:33 AM, Apekshit Sharma <appy@cloudera.com>
>>>> >>>> wrote:
>>>> >>>>
>>>> >>>>> Fixed 'trends' in flaky dashboard. Since i changed the test names in last
>>>> >>>>> fix, the dots in the name were messing up with CSS selectors. :)
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> On Mon, Jul 10, 2017 at 11:34 AM, Apekshit Sharma <appy@cloudera.com>
>>>> >>>>> wrote:
>>>> >>>>>
>>>> >>>>> > Quick update on flaky dashboard:
>>>> >>>>> > Flaky dashboard wasn't working earlier because our trunk build was broken.
>>>> >>>>> > After trunk was fixed, the format of log lines in consoleText was not the
>>>> >>>>> > same, so findHangingTests.py was not able to parse it correctly for
>>>> >>>>> > broken/hanging/timeout tests. That's been fixed now HBASE-18341
>>>> >>>>> > <https://issues.apache.org/jira/browse/HBASE-18341>.
>>>> >>>>> > Drob brought up in other thread that 'trends' isn't working. It's probably
>>>> >>>>> > because i changed tests names (which are used as keys in python dicts) from
>>>> >>>>> > just class name to package name+classname (without common
>>>> >>>>> > org.apache.hadoop.hbase prefix). I had to do it because we have some tests
>>>> >>>>> > with same class name but in different packages.
>>>> >>>>> >
>>>> >>>>> > I'll take a look at it sometime this week (unless someone wants to take it
>>>> >>>>> > up and work on this beautiful piece of infra ;) )
>>>> >>>>> >
>>>> >>>>> >
>>>> >>>>> > On Thu, Jul 6, 2017 at 11:25 PM, Stack <stack@duboce.net> wrote:
>>>> >>>>> >
>>>> >>>>> >> On Thu, Jul 6, 2017 at 3:45 PM, Sean Busbey <busbey@apache.org> wrote:
>>>> >>>>> >>
>>>> >>>>> >> > that sounds like our project structure is broken. Please make sure there's
>>>> >>>>> >> > a jira that tracks it and I'll take a look later.
>>>> >>>>> >> >
>>>> >>>>> >> >
>>>> >>>>> >>
>>>> >>>>> >> Filed HBASE-18331 for now.
>>>> >>>>> >>
>>>> >>>>> >> I can take a look too later.
>>>> >>>>> >>
>>>> >>>>> >> St.Ack
>>>> >>>>> >>
>>>> >>>>> >>
>>>> >>>>> >>
>>>> >>>>> >> > On Thu, Jul 6, 2017 at 6:15 PM, Stack <stack@duboce.net> wrote:
>>>> >>>>> >> >
>>>> >>>>> >> > > I tried publishing hbase-3.0.0-SNAPSHOT... so hbase-checkstyle was up in
>>>> >>>>> >> > > repo (presuming it relied on an aged-out snapshot). Seems to have 'fixed'
>>>> >>>>> >> > > it for now....
>>>> >>>>> >> > >
>>>> >>>>> >> > > St.Ack
>>>> >>>>> >> > >
>>>> >>>>> >> > > On Thu, Jul 6, 2017 at 12:50 PM, Stack <stack@duboce.net> wrote:
>>>> >>>>> >> > >
>>>> >>>>> >> > > > The 3.0.0-SNAPSHOT looks suspicious ... the hbase version....
>>>> >>>>> >> > > > St.Ack
>>>> >>>>> >> > > >
>>>> >>>>> >> > > > On Thu, Jul 6, 2017 at 12:49 PM, Stack <stack@duboce.net> wrote:
>>>> >>>>> >> > > >
>>>> >>>>> >> > > >> On Thu, Jul 6, 2017 at 12:48 PM, Stack <stack@duboce.net> wrote:
>>>> >>>>> >> > > >>
>>>> >>>>> >> > > >>> Checkstyle is currently broke on our builds... looking.
>>>> >>>>> >> > > >>> St.Ack
>>>> >>>>> >> > > >>>
>>>> >>>>> >> > > >>>
>>>> >>>>> >> > > >> Works if I run it locally (of course)
>>>> >>>>> >> > > >> St.Ack
>>>> >>>>> >> > > >>
>>>> >>>>> >> > > >>
>>>> >>>>> >> > > >>
>>>> >>>>> >> > > >>
>>>> >>>>> >> > > >>>
>>>> >>>>> >> > > >>>
>>>> >>>>> >> > > >>> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle (default-cli) on project hbase:
>>>> >>>>> >> > > >>> Execution default-cli of goal org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle failed: Plugin
>>>> >>>>> >> > > >>> org.apache.maven.plugins:maven-checkstyle-plugin:2.17 or one of its dependencies could not be resolved: Could not find artifact
>>>> >>>>> >> > > >>> org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus (http://repository.apache.org/snapshots) -> [Help 1]
>>>> >>>>> >> > > >>> [ERROR]
>>>> >>>>> >> > > >>> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
>>>> >>>>> >> > > >>> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>>>> >>>>> >> > > >>> [ERROR]
>>>> >>>>> >> > > >>> [ERROR] For more information about the errors and possible solutions, please read the following articles:
>>>> >>>>> >> > > >>> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginResolutionException
>>>> >>>>> >> > > >>> Build step 'Invoke top-level Maven targets' marked build as failure
>>>> >>>>> >> > > >>> Performing Post build task...
>>>> >>>>> >> > > >>> Match found for :.* : True
>>>> >>>>> >> > > >>> Logical operation result is TRUE
>>>> >>>>> >> > > >>> Running script  : # Run zombie detector script
>>>> >>>>> >> > > >>> ./dev-support/zombie-detector.sh --jenkins ${BUILD_ID}
>>>> >>>>> >> > > >>> [a3159d73] $ /bin/bash -xe /tmp/hudson1697041977582083402.sh
>>>> >>>>> >> > > >>> + ./dev-support/zombie-detector.sh --jenkins 3320
>>>> >>>>> >> > > >>> Thu Jul  6 01:37:09 UTC 2017 We're ok: there is no zombie test
>>>> >>>>> >> > > >>>
>>>> >>>>> >> > > >>>
>>>> >>>>> >> > > >>>
>>>> >>>>> >> > > >>>
>>>> >>>>> >> > > >>> On Fri, Jun 30, 2017 at 2:43 PM, Sean Busbey <busbey@apache.org> wrote:
>>>> >>>>> >> > > >>>
>>>> >>>>> >> > > >>>> jacoco was added ages ago. I'd guess that something changed on the machines
>>>> >>>>> >> > > >>>> we use to cause it to stop working.
>>>> >>>>> >> > > >>>>
>>>> >>>>> >> > > >>>> On Thu, Jun 29, 2017 at 12:02 PM, Stack <stack@duboce.net> wrote:
>>>> >>>>> >> > > >>>>
>>>> >>>>> >> > > >>>> > On Wed, Jun 28, 2017 at 8:43 AM, Josh Elser <elserj@apache.org> wrote:
>>>> >>>>> >> > > >>>> >
>>>> >>>>> >> > > >>>> > >
>>>> >>>>> >> > > >>>> > >
>>>> >>>>> >> > > >>>> > > On 6/27/17 7:20 PM, Stack wrote:
>>>> >>>>> >> > > >>>> > >
>>>> >>>>> >> > > >>>> > >> * test-patch's whitespace plugin can configured to ignore some files (but
>>>> >>>>> >> > > >>>> > >>> I
>>>> >>>>> >> > > >>>> > >>> can't think of any we'd care to so whitelist)
>>>> >>>>> >> > > >>>> > >>>
>>>> >>>>> >> > > >>>> > >>> Generated files.
>>>> >>>>> >> > > >>>> > >>
>>>> >>>>> >> > > >>>> > >
>>>> >>>>> >> > > >>>> > > Oh my goodness, yes, please. This has been such a pain in the rear for me
>>>> >>>>> >> > > >>>> > > as I've been rebasing space quota patches. Sometimes, the spaces in
>>>> >>>>> >> > > >>>> > > pb-gen'ed code are removed by folks before commit, other times they aren't.
>>>> >>>>> >> > > >>>> > >
>>>> >>>>> >> > > >>>> >
>>>> >>>>> >> > > >>>> > Agree sir. Its a distraction at least.
>>>> >>>>> >> > > >>>> >
>>>> >>>>> >> > > >>>> > I see Jacoco report here now:
>>>> >>>>> >> > > >>>> > https://builds.apache.org/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=Hadoop/3277/
>>>> >>>>> >> > > >>>> >
>>>> >>>>> >> > > >>>> > Maybe it has been there always and I just haven't noticed.
>>>> >>>>> >> > > >>>> >
>>>> >>>>> >> > > >>>> > Its all 0%. We need to turn on stuff?
>>>> >>>>> >> > > >>>> >
>>>> >>>>> >> > > >>>> > St.Ack
>>>> >>>>> >> > > >>>> >
>>>> >>>>> >> > > >>>>
>>>> >>>>> >> > > >>>
>>>> >>>>> >> > > >>>
>>>> >>>>> >> > > >>
>>>> >>>>> >> > > >
>>>> >>>>> >> > >
>>>> >>>>> >> >
>>>> >>>>> >>
>>>> >>>>> >
>>>> >>>>> >
>>>> >>>>> >
>>>> >>>>> > --
>>>> >>>>> >
>>>> >>>>> > -- Appy
>>>> >>>>> >
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> --
>>>> >>>>>
>>>> >>>>> -- Appy
>>>> >>>>>
>>>> >>>>
>>>> >>>>
>>>> >>>
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Sean
>>>>
>>>>
>>>>
>>>> --
>>>> Sean
>>>>
>>>
>>>
