www-builds mailing list archives

From Patrick Hunt <ph...@cloudera.com>
Subject Re: Builds that have been failing for a while
Date Mon, 20 Sep 2010 18:43:52 GMT
Hi. Improving resource use is a great goal, but I'm not sure it's that
clear-cut. I'm only familiar with ZK: note that these two jobs
are our patch queues, which only get run when a user submits a patch
to a JIRA (only a few patches on each job over the last couple of
months):
Zookeeper-Patch-h1.grid.sp2.yahoo.net              | 2 mo 2 days
Zookeeper-Patch-h7.grid.sp2.yahoo.net              | 1 mo 16 days
These may fail for any number of reasons (patch won't apply, no tests,
FindBugs issues, etc.). Also notice that a patch gets sent to only 1
of 3 possible machines in some pseudo-random fashion. So while one
patch job shows a recent success, the others do not. So to some extent
this is out of our hands.

We also see frequent failures from things that seem like infrastructure
issues. Here's the console output from a couple of recent failures:
WARNING: clock of the subversion server appears to be out of sync.
This can result in inconsistent check out behavior.

Here's another:
Checking out http://svn.apache.org/repos/asf/hadoop/zookeeper/trunk
ERROR: Failed to check out
http://svn.apache.org/repos/asf/hadoop/zookeeper/trunk
org.tmatesoft.svn.core.SVNException: svn: unknown host
svn: OPTIONS request failed on '/repos/asf/hadoop/zookeeper/trunk'

That said, we recently had issues with our trunk that were causing
intermittent failures. We've been working on those, and hopefully that
will help clear up these patch issues.

Patrick


On Mon, Sep 20, 2010 at 7:22 AM, Niklas Gustavsson <niklas@protocol7.com> wrote:
>
> On Mon, Sep 20, 2010 at 2:10 PM, Kristian Waagan
> <kristian.waagan@oracle.com> wrote:
> > Just want to point out that not all the jobs on the list have been running
> > regularly, so they haven't been using resources.
> > In any case, disabling jobs that haven't been run for a long time is
> > probably ok too.
> > One could consider removing "dead jobs", but I want to keep the Derby job a
> > little longer ;)
> > (is failing because Clover is/was unable to handle the data volume, but
> > hasn't been run for nine months)
>
> Right, if the last build was successful, the job will not be disabled,
> even if the last build was older than one month. However, if your last
> job was unsuccessful and no one has fixed that for more than a month,
> I think disabling is probably appropriate. In your case, simply enable
> the job again once you've fixed the problem with Clover. If you haven't
> used the job for nine months, unchecking a checkbox is probably not
> that much work when you're ready to run again :-)
>
> /niklas
