hadoop-common-dev mailing list archives

From "Colin P. McCabe" <cmcc...@apache.org>
Subject Re: Jenkins stability and patching
Date Mon, 23 Nov 2015 21:57:20 GMT
On Mon, Nov 23, 2015 at 1:53 PM, Colin P. McCabe <cmccabe@apache.org> wrote:
> I agree that our tests are in a bad state.  It would help if we could
> maintain a list of "flaky tests" somewhere in git and have Yetus
> consider the flakiness of a test before -1ing a patch.  Right now, we
> pretty much all have that list in our heads, and we're not applying it
> very consistently.  Having this list would also let us know where to
> concentrate our efforts to fix things.
>
> On Sun, Nov 22, 2015 at 4:21 AM, Steve Loughran <stevel@hortonworks.com> wrote:
>>
>> Jenkins is pretty much dead in the water these days; a test run that works is a rare
>> miracle rather than the default state. Which also means most patches are being +1'd in even
>> though patches are failing, with comments like "the test failures are probably unrelated".
>>
>>
>> I think everyone has to be grateful that I'm not volunteering to be release manager
>> for 2.8, as if I were I'd have already imposed a block on any patches going in until Jenkins
>> was stable. That is: nothing but test fixes would go in.
>>
>> As it is, at least for the next couple of weeks, I'm going to experiment with reverting
>> patches which break the build. Usually those breakages are being fixed, eventually, with followup
>> patches. With a "patches which break the build get reverted" policy, whoever submitted that
>> first patch gets to write the fix *and test it again*. This should encourage people to be
>> more rigorous first time round.
>>
>>
>>   1.  Yes, I'm going to have to be ruthless and do this for myself too. Or others
>>       can. I'm not doing much (any?) core Hadoop coding right now, so more isolated.
>>   2.  No, I don't plan to show favouritism: break the build and it gets rolled back.
>>   3.  We can review this in a week or two to see how it goes. And someone else can
>>       volunteer to keep Jenkins happy.
>>   4.  I'll get a smaller fix for HDFS-9263 in.
>>   5.  I've also started running slider 0.90-SNAPSHOT test runs with Hadoop 2.8.0-SNAPSHOT,
>>       so I'm being the first to find problems beyond Jenkins. So far HADOOP-12050 is the first blocker.
>>       It went in in August, which shows we aren't doing enough cross-version testing beyond just
>>       Jenkins. That breakage (HADOOP-12587) is stopping my test code working against secure clusters.
>>       If I was being really harsh I'd have reverted that too, but it's been in long enough that I think
>>       a fix is probably the best solution.
>
> Well, this is already directly contradicting point #2, isn't it? :)

Just to be clear, I'm not trying to imply that this was favoritism (I
don't think it was) but just that a revert is not always the right
solution.  A short discussion usually helps to find the right
solution, which could be a revert, a follow-on fix, or something else.

best,
Colin

>
> I am open to being more critical about patches going in, but I think
> we should have some very minimal discussion before reverting things.
> It's just polite.
>
> Colin
>
>
>>   6.  Finally: everyone should feel free to fix tests. Don't be shy now!
>>
>> Given this is a US vacation week, it should be a quieter week for breakages.
>>
>> Sorry, but if we can't even get Jenkins stable, then what hope do we have for a
>> 2.8 release working?
>>
>> -Steve
>>
>>
