hadoop-common-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-11929) add test-patch plugin points for customizing build layout
Date Fri, 29 May 2015 21:04:19 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-11929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14565403#comment-14565403 ]

Colin Patrick McCabe commented on HADOOP-11929:

bq. Allen wrote: I think it's worthwhile pointing out that test-patch is NOT meant to be the
nightly build. It's meant to be an extremely quick check to see if the patch is relatively
sane. It shouldn't be catching every possible problem with a patch; that's what integration
tests are for. Hadoop has a bad culture of ignoring the nightly build, but it's increasingly
important to catch some of these potential side-effects.

Allen, I could be misinterpreting here, but it sounds like you are advocating doing less testing
in the precommit build and relying more on the nightly and weekly builds.  I strongly disagree
with that sentiment.  It is much easier to catch bad changes before they go in than to clean
up after the fact.  Diagnosing a problem in the nightly build usually requires digging through
the git history and sometimes even bisecting.  It's more work for everyone.  Plus, having
the bad change in the tree during the day causes problems for other developers.

Hadoop has always had robust precommit testing, and I think that's a very good thing.  If
anything, we should be increasing the amount of precommit testing.

bq. Sean wrote: Right, this patch is expressly to stop trying to figure everything out in
a clever way and just let the project declare what needs to happen.

Thanks for the clarification, Sean.  I agree that allowing people to explicitly declare dependencies
could in theory lead to a faster build.  But what happens when the dependency rules are wrong?
Or when people move a file, or create new files?  It seems like there is a high potential for
mistakes to be made here.  We could get into a situation where a piece of code wasn't being
tested for months or even years (it has happened in the past).

The whole premise of this change is that we should spend more human time building and maintaining
a complex explicit dependency management infrastructure to save CPU time on the build cluster.
 But that seems backwards to me.  CPU cycles are cheap and only getting cheaper.  Human time
is very expensive, and (especially for the native build) hard to get.  I could point you to
JIRAs for fixing problems in the native code that are years old, sometimes for very simple
and basic things.

I also think we could explore much simpler and more robust ways of saving time on the precommit
build.  For example, we could parallelize our {{make}} invocations or set up a local mirror
of the Maven jars to avoid downloading them from offsite.
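As a rough illustration of those two speedups, here is a minimal sketch (an editor's illustration, not an actual test-patch change): `nproc` is from GNU coreutils, and `mvn -o` plus the `dependency:go-offline` priming goal are standard Maven; the helper function names are made up for this example.

```shell
# Sketch of two cheap precommit speedups that need no dependency tracking.

# 1. Parallel make: run one job per core instead of the single-job default.
parallel_make_cmd() {
  # nproc is GNU coreutils; fall back to 1 job if it is unavailable.
  jobs=$(nproc 2>/dev/null || echo 1)
  echo "make -j${jobs}"
}

# 2. Offline Maven: after one priming run of "mvn dependency:go-offline"
#    (which fills the local ~/.m2 cache), the -o flag keeps mvn from
#    contacting offsite repositories at all during the build.
offline_mvn_cmd() {
  echo "mvn -o clean test"
}

parallel_make_cmd
offline_mvn_cmd
```

A shared local mirror (declared in `~/.m2/settings.xml` on the build hosts) achieves the same goal as `-o` without requiring each workspace to be primed separately.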

> add test-patch plugin points for customizing build layout
> ---------------------------------------------------------
>                 Key: HADOOP-11929
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11929
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Sean Busbey
>            Assignee: Allen Wittenauer
>            Priority: Minor
>         Attachments: HADOOP-11929.00.patch, HADOOP-11929.01.patch, HADOOP-11929.02.patch,
HADOOP-11929.03.patch, hadoop.sh
> Sean Busbey and I had a chat about this at the Bug Bash. Here's the proposal:
>   * Introduce the concept of a 'personality module'.
>   * There can be only one personality.
>   * Personalities provide a single function that takes as input the name of the test
currently being processed
>   * This function uses two other built-in functions to define two queues: maven module
names, and the profiles to use against those maven modules
>   * If something needs to be compiled prior to this test (but not actually tested), the
personality will be responsible for doing that compilation
> In hadoop, the classic example is that hadoop-hdfs needs common compiled with the native bits.
So prior to the javac tests, the personality would check CHANGED_MODULES, see hadoop-hdfs,
and compile common w/ -Pnative prior to letting test-patch.sh do the work in hadoop-hdfs.
Another example is our lack of test coverage of various native bits. Since these require profiles
to be defined prior to compilation, the personality could see that something touches native
code, set the appropriate profile, and let test-patch.sh be on its way.
> One way to think of it is some higher order logic on top of the automated 'figure out
what modules and what tests to run' functions.
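The personality hook proposed above could be sketched in shell roughly as follows. This is an editor's guess at the shape of the plugin API, not the actual test-patch code: the `personality`, `queue_module`, and `queue_profile` names are illustrative stand-ins; only `CHANGED_MODULES` and `-Pnative` come from the proposal itself.

```shell
# Illustrative sketch of a test-patch "personality module" (names are
# hypothetical; only CHANGED_MODULES and -Pnative appear in the proposal).

MODULE_QUEUE=""
PROFILE_QUEUE=""

# Stand-ins for the two built-in queue-defining functions the proposal
# mentions: one queue of maven module names, one of maven profiles.
queue_module()  { MODULE_QUEUE="${MODULE_QUEUE} $1"; }
queue_profile() { PROFILE_QUEUE="${PROFILE_QUEUE} $1"; }

# The single entry point: given the test currently being processed,
# decide which extra modules must be built and with which profiles.
personality() {
  test_name=$1
  if [ "${test_name}" = "javac" ] && \
     echo "${CHANGED_MODULES}" | grep -q hadoop-hdfs; then
    # hadoop-hdfs needs hadoop-common compiled with the native bits first.
    queue_module hadoop-common
    queue_profile -Pnative
  fi
}

CHANGED_MODULES="hadoop-hdfs"
personality javac
echo "modules:${MODULE_QUEUE} profiles:${PROFILE_QUEUE}"
```

Under this sketch, test-patch.sh would drain the module queue (building each module with the queued profiles) before running the javac test on hadoop-hdfs itself.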

This message was sent by Atlassian JIRA
