hadoop-common-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-11984) Enable parallel JUnit tests in pre-commit.
Date Mon, 18 May 2015 18:22:01 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-11984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548455#comment-14548455 ]

Allen Wittenauer commented on HADOOP-11984:
-------------------------------------------

bq. It seems to me that it is more of an apple vs orange comparison – more importantly,
does the time parsing TEST-*xml (which takes seconds at maximum) actually matter, given the
fact that in general Jenkins spends 15 mins to build the trunk, and ~2 hours to run the tests?

The ~2 hours is only for HDFS.  The next closest (IIRC) is mapreduce-jobclient, which comes in at
20 minutes.  Perhaps the HDFS folks should take a serious look at re-arranging the universe,
not running integration tests in unit tests, paying attention to the nightly build,
etc.

bq. Popping up one level – it looks like you have some concerns on moving test-patch to
other scripting languages that have more choices of libraries.

deadhorse.gif

Python, ruby, etc, all suffer from the same problem: which version do you target to get the
maximum amount of coverage?  test-patch, like the user-client code, MUST be able to run in
a variety of hostile environments. (No, Mac OS X and Linux are NOT good enough.)  python,
frankly, sucks at that because the API is continually evolving in incompatible ways.(*)  ...
and that's before we even get into the morass of add-ons.  And python 3.x.

FWIW, the *only* big portability problem with the current version of test-patch.sh that I'm
aware of is one usage of GNU diff because I was too lazy to write more complex awk to work
around it.  Otherwise, it's all POSIX+bash 3.x and should run even on fairly ancient systems
unchanged!  The outlook for *forward* compatibility, as a result, is extremely good. It's
pretty much impossible to do that with most other language choices (including, ironically,
Java).... except maybe one:

If I had my way, I'd have written this in perl 5.  It's a significantly better choice for
the things we need to do here (text processing! OS manipulation!) and its compatibility across
versions deployed with every relatively modern OS that I'm aware of is extremely high.  But
we don't do perl, have a small tolerance for python, and the rest is in bash.  So given those
choices, it was an easy one to make.

bq. I'm wondering whether there is anything that can be done to improve the maintainability and
lower the bar to getting involved (e.g., reusing libraries from other scripting languages)
in the longer term.

There are plenty of people who are fully competent to write decent bash.  We just don't invite
them into the Hadoop tent.  The number of people contributing to the parts that I've rewritten
has gone up SIGNIFICANTLY because people who have these skills realize that someone is paying
attention.  As a side note, I personally think it's great if the Java folks feel uncomfortable
that code that they don't understand is in the system. 

(*) - while working on releasedocmaker, I heard two conflicting things:  "that API is deprecated,
you should use xyz" and "oh, make sure this works with python vx.x".  Guess what? I can't
use the non-deprecated API in vx.x.  So deprecated APIs here we come, which now means I'm
continually answering the question of "why does this code use method y?". 



> Enable parallel JUnit tests in pre-commit.
> ------------------------------------------
>
>                 Key: HADOOP-11984
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11984
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: scripts
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HADOOP-11984.001.patch, HADOOP-11984.002.patch, HADOOP-11984.003.patch,
>                      HADOOP-11984.004.patch
>
>
> HADOOP-9287 and related issues implemented the parallel-tests Maven profile for running
> JUnit tests in multiple concurrent processes.  This issue proposes to activate that profile
> during pre-commit to speed up execution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
