hadoop-common-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8368) Use CMake rather than autotools to build native code
Date Tue, 29 May 2012 22:03:24 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285171#comment-13285171
] 

Colin Patrick McCabe commented on HADOOP-8368:
----------------------------------------------

bq. A bunch of cmake files are added at src/ level in common/hdfs/mapreduce modules and in
hadoop-mapreduce-project/. These files should go into src/main/native of the corresponding
module. In the case of hadoop-mapreduce-project/ the cmake file should go in the module where
it is being used (not in this POM aggregator module).

I guess I can get rid of the include files for now.  Currently none are being included from
multiple places.  We can reconsider how to do this when we have multiple CMakeLists.txt files
per project.

bq. Are all native testcases run during Maven test phase?

All the native testcases that used to be run during the maven test phase are still run.

We still have to wire up the fuse_dfs testcase, but there is a separate JIRA for that: HDFS-3250.
hdfs_test is another one that still needs to be wired up somehow (it's more of a system test,
and requires an HDFS cluster), but I think that's out of scope for this JIRA.

bq. In the hadoop-common POM the variable runas.home is set by default to EMPTY. The build
seems to work; is this OK?

Yes.  runAs is a tool that is not built by default.  The old build had similar behavior where
you had to specify extra options to get runAs to build.

bq. Finally, this is nice to have. In the current autoconf build native testcases are not
skipped if maven is invoked with -DskipTests, any chance to do that skip with cmake?

This patch preserves the existing behavior: native testcases are still not skipped when maven
is invoked with -DskipTests.
                
> Use CMake rather than autotools to build native code
> ----------------------------------------------------
>
>                 Key: HADOOP-8368
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8368
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 2.0.0-alpha
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>         Attachments: HADOOP-8368.001.patch, HADOOP-8368.005.patch, HADOOP-8368.006.patch,
HADOOP-8368.007.patch, HADOOP-8368.008.patch, HADOOP-8368.009.patch, HADOOP-8368.010.patch,
HADOOP-8368.012.half.patch, HADOOP-8368.012.patch, HADOOP-8368.012.rm.patch, HADOOP-8368.014.trimmed.patch,
HADOOP-8368.015.trimmed.patch, HADOOP-8368.016.trimmed.patch, HADOOP-8368.018.trimmed.patch,
HADOOP-8368.020.rm.patch, HADOOP-8368.020.trimmed.patch, HADOOP-8368.021.trimmed.patch, HADOOP-8368.023.trimmed.patch
>
>
> It would be good to use cmake rather than autotools to build the native (C/C++) code
in Hadoop.
> Rationale:
> 1. automake depends on shell scripts, which often have problems running on different
operating systems.  It would be extremely difficult, and perhaps impossible, to use autotools
under Windows.  Even if it were possible, it might require horrible workarounds like installing
cygwin.  Even on Linux variants like Ubuntu 12.04, there are major build issues because /bin/sh
is the Dash shell, rather than the Bash shell as it is in other Linux versions.  It is currently
impossible to build the native code under Ubuntu 12.04 because of this problem.
> CMake has robust cross-platform support, including Windows.  It does not use shell scripts.
> 2. automake error messages are very confusing.  For example, "autoreconf: cannot empty
/tmp/ar0.4849: Is a directory" or "Can't locate object method "path" via package "Autom4te..."
are common error messages.  In order to even start debugging automake problems you need to
learn shell, m4, sed, and a bunch of other things.  With CMake, all you have to learn
is the syntax of CMakeLists.txt, which is simple.
> CMake can do all the stuff autotools can, such as making sure that required libraries
are installed.  There is a Maven plugin for CMake as well.
> 3. Different versions of autotools can have very different behaviors.  For example, the
version installed under openSUSE defaults to putting libraries in /usr/local/lib64, whereas
the version shipped with Ubuntu 11.04 defaults to installing the same libraries under /usr/local/lib.
(This is why the FUSE build is currently broken when using openSUSE.)  This is another source
of build failures and complexity.  If things go wrong, you will often get an error message
which is incomprehensible to normal humans (see point #2).
> CMake allows you to specify, via cmake_minimum_required, the minimum version of CMake that
a particular CMakeLists.txt will accept.  In addition, CMake maintains strict backwards compatibility between different
versions.  This prevents build bugs due to version skew.
> 4. autoconf, automake, and libtool are large and rather slow.  This adds to build time.
> For all these reasons, I think we should switch to CMake for compiling native (C/C++)
code in Hadoop.
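
As a rough illustration of the rationale above (checking for required libraries
and pinning a minimum CMake version), a minimal CMakeLists.txt might look like the
sketch below. It is not taken from any of the attached patches: the project name,
target name, source file, version number, and the choice of zlib as the required
library are all assumptions made for the example.

```cmake
# Hypothetical sketch only; names and version are illustrative assumptions,
# not the contents of the HADOOP-8368 patches.

# Refuse to configure with a CMake older than the stated version.
cmake_minimum_required(VERSION 3.10)

# "example_native" is a made-up project name; only the C language is enabled.
project(example_native C)

# Stop at configure time with a readable error if zlib is not installed.
find_package(ZLIB REQUIRED)

# Build a shared library from a placeholder source file and link it to zlib.
include_directories(${ZLIB_INCLUDE_DIRS})
add_library(example_native SHARED example.c)
target_link_libraries(example_native ${ZLIB_LIBRARIES})
```

With a file like this, a missing dependency or a too-old CMake is reported when
the build is configured, rather than surfacing later as a shell or m4 error.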

