hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andy Isaacson <...@cloudera.com>
Subject Re: [PROPOSAL] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack
Date Wed, 21 Nov 2012 23:00:06 GMT
On Wed, Nov 21, 2012 at 1:50 PM, Konstantin Boudnik <cos@apache.org> wrote:
> The main point of my argument expressed in a lesser than 100 words: adding
> Python that is inconsistent across different Linux distros and has a history
> of backward incompatibilities (2.6 vs 2.5, 3.0 vs earlier, etc.) doesn't seem
> to leverage the benefit of having a somewhat easier build in Windows.

This seems mostly like a red herring to me. I'll grant that it's hard
to build a complete Python stack that's compatible between Python 2.x
and 2.y, but it's very easy by following best practices to keep python
*scripts* compatible across all reasonable Python 2.x versions.
Simply pick an oldest-supported-version like 2.4 and be reasonably
disciplined to not use newer constructs or libraries. I wouldn't wish
to try to build a complete system in such a limited dialect [1], but
for "we need a reasonable replacement for /bin/sh scripts" it's just

Ignore Python 3 for the time being, it's a completely different
language with incompatible syntax and semantics that doesn't support
several currently-important platforms.  Maybe in a few years sane
people can consider moving to it, but for now it's best to just stick
with the compatible subset of Python 2.x.

[1] the Mercurial project has had a pretty good experience with this
scheme; http://mercurial.selenic.com/wiki/SupportedPythonVersions they
currently support 2.4 - 2.7 with a few required libraries. They
dropped 2.2 and 2.3 support a few years ago due to specific
shortcomings on those versions.


View raw message