hadoop-hdfs-dev mailing list archives

From Allen Wittenauer ...@altiscale.com>
Subject Re: IMPORTANT: automatic changelog creation
Date Thu, 02 Apr 2015 19:13:29 GMT

On Apr 2, 2015, at 11:36 AM, Mai Haohui <ricetons@gmail.com> wrote:

> Hi Allen,
> Thanks for driving this. Just some quick questions:
>>> Removing changes.txt, relnotes.py, etc from branch-2 would be an incompatible
>>> change.  Pushing aside the questions of that document’s quality (hint: lots of outright
>>> lying and missing several hundred jiras), it's effectively an interface in use by quite a
>>> few folks.
> Why is removing CHANGES.txt an incompatible change? Why is CHANGES.txt
> an interface? Can you give some examples?

	With my end-user ops hat on: for years I'd often run scripts over CHANGES.txt to pull the
key things in a release, fetch extra metadata that wasn’t in that file, and reformat the
result for my users to digest.  (Especially since the release notes weren’t published with
the release tar and, let’s be honest, were mostly indecipherable heaps of crap, to the point
that even the RMs never bothered to really look at them...)  CHANGES.txt was useful for
getting the base dataset, esp. in the days before JIRA’s REST interface.

	It is/was, in essence, an interface.
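
	For illustration only, here is a minimal sketch of the kind of script described above.
It assumes a CHANGES.txt-style layout with "Release X - DATE" headers, indented upper-case
section names, and entries that start with a JIRA key. The sample text, the EXAMPLE-n keys,
and the parse_changes() helper are all hypothetical; this is not one of the actual scripts.

```python
import re
from collections import defaultdict

# Hypothetical sample in an assumed CHANGES.txt-style layout (not real release data).
SAMPLE = """\
Release 9.9.0 - 2015-01-01

  INCOMPATIBLE CHANGES

    EXAMPLE-1. Remove a deprecated option. (alice)

  BUG FIXES

    EXAMPLE-2. Fix a crash on startup.
    (bob via alice)

Release 9.8.0 - 2014-06-01

  NEW FEATURES

    EXAMPLE-3. Add a new command. (carol)
"""

RELEASE_RE = re.compile(r"^Release\s+(\S+)")             # "Release 9.9.0 - ..."
SECTION_RE = re.compile(r"^\s+([A-Z][A-Z ]+)$")          # "  BUG FIXES"
ENTRY_RE = re.compile(r"^\s+([A-Z]+-\d+)[.:]?\s+(.*)$")  # "    EXAMPLE-1. text"


def parse_changes(text):
    """Return {release: [(section, jira_key, first_line_of_summary), ...]}."""
    result = defaultdict(list)
    release = section = None
    for line in text.splitlines():
        m = RELEASE_RE.match(line)
        if m:
            release, section = m.group(1), None
            continue
        m = SECTION_RE.match(line)
        if m:
            section = m.group(1).strip()
            continue
        m = ENTRY_RE.match(line)
        if m and release:
            # Continuation lines of a wrapped entry are ignored in this sketch.
            result[release].append((section, m.group(1), m.group(2).strip()))
    return dict(result)


if __name__ == "__main__":
    for release, entries in parse_changes(SAMPLE).items():
        for section, key, summary in entries:
            print(f"{release}\t{section}\t{key}\t{summary}")
```

	The extracted keys are the "base dataset"; each one could then be used to look up extra
metadata elsewhere before reformatting the result for users.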

> It looks like the meaning of incompatibility is overloaded -- at
> the very least, in
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html,
> compatibility means source and binary compatibility.

	FWIW, removing relnotes.py is definitely covered by that document.

> At least to me, CHANGES.txt is not part of the compatibility
> contract. It would be great to see this patch land in
> branch-2.

	But yes, I mean beyond that.  It’s a ‘de facto’ standard given how many people use
it for critical information about what we’ve released.  This is about managing user expectations
and not just what’s convenient for us.  You know, that whole community that we always mention
but seem to stomp all over.  Just because we CAN do something doesn’t mean we SHOULD.

	An excellent example of this is the HADOOP_OPTS variable.  I’d LOVE LOVE LOVE to kick it
to the curb.  It’s the source of a LOT of end user bugs and problematic areas in the shell
code.  During the rewrite, and under the above rules, I had both the opportunity and the
standing under the bylaws to do so.  But I didn’t because it’d just flat out break too much
stuff, known and unknown.

	It’s ok to be conservative when it comes to change.
