hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Steve Loughran <ste...@hortonworks.com>
Subject Re: Compatibility guidelines for toString overrides
Date Sat, 14 May 2016 20:01:35 GMT

On 14 May 2016, at 18:39, Allen Wittenauer <aw@apache.org<mailto:aw@apache.org>>

On May 12, 2016, at 12:20 PM, Chris Nauroth <cnauroth@hortonworks.com<mailto:cnauroth@hortonworks.com>>

Hello Allen,

The intent is not for this rule to override other compatibility rules,
such as the important CLI output rule.  It's also not intended to give us
free reign to change existing toString() implementations without due
diligence.  If a patch changes an existing toString() implementation that
already goes out to the shell or any other form of external serialization,
then the patch needs to be declined.  (I concur that relying on toString()
like this should never be done, and I'd encourage us to fix any
occurrences we find.)

The problem is that we as a community do a terrible job on following the compatibility guidelines
when it comes to CLI output.  Because while this maybe true:

Instead, the intent is to advertise to Java API consumers that toString()
output may evolve freely, and therefore we recommend against writing Java
code that depends on that output format.

… what’s going to happen is people are going to assume that because toString is now considered
Unstable, all output will be considered Unstable by way of mental short cut.  There have been
many, many instances over the past year or two where even long time PMC members have claimed
CLI output was already Unstable/outside the scope of the compatibility guidelines and I just
see this re-inforcing that (incorrect) world view.

HDFS-9732 is a good example of how to handle this.  I didn't explicitly -1
it, but I did call out the CLI compatibility problem and recommend how to
change the patch so that it preserves compatibility.

Does this help address your concerns, or is the full code audit the only
thing that would convince you to lift your -1?

No, because “perception is reality”.  If we want to make toString Unstable, then we need
to be very clear as to which toStrings are Unstable and which aren’t.  This isn’t a universal
rule without doing some legwork.

I am painfully coming round to this point of view *on existing classes*

  1.  We need some interface "StableString" which we can tag to say "this is stable"; new
CLI-ready code can use it.
  2.  We need to see where we use stuff today., by backchaining from uses of System.out &
System.err, and from TableListing uses especially (which should implement StableString itself
  3.  Or: look at where toString() is subclassed and go from there...there's less than 800
such overrides. But here you can't tell where the use of a string is going to be. There are
obvious ones (DU, DF, others which appear to be useful, AclEntry.toString(), but many which
look like IDE-generated patterns for debug.

There's another tactic, which would be to add some explicit log diagnostics method which we
offered no guarantees on ... but the challenge then becomes one of "how to use efficiently
in logging". That could be addressed, equally painfully, by having our own log class (extending/mimicing
slfj4j and which handles that message specially. That would be an absolute PITA to work with.
I think I'd rather look at our command line use of toString values

Now, what about new bits of code?
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message