hadoop-common-dev mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: Json.org licensing, amazon-AWS and Jackson versions
Date Tue, 08 Nov 2016 11:04:01 GMT
In this case the json.org classes are lurking inside the AWS JARs, so
it's not swappable, and it's not immediately obvious there's a problem. You need something
to scan all the JARs for forbidden .class files.

Oddly enough, Java ships with a tool to scan all the JARs for specific .class files; we call
it "the classloader". It would be possible for someone to write some parameter-driven test
suite which attempted a loadResource() of the forbidden classes, failing a test if one was
there. Subclass this suite into the various separate modules at the end of the DAG (hadoop-aws,
hadoop-azure), and we can use JUnit to implement the work.
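
A minimal sketch of that idea, in plain Java so it stands alone. Assumptions, none of them from the thread: the class name ForbiddenClassCheck and its methods are hypothetical; the "loadResource()" above is taken to mean ClassLoader.getResource(); and the two org.json names are the upstream json.org ones, so a copy repackaged under another package inside a vendor JAR would need its own entry in the list. In a real module the same check would presumably sit in a JUnit base class that hadoop-aws and hadoop-azure subclass.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: ask the classloader whether any forbidden class is visible. */
public class ForbiddenClassCheck {

    /** Convert a class name such as org.json.JSONObject to its resource path. */
    public static String toResourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Return one entry for every listed class that the loader can locate. */
    public static List<String> findForbidden(ClassLoader loader, List<String> classNames) {
        List<String> found = new ArrayList<>();
        for (String name : classNames) {
            // getResource() returns null when the .class file is in no JAR on the path
            URL location = loader.getResource(toResourcePath(name));
            if (location != null) {
                found.add(name + " found at " + location);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumed list of forbidden classes; extend it per module as needed.
        List<String> forbidden = Arrays.asList("org.json.JSONObject", "org.json.JSONArray");
        List<String> found =
            findForbidden(ForbiddenClassCheck.class.getClassLoader(), forbidden);
        if (!found.isEmpty()) {
            throw new AssertionError("Forbidden classes on classpath: " + found);
        }
        System.out.println("No forbidden classes found");
    }
}
```

Because the returned URL names the JAR that contains the class, a failure message also says which dependency dragged it in.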



On 7 Nov 2016, at 19:14, Andrew Wang <andrew.wang@cloudera.com> wrote:

Have we looked into swapping in the Android cleanroom implementation of json.org?
The issue with Jackson bumps is always the classpath clashes with downstream projects.

https://wiki.debian.org/qa.debian.org/jsonevil
https://android.googlesource.com/platform/libcore/+/master/json/

Maybe we need to build it ourselves, but it's still better than bumping the Jackson version.


I'm wondering if we can't just produce our own shaded derivative of the AWS jar: merge in
the AWS artifacts unshaded, shade in its jackson dependency. This would let us use it in 2.7+
without worrying about jackson versions.

I'd still avoid it for 2.6.x, because I doubt new versions will be compatible with Java 6;
it's not worth worrying about.

I think I might give a lightning talk at Apachecon Big Data next week, "just because it's
a project's right to use an incompatible version of jackson, doesn't mean it's a duty". I can
reminisce fondly about the Elder Days when Xerces didn't come with the JVM; every project
bundled Xerces and Xalan on the CP, but at least they were single JAR releases with stable
APIs.


On Mon, Nov 7, 2016 at 10:14 AM, Steve Loughran <stevel@hortonworks.com> wrote:

https://issues.apache.org/jira/browse/HADOOP-13794: the JSON.org
license is now forbidden by the ASF from distribution.


Which means we can't make any Hadoop releases with the AWS SDK JARs <= 1.11.0 in them,
meaning https://issues.apache.org/jira/browse/HADOOP-13050 has moved up from a minor issue
to a blocker, and we are going to have to worry about the older branches.

1. The latest amazon-AWS SDKs absolutely do not work with the shipping jackson version: they even
reference artifacts that don't appear until Jackson 2.3.3, and need to be on a later version
than that to actually work.
2. AWS SDK updates have generally needed code changes (example: HADOOP-12269)

For 2.8.x we can increment the AWS SDK, and take this as a time to increment jackson, which
an XEE vulnerability was hinting at anyway (https://issues.apache.org/jira/browse/HADOOP-12705).
I know this has a risk of problems, but Sean Mackrory has done the due diligence to show
that Jackson 2.7.8 doesn't break existing API use in Hadoop; after that jackson goes incompatible
(again).


For Branch 2.6.x we may just want to take the easy way out, and not bundle the (very dated)
AWS JAR; just strip it out of the final set of artifacts to include in the project dist, and
tell people that if they want to use s3a in 2.6.x (which I think people should really avoid,
given it took until 2.7.1 to stabilize), then they need to manually install it.


Which leaves Hadoop 2.7.x, doesn't it? What to do? People are using s3a, it's working well,
and pulling the AWS JARs is going to cause problems. But pushing up a Jackson update in a
2.7.x update is going to be traumatic.

-Steve

