In brief, Spark has several optional modules that are not distributed as part of the main binary release. One of them is "Spark Kinesis", which contains integration code for Amazon's Kinesis. (Note: this is not the Kinesis _assembly_
module.) The module is, however, built and distributed as a binary artifact in Maven.
It does not itself contain any source or binary code from the Amazon Kinesis client. However, this optional Spark Kinesis module does, of course, require the Kinesis client, and the Kinesis client is licensed under the Amazon Software License, which
is Category X.
CAN APACHE PROJECTS RELY ON COMPONENTS UNDER PROHIBITED LICENSES?
Apache projects cannot distribute any such components. As with the previous question on platforms, the component can be relied on if the component's licence terms do not affect the Apache product's licensing. For example, using a GPL'ed tool during
the build is OK.
CAN APACHE PROJECTS RELY ON COMPONENTS WHOSE LICENSING AFFECTS THE APACHE PRODUCT?
Apache projects cannot distribute any such components. However, if the component is only needed for optional features, a project can provide the user with instructions on how to obtain and install the non-included work. Optional means that the
component is not required for standard use of the product or for the product to achieve a desirable level of quality. The question to ask yourself in this situation is:
"Will the majority of users want to use my product without adding the optional components?"
discusses closely-related but not identical scenarios. For example, there the question is whether ASF projects can distribute
the recompiled binary code of a Category X component, and that's not allowed.
Here, the licensing does affect the product (it is not just a build dependency).
I see arguments both for and against allowing Spark Kinesis to be published in Maven.
For: Spark Kinesis is optional with respect to Spark, and thus so is the Kinesis client. Publishing it via Maven constitutes providing "instructions on how to obtain and install the non-included work", namely the Kinesis client.
Against: Spark Kinesis is a software product from the ASF in and of itself. If it relies non-optionally on a Category X component, it may not be distributed.
Is there any view on which is more accurate in this context?
If the latter, does this forbid releasing Spark Kinesis in source form too?