+1.
The Apache foundation and contributors to Apache should not waste their
energy providing such certification.
Users of these proprietary systems, or independent observers, could easily
verify compatibility claims if a test suite were readily available to run.
>The Hadoop mark should only be used to refer to open-source software
>produced by the ASF.
IANAL, but Steve is questioning the use of "Apache Hadoop Compatible" in PR
material for commercial software. Is this considered use of "The Hadoop
mark"?
- milind
--
Milind Bhandarkar
mbhandarkar@linkedin.com
+1-650-776-3167
On 5/12/11 11:16 PM, "Doug Cutting" <cutting@apache.org> wrote:
>Certification seems like mission creep. Our mission is to produce
>open-source software. If we wish to produce testing software, that
>seems fine. But running a certification program for non-open-source
>software seems like a different task.
>
>The Hadoop mark should only be used to refer to open-source software
>produced by the ASF. If other folks wish to make factual statements
>concerning our software, e.g., that their proprietary software passes
>tests that we've created, that may be fine, but I don't think we should
>validate those claims by granting certifications to institutions. That
>ventures outside the mission of the ASF. We are not an accrediting
>organization.
>
>Doug
>
>On 05/10/2011 12:29 PM, Steve Loughran wrote:
>>
>> Back in Jan 2011, I started a discussion about how to define Apache
>> Hadoop Compatibility:
>>
>>http://mail-archives.apache.org/mod_mbox/hadoop-general/201101.mbox/%3C4D46B6AD.2020802@apache.org%3E
>>
>>
>> I am now reading the EMC HD "Enterprise Ready" Apache Hadoop datasheet:
>>
>>
>>http://www.greenplum.com/sites/default/files/EMC_Greenplum_HD_DS_Final_1.pdf
>>
>>
>> It claims that their implementations are 100% compatible, even though
>> the Enterprise edition uses a C filesystem. It also claims that both
>> their software releases contain "Certified Stacks", without defining
>> what Certified means, or who does the certification -only that it is an
>> improvement.
>>
>>
>> I think we should revisit this issue before people with their own
>> agendas define what compatibility with Apache Hadoop is for us.
>>
>>
>> Licensing
>> -Use of the Hadoop codebase must follow the Apache License
>> http://www.apache.org/licenses/LICENSE-2.0
>> -plug-in components that are dynamically linked (Filesystems and
>> schedulers) don't appear to be derivative works on my reading of this.
>>
>> Naming
>> -this is something for branding@apache, they will have their opinions.
>> The key one is that the name "Apache Hadoop" must get used, and it's
>> important to make clear it is a derivative work.
>> -I don't think you can claim to have a Distribution/Fork/Version of
>> Apache Hadoop if you swap out big chunks of it for alternate
>> filesystems, MR engines, etc. Some description of this is needed, e.g.
>> "Supports the Apache Hadoop MapReduce engine on top of Filesystem XYZ"
>>
>> Compatibility
>> -the definition of the Hadoop interfaces and classes is the Apache
>> Source tree,
>> -the definition of semantics of the Hadoop interfaces and classes is
>> the Apache Source tree, including the test classes.
>> -the verification that the actual semantics of an Apache Hadoop release
>> are compatible with the expected semantics is that current and future
>> tests pass
>> -bug reports can highlight incompatibility with expectations of
>> community users, and once incorporated into tests form part of the
>> compatibility testing
>> -vendors can claim and even certify their derivative works as
>> compatible with other versions of their derivative works, but cannot
>> claim compatibility with Apache Hadoop unless their code passes the
>> tests and is consistent with the bug reports marked as ("by design").
>> Perhaps we should have tests that verify each of these "by design"
>> bugreps to make them more formal.
>>
>> Certification
>> -I have no idea what this means in EMC's case, they just say "Certified"
>> -As we don't do any certification ourselves, it would seem impossible
>> for us to certify that any derivative work is compatible.
>> -It may be best to state that nobody can certify their derivative as
>> "compatible with Apache Hadoop" unless it passes all current test suites
>> -And require that anyone who declares compatibility define what they
>> mean by this
>>
>> This is a good argument for getting more functional tests out there
>> -whoever has more functional tests needs to get them into a test module
>> that can be used to test real deployments.
>>