hadoop-general mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Defining Compatibility
Date Mon, 31 Jan 2011 13:18:37 GMT
What does it mean to be compatible with Hadoop? And how do products that 
consider themselves compatible with Hadoop say so?

We have pluggable schedulers and the like, and all is well, and the Apache 
branding people keep an eye on distributions of the Hadoop code and make 
sure that Apache Hadoop is cleanly distinguished from third-party 
redistributions of binaries.

But then you get distributions, and you have to define what is meant by 
compatibility in terms of functionality.

Presumably, everyone who issues their own release has either explicitly 
or implicitly done a lot more testing than is in the unit test suite: 
testing that exists to stress the code on large clusters. Is there 
stuff there that needs to be added to SVN to help say a build is of 
sufficient quality to be released?

Then there are the questions about:

-things that work with specific versions/releases of Hadoop?
-replacement filesystems (a small sketch follows this list)?
-replacement of core parts of the system, like the MapReduce engine?
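
To make the replacement-filesystem point concrete, here is a minimal 
sketch, in Hadoop 0.20-era Java, of why swapping the store underneath is 
even plausible: client code is written against the 
org.apache.hadoop.fs.FileSystem abstraction, so whichever implementation 
the configuration selects is what the same calls run against. The class 
name and the /tmp path are purely illustrative.

  // Client code sees only the FileSystem abstraction; the backing store
  // (HDFS, the local filesystem, or a third-party implementation such as
  // a GPFS binding) is chosen by configuration, not by the code.
  import java.io.IOException;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class FsCompatSketch {
    public static void main(String[] args) throws IOException {
      Configuration conf = new Configuration();
      // Whatever fs.default.name points at, the calls below are unchanged.
      FileSystem fs = FileSystem.get(conf);
      Path p = new Path("/tmp/compat-probe.txt");
      FSDataOutputStream out = fs.create(p, true);
      out.writeUTF("written through the FileSystem abstraction");
      out.close();
      System.out.println("exists? " + fs.exists(p) + " on " + fs.getUri());
    }
  }

Any store that honours the semantics the abstraction implies would pass 
this trivially; the hard part is deciding which semantics (atomic rename, 
consistency of listings, append) a replacement must honour before 
"compatible" is a fair claim.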

IBM have been talking about "Hadoop on GPFS".

If this is running the MR layer, should it say "Apache Hadoop MR engine 
on top of IBM GPFS", or something else? And how do you define or assess 
compatibility at this point? Is it up to the vendor to say "works with 
Apache Hadoop", and is running the Terasort client code sufficient to 
say "compatible"?

Similarly, if the MapReduce engine gets swapped out, what then? We in HP 
Labs have been funding some exploratory work at universities in Berlin 
on an engine that supports more operations than just map and reduce, but 
which will also handle the existing operations with API compatibility on 
the worker nodes. The goal here is research with an OSS deliverable, but 
while it may support Hadoop jobs, it's not Hadoop.

What to call such things?

