hadoop-general mailing list archives

From Milind Bhandarkar <mbhandar...@linkedin.com>
Subject Re: Defining Hadoop Compatibility -revisiting-
Date Fri, 13 May 2011 06:57:17 GMT
Sure. As I said before, they are not mutually exclusive. Just stating my
experience that specs without a test suite are of no use. If I were to
prioritize, I would give priority to a TCK over natural-language specs.
That's all.

So far, I have seen many replacements for HDFS as InputFormats and
OutputFormats that only read from or write to different data sources and
sinks. It is easy to imagine pluggable app managers and resource managers
after MR-279 (other than local, which is part of Apache Hadoop, but not
"compatible"; think distributed cache).

So, we would need a spec and a test suite per component (i.e., app manager,
resource manager, current scheduler, replication target chooser,
authentication, authorization) now. If the binary protocols were to be
crystallized, I can imagine others implementing only the datanode, or a
task tracker. So we would need a protocol-level compatibility suite for
individual daemons as well.
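
To illustrate what a per-component compatibility suite could look like, here
is a minimal sketch in Python. Everything in it is an assumption made for
illustration: the `BlockStore` contract, the `InMemoryStore` reference
implementation, and the individual checks are hypothetical names and are not
Hadoop APIs. The point is only that the suite, not the prose, decides whether
an implementation counts as compatible.

```python
from typing import Callable, Dict, List, Protocol


class BlockStore(Protocol):
    """Hypothetical component contract (not a real Hadoop interface)."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def delete(self, key: str) -> None: ...


class InMemoryStore:
    """Hypothetical reference implementation used to exercise the suite."""

    def __init__(self) -> None:
        self._blocks: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blocks[key] = bytes(data)

    def get(self, key: str) -> bytes:
        return self._blocks[key]  # raises KeyError when the key is absent

    def delete(self, key: str) -> None:
        del self._blocks[key]


def check_round_trip(store: BlockStore) -> None:
    store.put("a", b"payload")
    assert store.get("a") == b"payload"


def check_overwrite(store: BlockStore) -> None:
    store.put("a", b"old")
    store.put("a", b"new")
    assert store.get("a") == b"new"


def check_delete_then_get_fails(store: BlockStore) -> None:
    store.put("a", b"x")
    store.delete("a")
    try:
        store.get("a")
    except KeyError:
        return
    raise AssertionError("get after delete must raise KeyError")


# The list of checks is the executable definition of "compatible".
CHECKS: List[Callable[[BlockStore], None]] = [
    check_round_trip,
    check_overwrite,
    check_delete_then_get_fails,
]


def run_suite(factory: Callable[[], BlockStore]) -> Dict[str, bool]:
    """Run every check against a fresh instance and record pass/fail."""
    results: Dict[str, bool] = {}
    for check in CHECKS:
        try:
            check(factory())
            results[check.__name__] = True
        except Exception:
            results[check.__name__] = False
    return results


def is_compatible(factory: Callable[[], BlockStore]) -> bool:
    """An implementation is compatible only if it passes every check."""
    return all(run_suite(factory).values())
```

Calling `is_compatible(InMemoryStore)` runs all checks against the reference
implementation; a third-party implementation would be passed in the same way.
Each component (or daemon protocol) would get its own contract and its own
list of checks.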

I agree with one of the statements that Steve L made, that "Hadoop has an
enviable problem of too much activity." If one follows the activities in
the commercial world, open source, academic and industry-sponsored R&D, one
quickly realizes that writing RFCs for all the above components and fixing
them without versioning is cumbersome and difficult optimistically, and
near impossible realistically. Also, my experience is that keeping
standards documentation for an evolving technology up-to-date with the
proper implementation is a pipe dream at best. A test suite that gets
compiled and run every time a new version comes out is within the realm of
possibility.

Therefore, all I am saying is that, while a POSIX-like spec is a "nice to
have", a test-suite that defines compatibility is a must.

- milind

Milind Bhandarkar

On 5/12/11 10:38 PM, "Ted Dunning" <tdunning@maprtech.com> wrote:

>I would say that an English spec with associated test suite is a middle
>On Thu, May 12, 2011 at 9:52 PM, Milind Bhandarkar
>> wrote:
>> Ok, my mistake. They have only asked for documented specifications. I
>> have been influenced by all the specifications I have read. All of them
>> were in English, which is characterized as a natural language.
>> But then, if you are proposing a specification in a
>> isn't that called a test suite ? Or is there a middle ground ?
>> - milind
>> --
>> Milind Bhandarkar
>> mbhandarkar@linkedin.com
>> +1-650-776-3167
>> On 5/12/11 9:05 PM, "Ted Dunning" <tdunning@maprtech.com> wrote:
>> >Did anybody propose natural language only specifications?
