hadoop-common-issues mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6248) Circus: Proposal and Preliminary Code for a Hadoop System Testing Framework
Date Thu, 17 Sep 2009 01:32:57 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756324#action_12756324 ]

Chris Douglas commented on HADOOP-6248:

bq. As the proposal states, this is a framework, with enough context examples and tests to
show how the framework is used

Frameworks impose a discipline on the end user. They make decisions about the admissible form
of a solution and propose a model for conceiving of problems within their space. In return,
the user isn't merely relieved of the burden of writing boilerplate code, but they're offered
a compelling way to think about their problem. Map/reduce is a good example. It forces a particular
model of parallel execution on the user, frustrating people who want to use it as a resource
allocator for a different parallel model, but for some problems, it's an admissible, productive
abstraction _in addition to_ a way to avoid writing all the intermediate code. The latter
is nice, but the former is what makes it successful.

The concepts "context" and "test" in Circus are too vague to admit the possibility of discipline,
and because the tool makes no bold choices, it has no taste. It's an execution engine equal
to any other, a generic "for" loop with semantics. What is the case for selecting these semantics
over any others?

bq. Circus will let an organization write a context that uses a development cluster of some
sort, along with tests that emulate their production jobs, to ensure that their jobs are running
as expected on their development cluster. Then, by simply switching contexts, the organization
can run all of their jobs on a different version of Hadoop.

This solves the wrong side of the problem, unless the deltas are small, e.g. one is trying
to test whether a release of Hadoop 0.x from provider P will work like release 0.x from provider
Q: a contribution of questionable interest to Apache Hadoop. Cross-version compatibility still
has too many corner cases to usefully distill into a "context", and similarly "as expected"
has too many dimensions to express as a binary state. Whether performance is acceptable, configuration
appropriate, results accurate, SLAs satisfied, etc. are all useful questions to ask. "The
end user can write a shell script to verify any of these" is exactly the point I make above.
Organizations need to evaluate all these factors, but I'm skeptical of an attempt to roll
all of these questions into a single, automated tool, particularly if the tool begins with
this ambition.
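
As an illustration only (not part of Circus or Hadoop), the difference between a binary "as expected" and a per-dimension report can be sketched in a few lines of Python. The class name, field names, and thresholds below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunReport:
    """Hypothetical per-job record; every field name here is illustrative."""
    runtime_seconds: float
    results_match: bool
    sla_seconds: float

    def findings(self):
        # Report each dimension separately instead of collapsing to one bit.
        return {
            "results accurate": self.results_match,
            "SLA satisfied": self.runtime_seconds <= self.sla_seconds,
        }
```

A job that produces correct output but misses its deadline shows up here as two separate findings, which a single pass/fail flag would hide.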

bq. My game plan is to use Circus to write some interesting system tests that aren't currently
in Hadoop's test plan. [...] I expect to tackle testing distcp across different versions of
Hadoop and HDFS upgrades.

Of course this is tested. It's often tested manually and at scale, but the problem is deployment
and any necessary investigation, not selecting the distribution and configuring the submitting

bq. What are your specific objections to calling bin/hadoop-daemon.sh and bin/hadoop, except
that doing so is one more level of indirection?

It's only one more layer of indirection. As I said earlier, part of this is a packaging problem:
if we had a service API to start/stop/etc. Hadoop from the client, then one could more easily
develop tools like this while adhering to some sort of contract. Because services are started
and stopped via opaque shell scripts, Hadoop is failing in the way I describe above, by not
providing tool writers with a coherent model for the service. This is some of the motivation
behind the service lifecycle branch.
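
To make "some sort of contract" concrete, here is a minimal Python sketch of what a client-side lifecycle interface could look like. It is an assumption for illustration, not the API of the service lifecycle branch or of any Hadoop release; `Service`, `State`, and `InMemoryService` are hypothetical names.

```python
from abc import ABC, abstractmethod
from enum import Enum


class State(Enum):
    STOPPED = "stopped"
    STARTED = "started"


class Service(ABC):
    """Hypothetical lifecycle contract; not an existing Hadoop API."""

    def __init__(self):
        self.state = State.STOPPED

    def start(self):
        # Starting twice is a contract violation, not a silent no-op.
        if self.state is State.STARTED:
            raise RuntimeError("service already started")
        self._do_start()
        self.state = State.STARTED

    def stop(self):
        # Stopping is idempotent so callers can always clean up.
        if self.state is State.STARTED:
            self._do_stop()
            self.state = State.STOPPED

    @abstractmethod
    def _do_start(self):
        ...

    @abstractmethod
    def _do_stop(self):
        ...


class InMemoryService(Service):
    """Stand-in for a daemon; records calls instead of launching a process."""

    def __init__(self):
        super().__init__()
        self.events = []

    def _do_start(self):
        self.events.append("start")

    def _do_stop(self):
        self.events.append("stop")
```

A tool written against such an interface can query `state` and rely on defined transitions, which a wrapper around opaque shell scripts cannot promise.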

So this doesn't just need "more;" its premise is unlikely to yield a tool that can be evaluated
and included in the distribution. If it were scaled back to solve a particular problem and
propose a model for it, it would be more likely to find success, acceptance, and adoption.

> Circus: Proposal and Preliminary Code for a Hadoop System Testing Framework
> ---------------------------------------------------------------------------
>                 Key: HADOOP-6248
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6248
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: test
>         Environment: Python, bash
>            Reporter: Alex Loddengaard
>         Attachments: HADOOP-6248.diff, HADOOP-6248_v2.diff, HADOOP-6248_v3.diff
> This issue contains a proposal and preliminary source code for Circus, a Hadoop system
> testing framework.  At a high level, Circus will help Hadoop users and QA engineers to run
> system tests on a configurable Hadoop cluster, or distribution of Hadoop.  See the comment
> below for the proposal itself.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
