hadoop-hdfs-dev mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: Why there are so many revert operations on trunk?
Date Tue, 07 Jun 2016 07:53:10 GMT


This is a good summary. We are better off resolving issues through discussion on email
rather than JIRA, with everyone behaving amicably towards each other.

I think we should be more willing to do feature branches, especially for things like

- anything deep into the codebase
- self-contained things
- something that will be a series of incremental patches, each with no real benefit on its
own. That is: things where adding to the codebase is, until complete, only a risk of regressions.

If Yetus doesn't do preflight checks on feature branches, we can get that in, and set up Jenkins
nightly builds for the branches too (with different email policies: email to committers &
select listeners, not more noise on the dev lists).

Regarding where we are today

I don't know what can/should be done with the set of patches that just got moved to a feature
branch, except that Vinod's "not in 2.8" veto holds there. As, presumably, does Andrew's "not
in 3.0 alpha without a Java 8 API".

Which means that any API which gets into a Java 7-compatible Hadoop release (2.9?) will have
to be an API which we know won't last.

1. I propose adding a new stability state/tag, @Experimental, to warn users that this API not
only "may" go away, but is in fact highly likely to, and any replacement may need big code
changes. The @Unstable tag has been devalued for this by its near-universal use: you see
the tag and ignore it.

2. Any proposed API for branch-2 must be tagged @Experimental, with that put in the name of the
API (that is, not, say, AsyncFileSystem, but ExperimentalAsyncDelete or similar).

4. The long-term API is going to be: Java 8, strictly specified by a whole new section in
the FS API (yes, that is where my veto would come in. No spec, no tests: no commit). The tests
will be part of the Abstract contract test suite, and, ideally, backed with another implementation
alongside that of HDFS. This could just be a thread-pooled thing working with a normal sync

5. People who intend to use the Async API (Hive, HBase, etc.) get involved in that process,
ideally seeing how well they could get a branch of their own code to work with the API. That
would be a validation of the API itself, identify and force the clarification of any ambiguities,
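
A minimal sketch of what (1) and (2) might look like in code. Everything here is hypothetical:
the annotation does not exist in Hadoop today, and the interface, its method signature and the
helper class are made up purely for illustration.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.concurrent.Future;

/** Hypothetical stability tag: the API is expected to be removed or replaced. */
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface Experimental {
  /** Free-text note on why the API is experimental. */
  String value() default "";
}

/** Hypothetical branch-2 API: the type name itself carries the warning. */
@Experimental("Java 7 compatible stopgap; expected to be replaced by a Java 8 API")
interface ExperimentalAsyncDelete {
  // Illustrative signature only; not a real Hadoop method.
  Future<Boolean> deleteAsync(String path, boolean recursive);
}

/** Hypothetical helper that tools or tests could use to detect the tag. */
final class ExperimentalCheck {
  private ExperimentalCheck() {
  }

  /** Returns true if the given type carries the @Experimental tag. */
  static boolean isExperimental(Class<?> type) {
    return type.isAnnotationPresent(Experimental.class);
  }
}
```

Runtime retention is an assumption made so that tooling could find the tag by reflection; a
class-retention annotation would work equally well as a warning to readers of the source.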

(4) + (5) may seem expensive/slow, but if HBase and Hive dev teams are involved, it means
that what eventually gets into Hadoop is ready to be backed by code downstream.
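
For (4), a rough sketch of the "thread-pooled thing" wrapping a normal synchronous call, using
Java 8's CompletableFuture. The SyncDeleter interface is a hypothetical stand-in for a blocking
filesystem operation, not Hadoop's FileSystem class, and none of these names are a proposed API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

/** Hypothetical stand-in for a blocking filesystem call. */
interface SyncDeleter {
  boolean delete(String path, boolean recursive) throws IOException;
}

/** Runs the blocking delete on a caller-supplied thread pool. */
final class ThreadPooledAsyncDeleter {
  private final SyncDeleter delegate;
  private final ExecutorService pool;

  ThreadPooledAsyncDeleter(SyncDeleter delegate, ExecutorService pool) {
    this.delegate = delegate;
    this.pool = pool;
  }

  /** Submits the delete to the pool; the future completes with its result. */
  CompletableFuture<Boolean> deleteAsync(String path, boolean recursive) {
    return CompletableFuture.supplyAsync(() -> {
      try {
        return delegate.delete(path, recursive);
      } catch (IOException e) {
        // supplyAsync cannot throw checked exceptions, so wrap the IOException.
        throw new UncheckedIOException(e);
      }
    }, pool);
  }
}
```

A second implementation along these lines would give the contract tests something other than
HDFS to run against, which is the point of asking for one.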

I'm not going to get involved here, except to warn that I will be reviewing the markdown specification
stuff. I'll help: people might want to help review the listFiles and listStatus operations
which I sat down to define recently: https://issues.apache.org/jira/browse/HADOOP-13207

> On 6 Jun 2016, at 22:36, Vinod Kumar Vavilapalli <vinodkv@apache.org> wrote:
> Folks,
> It is truly disappointing how we are escalating situations that can be resolved through
> basic communication.
> Things that shouldn’t have happened
> - After a few objections were raised, commits should have simply stopped before restarting
>   again but only after consensus
> - Reverts (or revert and move to a feature-branch) shouldn’t have been unequivocally
>   done without dropping a note / informing everyone / building consensus. And no, not even a
>   release-manager gets this free pass. Not on branch-2, not on trunk, not anywhere.
> - Freaking out on -1’s and reverts - we as a community need to be less stigmatic about
>   -1s / reverts.
> Trunk releases:
> 	This is the other important bit about huge difference of expectations between the two
> sides w.r.t trunk and branching. Till now, we’ve never made releases out of trunk. So in-progress
> features that people deemed to not need a feature branch could go into trunk without much
> trouble. Given that we are now making releases off trunk, I can see (a) the RM saying “no,
> don’t put in-progress stuff” and (b) the contributors saying “no we don’t want the overhead
> of a branch”. I’ve raised related topics (but only focusing on incompatible changes) before
> - http://markmail.org/message/m6x73t6srlchywsn
> - but we never decided anything.
> We need to at the least force a reset of expectations w.r.t how trunk and small / medium
> / incompatible changes there are treated. We should hold off making a release off trunk before
> this gets fully discussed in the community and we all reach a consensus.
>> * Without a user API, there's no way for people to use it, so not much
>> advantage to having it in a release
>> Since the code is separate and probably won't break any existing code, I
>> won't -1 if you want to include this in a release without a user API, but
>> again, I question the utility of including code that can't be used.
> Clearly, there are two sides to this argument. One side claims the absence of user-facing
> public / stable APIs, and that for all purposes this is dead-code for everyone other than
> the few early adopters who want to experiment with it. The other argument is to not put this
> code before a user API. Again, I’d discuss with fellow community members before making what
> the other side perceives as unacceptable moves.
> From 2.8.0 perspective, it shouldn’t have landed there in the first place - I have
> been pushing for a release for a while with help only from a few members of the community.
> But if you say that it has no material impact on the user story, having a by-default switched-off
> feature that *doesn’t* destabilize the core release, I’d be willing to let it pass.
> +Vinod
