hadoop-common-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: [DISCUSS] A final minor release off branch-2?
Date Wed, 15 Nov 2017 21:25:18 GMT
> From recent classpath isolation work, I was surprised to find out that
many of our downstream projects (HBase, Tez, etc.) are still consuming many
non-public, server-side APIs of Hadoop, not to mention the projects/products
outside of the Hadoop ecosystem. Our API compatibility tests do not (and
should not) cover these cases. We can claim that a new major
release shouldn't be responsible for these private API changes.

Would you consider filing HBase JIRAs for what are in your opinion the
worst offenses? We can at least take a look.



On Wed, Nov 15, 2017 at 1:37 AM, Junping Du <jdu@hortonworks.com> wrote:

> Thanks Vinod for bringing up this discussion, which is just in time.
>
> I agree with most responses that option C is not a good choice, as our
> community bandwidth is precious and we should focus on a very limited set of
> mainstream branches to develop, test and deploy. Of course, we should
> still follow the Apache way and allow any interested committer to roll
> his/her own release given specific requirements beyond the mainstream releases.
>
> I am not biased towards option A or B (I will discuss this later), but I think
> a bridge release for upgrading to and back from 3.x is very necessary.
> The reasons are:
> 1. Given the lessons learned from the migration from 1.x to
> 2.x, no matter how careful we try to be, there is still a chance that some
> level of compatibility (source, binary, configuration, etc.) gets broken in
> the migration to a new major release. Some of these incompatibilities can
> only be identified at runtime, after the GA release is widely deployed on
> production clusters - we have tons of downstream projects and numerous
> configurations, and we cannot cover them all with in-house deployment and
> testing.
>
> 2. From recent classpath isolation work, I was surprised to find out that
> many of our downstream projects (HBase, Tez, etc.) are still consuming many
> non-public, server-side APIs of Hadoop, not to mention the projects/products
> outside of the Hadoop ecosystem. Our API compatibility tests do not (and
> should not) cover these cases. We can claim that a new major
> release shouldn't be responsible for these private API changes. But given
> the possibility of breaking existing applications in some way, users could
> be very hesitant to migrate to a 3.x release if there is no safe way to
> roll back.
>
> 3. Besides incompatibilities, performance regressions are also possible
> (lower throughput, higher latency, slower jobs, a bigger
> memory footprint or even memory leaks, etc.) in new Hadoop releases.
> While the performance impact of migration (if any) could be negligible to
> some users, other users could be very sensitive to it and wish to roll back
> if it happens on their production cluster.
>
> As Andrew mentioned in earlier email threads, some work has been done on
> verifying rolling upgrade from 2.x to 3.0 (just curious: which 2.x
> release was the upgrade tested from? 2.8.2, or 2.9.0, which is still being
> released?). But I am not aware of any work we are doing now to test downgrade
> from 3.0 to 2.x (correct me if I have missed any). If users hit any of the
> three situations I mentioned above, then we should give them the chance to
> roll back if they are really conservative about these unexpected side effects
> of upgrading. Given this, we should have this bridge release to cover the case
> of safely rolling back from 3.0 (whether rolling or not). I am not sure
> whether it should be 2.9.x or 2.10.x for now (we can just call it the 2.BR
> release), because we are not sure exactly what changes we should include to
> support rollback from 3.0 at this moment. We can defer this decision
> until we have better ideas.
>
> Summary of my two cents:
> - No more feature releases should happen on branch-2. 2.9 or 2.10 should be
> the last minor release (mainstream of community) on branch-2
>
> - A bridge release is necessary for safe upgrade to and downgrade from 3.x
>
> - We can decide later whether 2.10 is necessary, when the scope of the bridge
> release is clearer.
>
>
> Thanks,
>
> Junping
>
> ________________________________________
> From: Andrew Wang <andrew.wang@cloudera.com>
> Sent: Tuesday, November 14, 2017 2:25 PM
> To: Wangda Tan
> Cc: Steve Loughran; Vinod Kumar Vavilapalli; Kai Zheng; Arun Suresh;
> common-dev@hadoop.apache.org; yarn-dev@hadoop.apache.org; Hdfs-dev;
> mapreduce-dev@hadoop.apache.org
> Subject: Re: [DISCUSS] A final minor release off branch-2?
>
> To follow up on my earlier email, I don't think there's a need for a bridge
> release given that we've successfully tested rolling upgrade from 2.x to
> 3.0.0. I expect we'll keep making improvements to smooth over any
> additional incompatibilities found, but there isn't a requirement that a
> user upgrade to a bridge release before upgrading to 3.0.
>
> Otherwise, I don't have a strong opinion about when to discontinue branch-2
> releases. Historically, a release line is maintained until interest in it
> wanes. If the maintainers are taking care of the backports, it's not much
> work for the rest of us to vote on the RCs.
>
> Best,
> Andrew
>
> On Mon, Nov 13, 2017 at 4:19 PM, Wangda Tan <wheeleast@gmail.com> wrote:
>
> > Thanks Vinod for starting this,
> >
> > I'm also leaning towards the plan (A):
> >
> > (A)
> >    -- Make 2.9.x the last minor release off branch-2
> >    -- Have a maintenance release that bridges 2.9 to 3.x
> >    -- Continue to make more maintenance releases on 2.8 and 2.9 as
> >       necessary
> >
> > The only part I'm not sure about is having a separate bridge release other
> > than 3.x.
> >
> > For the bridge release, Steve's suggestion sounds more doable:
> >
> > * 3.1+ for new features
> > * fixes to 3.0.x &, where appropriate, 2.9, esp feature stabilisation
> > * whoever puts their hand up to do 2.x releases deserves support in
> > testing &c
> > * If someone makes a really strong case to backport a feature from 3.x to
> > branch-2 and it's backwards compatible, I'm not going to stop them. It's
> > just once 3.0 is out and a 3.1 on the way, it's less compelling
> >
> > This lets the community focus on 3.x releases and fill whatever gaps remain
> > in migrating from 2.x to 3.x.
> >
> > Best,
> > Wangda
> >
> >
> > On Wed, Nov 8, 2017 at 3:57 AM, Steve Loughran <stevel@hortonworks.com>
> > wrote:
> >
> >>
> >> > On 7 Nov 2017, at 19:08, Vinod Kumar Vavilapalli <vinodkv@apache.org> wrote:
> >> >
> >> >
> >> >
> >> >
> >> >> Frankly speaking, working on some bridging release not targeting any
> >> >> feature isn't so attractive to me as a contributor. Overall, the final
> >> >> minor release off branch-2 is good, we should also give 3.x more time to
> >> >> evolve and mature, therefore it looks to me we would have to work on two
> >> >> release lines meanwhile for some time. I'd like option C), and suggest we
> >> >> focus on the recent releases.
> >> >
> >> >
> >> >
> >> > Answering this question is also one of the goals of my starting this
> >> > thread. Collectively we need to conclude if we are okay or not okay with no
> >> > longer putting any new feature work in general on the 2.x line after 2.9.0
> >> > release and move over our focus into 3.0.
> >> >
> >> >
> >> > Thanks
> >> > +Vinod
> >> >
> >>
> >>
> >> As a developer of new features (e.g. the Hadoop S3A committers), I'm
> >> mostly already committed to targeting 3.1; the code in there to deal with
> >> failures and retries has unashamedly embraced Java 8 lambda expressions in
> >> production code: backporting that is going to be traumatic in terms of
> >> IDE-assisted code changes and the resultant diff in source between branch-2
> >> & trunk. What's worse, it's going to be traumatic to test as all my JVMs
> >> start with an 8 at the moment, and I'm starting to worry about whether I
> >> should bump a Windows VM up to Java 9 to keep an eye on Akira's work there.
> >> Currently the only testing I'm really doing on Java 7 is yetus branch-2 &
> >> internal test runs.
> >>
> >>
> >> 3.0 will be out the door, and we can assume that CDH will ship with it
> >> soon (*) which will allow for a rapid round trip time on inevitable bugs:
> >> 3.1 can be the release with compatibility tuned, those reported issues
> >> addressed. It's certainly where I'd like to focus.
> >>
> >>
> >> At the same time: 2.7.2-2.8.x are the broadly used versions, we can't
> >> just say "move to 3.0" & expect everyone to do it, not given we have
> >> explicitly got backwards-incompatible changes in. I don't see people
> >> rushing to do it until the layers above are all qualified (HBase, Hive,
> >> Spark, ...). Which means big users of 2.7/2.8 won't be in a rush to move
> >> and we are going to have to maintain 2.x for a while, including security
> >> patches for old versions. One issue there: what if a patch (such as bumping
> >> up a JAR version) is incompatible?
> >>
> >> For me then
> >>
> >> * 3.1+ for new features
> >> * fixes to 3.0.x &, where appropriate, 2.9, esp feature stabilisation
> >> * whoever puts their hand up to do 2.x releases deserves support in
> >> testing &c
> >> * If someone makes a really strong case to backport a feature from 3.x to
> >> branch-2 and it's backwards compatible, I'm not going to stop them. It's
> >> just once 3.0 is out and a 3.1 on the way, it's less compelling
> >>
> >> -Steve
> >>
> >> Note: I'm implicitly assuming a timely 3.1 out the door with my work
> >> included, and all issues arriving from 3.0 fixed. We can worry when 3.1
> >> ships whether there's any benefit in maintaining a 3.0.x, or whether it's
> >> best to say "move to 3.1"
> >>
> >>
> >>
> >> (*) just a guess based on the effort & test reports of Andrew & others
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: mapreduce-dev-unsubscribe@hadoop.apache.org
> >> For additional commands, e-mail: mapreduce-dev-help@hadoop.apache.org
> >>
> >>
> >
>
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk
