hadoop-common-dev mailing list archives

From Karthik Kambatla <ka...@cloudera.com>
Subject Re: [DISCUSS] Treating LimitedPrivate({"MapReduce"}) as Public APIs for YARN applications
Date Wed, 11 May 2016 21:10:36 GMT
I wonder if we should add another annotation between @Private and @Public.
Can that be @LimitedPrivate itself?

There are some APIs we shouldn't expect end-users to recompile even across
major versions (e.g. FileSystem, JobClient). On the other hand, requiring a
Yarn application to recompile seems reasonable.

As Hitesh suggested, would it make sense to mark this @LimitedPrivate and
not just @LimitedPrivate{MR}? And update our guidelines to say that users
should expect to recompile code for major releases, and that there could be
semantic incompatibilities?
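
To make the distinction concrete, here is a minimal sketch of the three
audience levels under discussion. The annotations below are local stand-ins
defined only for illustration; they are modeled on Hadoop's
InterfaceAudience annotations but are not the real classes. In particular,
the empty default for value() (which allows a bare @LimitedPrivate) and the
describe() helper with its policy strings are assumptions of this sketch,
not existing Hadoop behavior or agreed policy.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class AudienceSketch {
    // Stand-in annotations (hypothetical, not org.apache.hadoop.classification).
    @Documented @Retention(RetentionPolicy.RUNTIME)
    @interface Public {}

    // Assumption: an empty value() means "not scoped to a named project".
    @Documented @Retention(RetentionPolicy.RUNTIME)
    @interface LimitedPrivate { String[] value() default {}; }

    @Documented @Retention(RetentionPolicy.RUNTIME)
    @interface Private {}

    @Public
    static class StableApi {}          // e.g. something like FileSystem

    @LimitedPrivate({"MapReduce"})
    static class ScopedApi {}          // today's style: scoped to one project

    @LimitedPrivate
    static class BroadApi {}           // the proposed unscoped middle tier

    @Private
    static class InternalApi {}

    // Hypothetical helper mapping an annotation to the expectation it implies.
    static String describe(Class<?> c) {
        if (c.isAnnotationPresent(Public.class)) {
            return "public: no recompile expected";
        }
        LimitedPrivate lp = c.getAnnotation(LimitedPrivate.class);
        if (lp != null) {
            return lp.value().length == 0
                ? "limited-private: recompile on major releases"
                : "limited-private to " + String.join(",", lp.value());
        }
        return "private: no compatibility guarantee";
    }

    public static void main(String[] args) {
        System.out.println(describe(StableApi.class));
        System.out.println(describe(ScopedApi.class));
        System.out.println(describe(BroadApi.class));
        System.out.println(describe(InternalApi.class));
    }
}
```

The point of the sketch is only that an unscoped @LimitedPrivate could carry
a weaker promise than @Public without naming a specific consumer.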

On Tue, May 10, 2016 at 4:19 PM, Colin McCabe <cmccabe@apache.org> wrote:

> Thanks for explaining, Chris.  I generally agree that
> UserGroupInformation should be annotated as Public rather than
> LimitedPrivate, although you guys have more context than I do.
>
> However, I do think it's important that we clarify that we can break
> public APIs across a major version transition such as 2.x -> 3.x.  It
> would be particularly nice to remove a lot of the static or global state
> in UGI, although I don't know if we'll get to that before 3.0 is
> released.
>
> best,
> Colin
>
> On Tue, May 10, 2016, at 14:46, Chris Nauroth wrote:
> > Yes, I agree with you Andrew.
> >
> > Sorry, I should clarify my prior response.  I didn't mean to imply a
> > blind s/LimitedPrivate/Public/g across the whole codebase.  Instead, I'm
> > +1 for the intent of HADOOP-10776: a transition to Public for
> > UserGroupInformation, and by extension the related parts of its API like
> > Credentials.
> >
> > I'm in the camp that generally questions the usefulness of
> > LimitedPrivate, but I agree that transitions to Public need case-by-case
> > consideration.
> >
> > --Chris Nauroth
> >
> > From: Andrew Wang <andrew.wang@cloudera.com>
> > Date: Tuesday, May 10, 2016 at 2:40 PM
> > To: Chris Nauroth <cnauroth@hortonworks.com>
> > Cc: Hitesh Shah <hitesh@apache.org>, yarn-dev@hadoop.apache.org,
> > mapreduce-dev@hadoop.apache.org, common-dev@hadoop.apache.org
> > Subject: Re: [DISCUSS] Treating LimitedPrivate({"MapReduce"}) as Public
> > APIs for YARN applications
> >
> > Why don't we address these on a case-by-case basis, changing the
> > annotations on these key classes to Public? LimitedPrivate{"YARN
> > applications"} is the same thing as Public.
> >
> > This way we don't need to add special exceptions to our compatibility
> > policy. Keeps it simple.
> >
> > Best,
> > Andrew
> >
> > On Tue, May 10, 2016 at 2:26 PM, Chris Nauroth
> > <cnauroth@hortonworks.com> wrote:
> > +1 for transitioning from LimitedPrivate to Public.
> >
> > I view this as an extension of the need for UserGroupInformation and
> > related APIs to be Public.  Regardless of the original intent behind
> > LimitedPrivate, these are de facto public now, because there is no viable
> > alternative for applications that want to integrate with a secured Hadoop
> > cluster.
> >
> > There is prior discussion of this topic on HADOOP-10776 and HADOOP-12913.
> > HADOOP-10776 is a blocker for 2.8.0 to make the transition to Public.
> >
> > --Chris Nauroth
> >
> >
> >
> >
> > On 5/10/16, 11:34 AM, "Hitesh Shah" <hitesh@apache.org> wrote:
> >
> > >There seem to be some incorrect assumptions about why the application had
> > >an issue. For rolling upgrade deployments, the application bundles the
> > >client-side jars that it was compiled against, uses them on its
> > >classpath, and expects to be able to communicate with upgraded servers.
> > >Given that hadoop-common is a monolithic jar, it ends up being used on
> > >both client-side and server-side. The problem in this case was caused by
> > >the fact that the ResourceManager was generating the credentials file
> > >with a format understood only by hadoop-common from 3.x. For an
> > >application that was compiled against 2.x and has *only* hadoop-common
> > >from 2.x on its classpath, trying to read this file fails.
> > >
> > >This is not about whether internal implementations can change for
> > >non-public APIs. The file format for the Credential file in this
> > >scenario is *not* internal implementation especially when you can have
> > >different versions of the library trying to read the file. If an older
> > >client is talking to a newer versioned server, the general backward
> > >compat assumption is that the client should receive a response that it
> > >can parse and understand. In this scenario, the credentials file
> > >provided to the YARN app by the RM should have been written out with the
> > >older version or at the very least been readable by the older
> > >hadoop-common.jar.
> > >
> > >In any case, does anyone have any specific concerns with changing
> > >LimitedPrivate({"MapReduce"}) to Public?
> > >
> > >And sure, if we are saying that Hadoop-3.x requires all apps built
> > >against it to go through a full re-compile as well as downtime as
> > >existing apps may no longer work out of the box, let's call it out very
> > >explicitly in the Release notes.
> > >
> > >-- Hitesh
> > >
> > >> On May 10, 2016, at 9:24 AM, Allen Wittenauer
> > >><allenwittenauer@yahoo.com> wrote:
> > >>
> > >>
> > >>> On May 10, 2016, at 8:37 AM, Hitesh Shah <hitesh@apache.org> wrote:
> > >>>
> > >>> There have been various discussions on various JIRAs where upstream
> > >>>projects such as YARN apps ( Tez, Slider, etc ) are called out for
> > >>>using the above so-called Private APIs. A lot of YARN applications
> > >>>that have been built out have picked up various bits and pieces of
> > >>>implementation from MapReduce and DistributedShell to get things to
> > >>>work.
> > >>>
> > >>> A recent example is a backward incompatible change introduced ( where
> > >>>the API is not even directly invoked ) in the Credentials class
> > >>>related to the ability to read tokens/credentials from a file.
> > >>
> > >>      Let's be careful here.  It should be noted that the problem
> > >>happened primarily because the application jar appears to have included
> > >>some hadoop jars in it.   So the API invocation isn't the problem:  it's
> > >>the fact that the implementation under the hood changed.  If the
> > >>application jar didn't bundle hadoop jars--especially given that they
> > >>were already on the classpath--this problem should never have happened.
> > >>
> > >>> This functionality is required by pretty much everyone as YARN
> > >>>provides the credentials to the app by writing the credentials/tokens
> > >>>to a local file which is read in when
> > >>>UserGroupInformation.getCurrentUser() is invoked.
> > >>
> > >>      What you're effectively arguing is that implementations should
> > >>never change for public (and in this case LimitedPrivate) APIs.  I don't
> > >>think that's reasonable.  Hadoop is filled with changes in major branches
> > >>where the implementations have changed but the internals have been
> > >>reworked to perform the work in a slightly different manner.
> > >>
> > >>> This change breaks rolling upgrades for yarn applications from 2.x
> > >>>to 3.x (whether we end up supporting rolling upgrades across 2.x to 3.x
> > >>>is a separate discussion )
> > >>
> > >>
> > >>      At least today, according to the document attached to YARN-666
> > >>(lol), rolling upgrades are only supported within the same major version.
> > >>
> > >>>
> > >>> I would like to change our compatibility docs to state that any API
> > >>>that is marked as LimitedPrivate{Mapreduce} implies
> > >>>LimitedPrivate{YARN Applications}.
> > >>>
> > >>> Comments/concerns?
> > >>
> > >>
> > >>      a)  That isn't good enough.  No one reads the compatibility
> > >>guidelines as it is given how many times someone says "X" isn't covered
> > >>when it clearly is.
> > >>
> > >>      b) LimitedPrivate{"YARN Applications"} makes zero sense.  At that
> > >>point it's Public and the source should be changed to reflect that.
> > >>Especially given those flags impact things like how the javadocs are
> > >>generated.
> > >
> > >
> > >---------------------------------------------------------------------
> > >To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> > >For additional commands, e-mail: common-dev-help@hadoop.apache.org
