hadoop-common-dev mailing list archives

From Hitesh Shah <hit...@apache.org>
Subject Re: [DISCUSS] Treating LimitedPrivate({"MapReduce"}) as Public APIs for YARN applications
Date Tue, 10 May 2016 18:34:27 GMT
There seem to be some incorrect assumptions about why the application had an issue. For rolling
upgrade deployments, the application bundles the client-side jars that it was compiled against,
uses them in its classpath, and expects to be able to communicate with upgraded servers.
Given that hadoop-common is a monolithic jar, it ends up being used on both the client side and
the server side. The problem in this case was caused by the fact that the ResourceManager was
generating the credentials file in a format understood only by hadoop-common from 3.x. For
an application that was compiled against 2.x and has *only* hadoop-common from 2.x on its
classpath, trying to read this file fails.

This is not about whether internal implementations can change for non-public APIs. The file
format for the credentials file in this scenario is *not* an internal implementation detail,
especially when different versions of the library can end up trying to read the file. If an older
client is talking to a newer-versioned server, the general backward compat assumption is that the
client should receive a response that it can parse and understand. In this scenario, the credentials
file provided to the YARN app by the RM should have been written out in the older format
or at the very least been readable by the older hadoop-common.jar.
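
To make that expectation concrete, here is a minimal Java sketch of a versioned file where the
writer can emit the older layout on request and a reader rejects versions it does not know. Everything
in it is hypothetical and for illustration only: the class name VersionedTokenFile, the "TOKS" magic,
the V0/V1 layouts and the extra "origin" field are assumptions, not Hadoop's Credentials class or its
actual on-disk format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; NOT Hadoop's Credentials class or file format.
final class VersionedTokenFile {
    static final byte[] MAGIC = "TOKS".getBytes(StandardCharsets.US_ASCII);
    static final byte V0 = 0; // layout that an older client understands
    static final byte V1 = 1; // newer layout: appends an extra "origin" string

    // Writer side (the "server"): the caller chooses which layout to emit.
    static byte[] write(String token, byte version) {
        if (version != V0 && version != V1) {
            throw new IllegalArgumentException("cannot write version " + version);
        }
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.write(MAGIC);
            out.writeByte(version);
            out.writeUTF(token);
            if (version == V1) {
                out.writeUTF("written-by-newer-server"); // field unknown to V0 readers
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reader side (the "client"): maxKnownVersion models the library version
    // that the application bundled on its classpath.
    static String read(byte[] data, byte maxKnownVersion) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            byte[] magic = new byte[MAGIC.length];
            in.readFully(magic);
            if (!Arrays.equals(magic, MAGIC)) {
                throw new IllegalStateException("not a token file");
            }
            byte version = in.readByte();
            if (version > maxKnownVersion) {
                // This is the failure mode an older client hits on a newer file.
                throw new IllegalStateException("unknown token file version " + version);
            }
            String token = in.readUTF();
            if (version >= V1) {
                in.readUTF(); // consume the newer field; unused in this sketch
            }
            return token;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch, read(write("t", V0), V0) succeeds while read(write("t", V1), V0) throws, so a
server that knows it may be handing the file to older clients would have to keep writing V0.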

In any case, does anyone have any specific concerns with changing LimitedPrivate({"MapReduce"})
to Public?

And sure, if we are saying that Hadoop-3.x requires all apps built against it to go through
a full re-compile as well as downtime, as existing apps may no longer work out of the box,
let's call it out very explicitly in the release notes.

— Hitesh

> On May 10, 2016, at 9:24 AM, Allen Wittenauer <allenwittenauer@yahoo.com> wrote:
>> On May 10, 2016, at 8:37 AM, Hitesh Shah <hitesh@apache.org> wrote:
>> There have been various discussions on various JIRAs where upstream projects such
>> as YARN apps ( Tez, Slider, etc ) are called out for using the above so-called Private APIs.
>> A lot of YARN applications that have been built out have picked up various bits and pieces
>> of implementation from MapReduce and DistributedShell to get things to work.
>> A recent example is a backward incompatible change introduced ( where the API is
>> not even directly invoked ) in the Credentials class related to the ability to read tokens/credentials
>> from a file.
> Let’s be careful here.  It should be noted that the problem happened primarily because
> the application jar appears to have included some hadoop jars in it.  So the API invocation
> isn’t the problem:  it’s the fact that the implementation under the hood changed.  If
> the application jar didn’t bundle hadoop jars -- especially given that they were already on the
> classpath -- this problem should never have happened.
>> This functionality is required by pretty much everyone as YARN provides the credentials
>> to the app by writing the credentials/tokens to a local file which is read in when
>> UserGroupInformation.getCurrentUser() is invoked.
> What you’re effectively arguing is that implementations should never change for public
> (and in this case LimitedPrivate) APIs.  I don’t think that’s reasonable.  Hadoop is filled
> with changes in major branches where the implementations have changed but the internals have
> been reworked to perform the work in a slightly different manner.
>> This change breaks rolling upgrades for yarn applications from 2.x to 3.x (whether
>> we end up supporting rolling upgrades across 2.x to 3.x is a separate discussion )
> At least today, according to the document attached to YARN-666 (lol), rolling upgrades
> are only supported within the same major version.
>> I would like to change our compatibility docs to state that any API that is marked
>> as LimitedPrivate{Mapreduce} implies LimitedPrivate{YARN Applications}.
>> Comments/concerns? 
> a) That isn’t good enough.  No one reads the compatibility guidelines as it is, given
> how many times someone says “X” isn’t covered when it clearly is.
> b) LimitedPrivate{“YARN Applications”} makes zero sense.  At that point it’s Public
> and the source should be changed to reflect that.  Especially given those flags impact things
> like how the javadocs are generated.
