hadoop-common-issues mailing list archives

From "Aaron T. Myers (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8209) Add option to enable DN and TT rolling upgrades in branch-1
Date Wed, 11 Apr 2012 19:01:17 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251843#comment-13251843
] 

Aaron T. Myers commented on HADOOP-8209:
----------------------------------------

Patch looks pretty good to me, Eli. Just a few small comments:

# Not obvious to me why we have these static version methods in the Storage class, which themselves
just delegate to static methods of the VersionInfo class.
# Recommend adding additional detail to the AssertionErrors, including the revisions and versions
that didn't match.
# Recommend adding an explanation to the DN log message about why the communication is being
allowed, e.g.: "... because versions match exactly ('" + version + "') and hadoop.relaxed.worker.version.check
is enabled." Ditto for TT.
# Similarly the log message explaining why communication isn't being allowed might mention
whether the check failed because of strict revision checking, or relaxed version checking.
# Why call the new method "getInfoVersion" in JobTracker? getVersion, as was done in Storage,
seems to make more sense to me.
# In TestTaskTrackerVersionCheck#testDefaultVersionCheck, I don't think you actually test
that different revisions are still disallowed by default, since you change both the revision
and version simultaneously in the test.
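
To make the suggestions about the log messages concrete, here is a rough sketch of the wording I have in mind. The class and method names are made up for illustration and are not from the patch; only the hadoop.relaxed.worker.version.check key comes from the discussion above. It models the DN-style check, where the strict path compares revisions.

```java
/** Hypothetical sketch; none of these names are taken from the actual patch. */
final class VersionCheckMessageSketch {

    // Config key mentioned in the review comment above.
    static final String RELAXED_KEY = "hadoop.relaxed.worker.version.check";

    /** Strict mode compares revisions; relaxed mode compares versions only. */
    static boolean allowed(boolean relaxed, String localVersion, String localRevision,
                           String remoteVersion, String remoteRevision) {
        return relaxed ? localVersion.equals(remoteVersion)
                       : localRevision.equals(remoteRevision);
    }

    /** Builds a log message that says which check was applied and why it passed or failed. */
    static String explain(boolean relaxed, String localVersion, String localRevision,
                          String remoteVersion, String remoteRevision) {
        boolean ok = allowed(relaxed, localVersion, localRevision, remoteVersion, remoteRevision);
        if (relaxed) {
            return ok
                ? "Allowing communication because versions match exactly ('" + localVersion
                    + "') and " + RELAXED_KEY + " is enabled"
                : "Refusing communication: relaxed version check failed, local version '"
                    + localVersion + "' != remote version '" + remoteVersion + "'";
        }
        return ok
            ? "Allowing communication because revisions match exactly ('" + localRevision + "')"
            : "Refusing communication: strict revision check failed, local revision '"
                + localRevision + "' != remote revision '" + remoteRevision + "'";
    }
}
```

A default-mode test along these lines would keep the version fixed and vary only the revision, so a failure can only be attributed to the revision check.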
                
> Add option to enable DN and TT rolling upgrades in branch-1
> -----------------------------------------------------------
>
>                 Key: HADOOP-8209
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8209
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>         Attachments: hadoop-8209.txt
>
>
> In 1.x, DNs currently refuse to connect to NNs if their build *revision* (i.e. svn revision)
> does not match. TTs refuse to connect to JTs if their build *version* (version, revision, user,
> and source checksum) does not match.
> This prevents rolling upgrades, which is intentional; see the discussion in HADOOP-5203.
> The primary motivations in that JIRA were (1) it's difficult to guarantee that every build on a
> large cluster got deployed correctly, that builds don't get rolled back to old versions by
> accident, etc., and (2) mixed versions can lead to execution problems that are hard to debug.
> However, there are also cases when users know that two builds are compatible, e.g. when
> deploying a new build which contains the same contents as the previous one, plus a critical
> security patch that does not affect compatibility. Currently, deploying a one-line patch requires
> taking down the entire cluster (or trying to work around the issue by lying about the build
> revision or checksum, yuck). These users would like to be able to perform a rolling upgrade.
> In order to support this, let's add an option that is off by default but, when enabled,
> makes the DN and TT version check just check for an exact version match (e.g. "1.0.2") but ignore
> the build revision (DN) and the source checksum (TT). Two builds still need to match the major,
> minor, and point numbers, but nothing else.
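
For readers skimming the description, the proposed behavior can be sketched roughly as follows. This is an illustration under assumptions, not the code in hadoop-8209.txt: the BuildInfo holder and both method names are hypothetical, and the real checks live in the DN and TT code paths and use VersionInfo.

```java
/** Hypothetical sketch of the strict vs. relaxed worker checks; not the patch itself. */
final class WorkerVersionCheckSketch {

    /** Illustrative stand-in for the build metadata of one daemon. */
    static final class BuildInfo {
        final String version;      // e.g. "1.0.2"
        final String revision;     // e.g. an svn revision
        final String user;         // user who produced the build
        final String srcChecksum;  // checksum of the source tree

        BuildInfo(String version, String revision, String user, String srcChecksum) {
            this.version = version;
            this.revision = revision;
            this.user = user;
            this.srcChecksum = srcChecksum;
        }
    }

    /** DN to NN: strict compares build revisions, relaxed compares versions only. */
    static boolean dataNodeMayConnect(BuildInfo nn, BuildInfo dn, boolean relaxed) {
        return relaxed ? nn.version.equals(dn.version)
                       : nn.revision.equals(dn.revision);
    }

    /** TT to JT: strict compares the full build version, relaxed compares versions only. */
    static boolean taskTrackerMayConnect(BuildInfo jt, BuildInfo tt, boolean relaxed) {
        if (relaxed) {
            return jt.version.equals(tt.version);
        }
        return jt.version.equals(tt.version)
            && jt.revision.equals(tt.revision)
            && jt.user.equals(tt.user)
            && jt.srcChecksum.equals(tt.srcChecksum);
    }
}
```

With the option off, a build that differs only by a one-line patch is rejected by both checks; with it on, the same build is accepted as long as the version string (e.g. "1.0.2") is identical.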

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
