hadoop-hdfs-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5223) Allow edit log/fsimage format changes without changing layout version
Date Tue, 05 May 2015 21:25:03 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529313#comment-14529313 ]

Chris Nauroth commented on HDFS-5223:

Hi [~atm].

bq. Seems like this approach would certainly help with the downgrade/rollback issue, but wouldn't
do much to make the upgrade itself easier.

That's correct.  The rolling upgrade procedure would still be required.  This document/patch
focuses on expanding the use cases that can support downgrade.

bq. In general I think it'd be beneficial for HDFS to move toward a bit-set denoting which
features/op codes are enabled/disabled, much like Todd Lipcon described earlier.

I share some of the concerns mentioned earlier about operational complexity.


Complexity in HDFS often arises from combinations of its features rather than individual features
in isolation.  If individual features can be toggled, then no two HDFS instances running the
same software version are really guaranteed to be alike.  This becomes another layer of troubleshooting
required for a technical support team.  Testing the possible combinations of features on and
off becomes a combinatorial explosion that's difficult for a QA team to manage.
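
To put a rough number on that explosion: with n independently toggleable features there are 2^n possible on/off configurations. The sketch below only illustrates the counting argument; the feature names are hypothetical and are not an actual HDFS flag list.

```python
from itertools import product

# Hypothetical feature names, for illustration only; not an actual HDFS flag set.
FEATURES = ["acls", "xattrs", "truncate", "snapshots"]

def all_configurations(features):
    """Yield every on/off combination of the given features as a dict."""
    for bits in product([False, True], repeat=len(features)):
        yield dict(zip(features, bits))

configs = list(all_configurations(FEATURES))
print(len(configs))  # 2 ** 4 = 16 configurations to test
```

Each additional toggle doubles the count, so ten such flags would already mean 1024 distinct configurations.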

Aside from managing metadata upgrades, we've also found rolling upgrade to be valuable because
of the OOB ack propagated through write pipelines (HDFS-5583) to tell clients to pause rather
than aborting the connection.  Even if it wasn't required from a metadata standpoint, some
users might continue to use rolling upgrade to get this benefit, even within a minor release
line where the layout version hasn't changed.  Considering that use case, I see value in improving
our ability to downgrade within the current rolling upgrade scheme.

If you prefer to keep the discussion here focused on building consensus around feature flags,
then I could potentially move this work to a separate jira where it could move ahead independently.
 Let me know your thoughts.  Thanks!

> Allow edit log/fsimage format changes without changing layout version
> ---------------------------------------------------------------------
>                 Key: HDFS-5223
>                 URL: https://issues.apache.org/jira/browse/HDFS-5223
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.1.1-beta
>            Reporter: Aaron T. Myers
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-5223-HDFS-Downgrade-Extended-Support.pdf, HDFS-5223.004.patch,
> Currently all HDFS on-disk formats are versioned by the single layout version. This means
that even for changes which might be backward compatible, like the addition of a new edit
log op code, we must go through the full `namenode -upgrade` process, which requires coordination
with DNs, etc. HDFS should support a lighter-weight alternative.
> Copied description from HDFS-8075, which is a duplicate and now closed. (by sanjay on
April 7, 2015)
> Background
> * HDFS image layout was changed to use Protobufs to allow easier forward and backward
> * HDFS has a layout version which is changed on each change (even if only an optional
protobuf field was added).
> * Hadoop supports two ways of going back during an upgrade:
> **  downgrade: go back to old binary version but use existing image/edits so that newly
created files are not lost
> ** rollback: go back to "checkpoint" created before upgrade was started - hence newly
created files are lost.
> Layout needs to be revisited if we want to support downgrade in some circumstances where
we don't today. Here are the use cases:
> * Some changes can support downgrade even though there was a change in layout, since there
is no real data loss but only loss of new functionality. E.g. when we added ACLs one could
have downgraded - there is no data loss but you will lose the newly created ACLs. That is
acceptable for a user since one does not expect to retain the newly added ACLs in an old version.
> * Some changes may lead to data loss if the functionality was used. For example, the
recent truncate will cause data loss if the functionality was actually used. Now one can tell
admins NOT to use such new features until the upgrade is finalized, in which case one
could potentially support downgrade.
> * A fairly fundamental change to layout where a downgrade is not possible but a rollback
is. Say we change the layout completely from protobuf to something else. Another example is
when HDFS moves to support partial namespace in memory - this is likely to be a fairly fundamental
change in layout.
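
The three use cases above can be read as a small decision rule. The following is a minimal sketch of one such reading, not anything taken from the attached patch or design document: the function name, its parameters, and the `Recovery` enum are all hypothetical.

```python
from enum import Enum

class Recovery(Enum):
    DOWNGRADE = "downgrade"          # keep existing image/edits, run old binaries
    ROLLBACK_ONLY = "rollback only"  # return to the pre-upgrade checkpoint

def recovery_option(layout_incompatible, loses_data_if_used, feature_used):
    """Classify a change per the use cases above (hypothetical helper).

    layout_incompatible: fundamental layout change, e.g. replacing protobuf
    loses_data_if_used:  using the new feature loses data on old software
    feature_used:        whether admins actually used the feature before finalize
    """
    if layout_incompatible:
        return Recovery.ROLLBACK_ONLY
    if loses_data_if_used and feature_used:
        return Recovery.ROLLBACK_ONLY
    return Recovery.DOWNGRADE

# ACL-style change: only new functionality is lost.
print(recovery_option(False, False, True).value)
# truncate-style change that admins avoided until finalize.
print(recovery_option(False, True, False).value)
# Layout replaced entirely.
print(recovery_option(True, False, False).value)
```

Under this reading, a truncate-style change stays downgradable only as long as the feature goes unused before the upgrade is finalized, which matches the "potentially" in the description rather than a guarantee.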

This message was sent by Atlassian JIRA
