hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HADOOP-1286) Distributed cluster upgrade
Date Sat, 30 Jun 2007 18:49:04 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507057 ]

Konstantin Shvachko edited comment on HADOOP-1286 at 6/30/07 11:47 AM:
-----------------------------------------------------------------------

- UpgradeManager determines which upgrade objects are required for the distributed upgrade.
Each upgrade object performs the necessary upgrade until done and returns control to the upgrade manager.

- ClientProtocol needs a method that returns the current status of the distributed upgrade,
which should be reported by DFSAdmin.report().

- Only the DatanodeProtocol should be extended with processUpgradeCommand().
ClientProtocol should not have it; ClientProtocol should always obey the safe mode restrictions.

- Each UpgradeObject should correspond to a specific version rather than to a new-old version pair as stated before.
So the UpgradeObjectCollection table looks rather like
|| Version || class names ||
| v1 | NameUpgradeObject1, DataUpgradeObject1 |
| v2 | NameUpgradeObject2, DataUpgradeObject2 |
where v1 < v2 < ...
If we need to upgrade from vX to vY, we find all versions vi such that vX < vi < vY and
perform the corresponding upgrades for each of them in that order.
In particular, this means that the Upgradeable interface should have just one method, getVersion(),
instead of the two previously declared methods, versionFrom() and versionTo().
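
The version-keyed lookup described above can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the attached patch: only Upgradeable and getVersion() come from the comment, while NamedUpgradeObject, UpgradeObjectCollectionSketch, register() and select() are hypothetical names, and versions are modeled as plain ascending integers to match v1 < v2 < ...

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Only this interface and its single method are named in the comment above.
interface Upgradeable {
    int getVersion();
}

// Hypothetical upgrade object; a real one would carry the actual upgrade logic.
class NamedUpgradeObject implements Upgradeable {
    private final int version;
    private final String name;

    NamedUpgradeObject(int version, String name) {
        this.version = version;
        this.name = name;
    }

    @Override
    public int getVersion() { return version; }

    String getName() { return name; }
}

// Hypothetical stand-in for the UpgradeObjectCollection table.
class UpgradeObjectCollectionSketch {
    // Version -> upgrade objects registered for that version, sorted ascending by version.
    private final TreeMap<Integer, List<NamedUpgradeObject>> table = new TreeMap<>();

    void register(NamedUpgradeObject uo) {
        table.computeIfAbsent(uo.getVersion(), v -> new ArrayList<>()).add(uo);
    }

    // Collects the upgrade objects of every version vi with vX < vi < vY,
    // in ascending version order (both bounds exclusive, as stated in the comment).
    List<NamedUpgradeObject> select(int vX, int vY) {
        List<NamedUpgradeObject> plan = new ArrayList<>();
        if (vX >= vY) {
            return plan; // nothing to do when there is no forward upgrade
        }
        for (List<NamedUpgradeObject> row : table.subMap(vX, false, vY, false).values()) {
            plan.addAll(row);
        }
        return plan;
    }
}
```

An upgrade manager could then walk the returned list in order, letting each object run until done before moving to the next. Keeping the table in a sorted map means the order of registration does not matter.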


 was:
- UpgradeManager manager determines which upgrade objects are required for the distributed
upgrade.
Each upgrade object performs necessary upgrade until done and returns control to the upgrade
manager.

- ClientProtocol needs to have a method that returns current status of the distributed upgrade,
which should be
reported by DFSAdmin.report().

- Only the DatanodeProtocol should be extended with  processUpgradeCommand().
ClientProtocol protocol should not have it. It should always obey the safe mode restrictions.

- Each UpgradeObject should correspond to a specific object rather than to a new-old version
pair as stated before.
So the UpgradeObjectCollection table rather looks as
|| Version || class names ||
| v1 | NameUpgradeObject1, DataUpgradeObject1 |
| v2 | NameUpgradeObject2, DataUpgradeObject2 |
where v1 < v2 < ...
If we need to upgrade from vX to vY we find all versions vi such that vX < vi < vY and
perform corresponding
upgrades for each of them in that order.
This particularly means that Upgradeable interface should have just one method getVersion()
instead of the two
previously declared versionFrom() and versionTo().

> Distributed cluster upgrade
> ---------------------------
>
>                 Key: HADOOP-1286
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1286
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>         Attachments: DistUpgradeFramework.patch, Upgradeable.java
>
>
> Some data layout changes in HDFS require more than just the version upgrade introduced in HADOOP-702,
> because the cluster can function properly only when all components have upgraded, and the components
> need to communicate with each other and exchange data before they can perform the upgrade.
> The CRC upgrade discussed in HADOOP-1134 is one such example. Future enhancements, like the
> implementation of appends, can change block meta-data and may require distributed upgrades.
> Distributed upgrade (DU) starts with a version upgrade (VU) so that at any time one could roll back
> all changes and start over.
> When VU is finished the name-node enters safe mode and persistently records that the DU has been started.
> It will also need to write a record when DU is finished. This is necessary to report unfinished upgrades in case
> of failure or for monitoring.
> The actual upgrade code from version vO to vN should be implemented in a separate UpgradeObject class,
> which implements interface Upgradeable.
> We create a new UpgradeObject for each pair of versions vO to vN that requires a DU.
> We keep a (hard coded) table that determines which UpgradeObject(s) are applicable for the version pairs.
> Something like:
> || Old version || New version || class names ||
> | vO1 | vN1 | NameUpgradeObject1, DataUpgradeObject1 |
> | vO2 | vN2 | NameUpgradeObject2, DataUpgradeObject2 |
> where vO1 < vN1 < vO2 < vN2 ...
> Now, if we need to upgrade from version vX to version vY, we look for all pairs <vOi, vNi>
> in the table such that vX < vOi < vNi < vY and perform the corresponding DUs one after another as they appear in the table.
> Each DU can and most probably should contain multiple UpgradeObjects.
> I'd define one object for the name-node and one for the data-nodes.
> The upgrade objects (in the same row) can communicate with each other either via existing protocols or using
> temporary protocols defined exclusively for these particular upgrade objects.
> I envision that some DUs will need to use old (vO) protocols to exchange the pre-upgrade data,
> and new (vN) protocols to report the upgraded data.
> UpgradeObjects should be able to bypass safe mode restrictions and to +modify+ name-node data.
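
For comparison, the pair-keyed table from the original description quoted above (the scheme the edited comment replaces) might be looked up as in the sketch below. It is an assumption-laden illustration: UpgradeRowSketch, PairTableSketch and select() are hypothetical names that do not appear in the issue, and versions are again modeled as ascending integers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical row of the hard-coded table: one DU for the version pair <vO, vN>.
final class UpgradeRowSketch {
    final int vOld;
    final int vNew;
    final List<String> classNames;

    UpgradeRowSketch(int vOld, int vNew, List<String> classNames) {
        this.vOld = vOld;
        this.vNew = vNew;
        this.classNames = classNames;
    }
}

final class PairTableSketch {
    // Returns, in table order, every row <vOi, vNi> with vX < vOi < vNi < vY.
    // The table is assumed to be listed so that vO1 < vN1 < vO2 < vN2 ...
    static List<UpgradeRowSketch> select(List<UpgradeRowSketch> table, int vX, int vY) {
        List<UpgradeRowSketch> plan = new ArrayList<>();
        for (UpgradeRowSketch row : table) {
            if (vX < row.vOld && row.vOld < row.vNew && row.vNew < vY) {
                plan.add(row);
            }
        }
        return plan;
    }
}
```

Each selected row names the name-node and data-node upgrade objects that would run together as one DU.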

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.