hadoop-hdfs-dev mailing list archives
From "Gangumalla, Uma" <uma.ganguma...@intel.com>
Subject Re: [DISCUSS] Merge Storage Policy Satisfier (SPS) [HDFS-10285] feature branch to trunk
Date Fri, 28 Jul 2017 06:46:42 GMT
Hi Andrew, Thanks a lot for reviewing.

Your understanding of the two factors is right. More than 90% of the code is newly added, and only a small portion of existing code was touched, mainly for the NN RPCs and DN messages. You can see that in the combined patch stats (only 45 lines with "-").

> If there are still plans to make changes that affect compatibility (the hybrid RPC and
bulk DN work mentioned sound like they would), then we can cut branch-3 first, or wait to
merge until after these tasks are finished.
[Uma] We don’t see those two items as high priority for the feature. Users can already use the feature with the current code base and API, so we would take them up only after branch-3 is cut. That should be perfectly fine IMO. The current API is very useful for the HBase scenario: HBase renames files into a directory that already carries a different storage policy and does not usually set policies itself. After such a rename it can simply call satisfyStoragePolicy on the renamed path, so it doesn’t need any hybrid API.
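To make that workflow concrete, here is a minimal sketch, assuming the dfs#satisfyStoragePolicy(Path) API described in the design document; the paths here are purely illustrative, not taken from HBase itself.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class SatisfyAfterRename {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumes fs.defaultFS points at an HDFS cluster with SPS enabled.
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

    // Hypothetical paths: the archive directory already carries a colder
    // storage policy, so no setStoragePolicy call is needed on the file.
    Path src = new Path("/hbase/data/ns/table/cf/hfile-0001");
    Path dst = new Path("/hbase/archive/ns/table/cf/hfile-0001");
    dfs.rename(src, dst);

    // One extra call asks the Namenode to schedule block movements so the
    // existing replicas actually match the destination directory's policy.
    dfs.satisfyStoragePolicy(dst);
  }
}
```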

>* Possible impact when this feature is disabled
[Uma] On this point, I want to highlight that the feature supports dynamic activation and deactivation: it can be enabled or disabled without restarting the Namenode. If the feature is disabled, there should be zero impact; since enabling is dynamic, we do not even initialize the satisfier threads while it is disabled, and the service is initialized only when the feature is enabled. For easy review, please look at the last section in this documentation: ArchivalStorage.html<https://issues.apache.org/jira/secure/attachment/12877327/ArchivalStorage.html>
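The following is not the actual SPS code, just a minimal sketch of the behaviour described above: no worker threads exist while the feature is off, and flipping the flag at runtime starts or stops the service without a Namenode restart.

```java
// Illustrative only -- not the real SPS implementation. It mirrors the
// description above: nothing is initialized while the feature is disabled,
// and toggling the flag at runtime starts/stops the worker without a
// Namenode restart.
public class TogglableSatisfierService {
  private Thread worker;  // exists only while the feature is enabled

  // Invoked whenever the admin flips the enable flag at runtime.
  public synchronized void onEnableFlagChanged(boolean enabled) {
    if (enabled && worker == null) {
      worker = new Thread(this::run, "sps-worker");
      worker.setDaemon(true);
      worker.start();            // initialized only on demand
    } else if (!enabled && worker != null) {
      worker.interrupt();        // shut down; zero footprint again
      worker = null;
    }
  }

  private void run() {
    while (!Thread.currentThread().isInterrupted()) {
      // ... process pending satisfyStoragePolicy requests ...
      try {
        Thread.sleep(1000L);
      } catch (InterruptedException e) {
        return;                  // disabled: exit promptly
      }
    }
  }
}
```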

Also, the tiered storage + HDFS mounts work wants to use the SPS feature: https://issues.apache.org/jira/browse/HDFS-12090. So, having SPS upstream would allow that dependent feature to proceed. (I don’t say we have to merge because of this reason alone; I just mention it as an endorsement of the feature. :-) )

Regards,
Uma

From: Andrew Wang <andrew.wang@cloudera.com>
Date: Thursday, July 27, 2017 at 12:15 PM
To: Uma Gangumalla <uma.gangumalla@intel.com>
Cc: "hdfs-dev@hadoop.apache.org" <hdfs-dev@hadoop.apache.org>
Subject: Re: [DISCUSS] Merge Storage Policy Satisfier (SPS) [HDFS-10285] feature branch to trunk

Hi Uma, Rakesh,

First off, I like the idea of this feature. It'll definitely make HSM easier to use.

With my RM hat on, I gave the patch a quick skim looking for:

* Possible impact when this feature is disabled
* API stability and other compat concerns

At a high level, it looks like it uses xattrs rather than new edit log ops to track files being moved. Some new NN RPCs and DN messages are added to interact with the feature. It's almost entirely new code that doesn't modify the guts of HDFS much.

Could you comment further on these two concerns? We're closing in on 3.0.0-beta1, so the merge
of any large amount of new code makes me wary. If there are still plans to make changes that
affect compatibility (the hybrid RPC and bulk DN work mentioned sound like they would), then
we can cut branch-3 first, or wait to merge until after these tasks are finished.

Best,
Andrew



On Mon, Jul 24, 2017 at 11:35 PM, Gangumalla, Uma <uma.gangumalla@intel.com> wrote:
Dear All,

I would like to propose merging the Storage Policy Satisfier (SPS) feature branch into trunk. We have been working on this feature for the last several months, and it has received contributions from different companies. All of the feature development happened smoothly and collaboratively in JIRAs.

Detailed design document is available in JIRA: Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf<https://issues.apache.org/jira/secure/attachment/12873642/Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf>
Test report attached to JIRA: HDFS-SPS-TestReport-20170708.pdf<https://issues.apache.org/jira/secure/attachment/12876256/HDFS-SPS-TestReport-20170708.pdf>

Short description of the feature:
   The Storage Policy Satisfier feature aims to let distributed HDFS applications schedule block movements easily.
   When a storage policy change happens, the user can invoke the satisfyStoragePolicy API to trigger the block storage movements.
   Block movement tasks are assigned to datanodes, and the movements happen in a distributed fashion.
   Block-level movement tracking is also distributed to the DNs to avoid load on the Namenode.
   A co-ordinator Datanode tracks all the blocks associated with a blockCollection and sends the consolidated final result to the Namenode.
   If the movement result is a failure, the Namenode re-schedules the block movements.
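For reviewers who have not gone through the design document, here is a hedged sketch of the end-to-end usage described above (API names follow the design doc; the path and policy are illustrative). The client only makes the two calls shown; the Namenode hands the movement tasks to datanodes, a co-ordinator DN reports the consolidated result back, and the Namenode re-schedules any failed movements.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class SatisfyPolicyExample {
  public static void main(String[] args) throws Exception {
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());

    Path dir = new Path("/data/archive-2017");   // illustrative path

    // 1) Record the desired policy on the directory. Today this is its own
    //    RPC; future-work item 1 below proposes a hybrid set-and-satisfy call.
    dfs.setStoragePolicy(dir, "COLD");

    // 2) Ask the Namenode to satisfy the policy. The actual block moves are
    //    carried out and tracked by datanodes, as described above.
    dfs.satisfyStoragePolicy(dir);
  }
}
```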

Development branch: HDFS-10285
No. of JIRAs resolved: 38
Pending JIRAs: 4 (I don’t think they are blockers for the merge)

We have posted a combined patch for easy merge review, and the Jenkins test results on the combined patch look good.
Quick stats on the combined patch:
  67 files changed, 7001 insertions(+), 45 deletions(-)
  Added/modified test cases: ~70


Thanks to all the contributors, namely Andrew Wang, Anoop Sam John, Du Jingcheng, Ewan Higgs, Jing Zhao, Kai Zheng, Rakesh R, Ramakrishna, Surendra Singh Lilhore, Uma Maheswara Rao G, Wei Zhou, and Yuanbo Liu. Without their efforts, this feature might not have reached this state.

We will continue to work on the following items:

  1.  Presently the user has to set the policy and satisfy it in separate RPC calls. The idea is to provide a hybrid API, dfs#setStoragePolicy(src, policy), which does both set and satisfy in one RPC call to the Namenode (reference: HDFS-11669).
  2.  Presently BlockStorageMovementCommand sends all the blocks under a trackID in a single heartbeat response. If there are many blocks under a given trackID (for example, a file with many blocks), that bulk information goes to the DN in one network call and carries a lot of overhead. One idea is to send smaller batches of BlockMovingInfo in the block storage movement command (reference: HDFS-11125).
  3.  Build a mechanism to throttle the number of concurrent moves at the datanode (a rough sketch of the idea follows after this list).
  4.  Allow specifying an initial delay, in seconds, before the source file is scheduled for satisfying the storage policy. For example, in HBase the interval between archiving a file (moving it between different storages) and deleting it is not large; in that case it may not be necessary to schedule the satisfy-policy task immediately.
  5.  Cover SPS-related metrics.
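As mentioned in item 3, below is a rough, hypothetical sketch of what datanode-side throttling could look like; the class, the limit, and the use of a fixed-size executor are my assumptions, not committed code. A bounded pool caps how many block moves run at once, while additional move tasks simply wait in the queue.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration of future-work item 3, not actual SPS code.
public class BlockMoveThrottler {
  // Cap on simultaneous block moves at this datanode (value is an assumption).
  private static final int MAX_CONCURRENT_MOVES = 5;

  private final ExecutorService movers =
      Executors.newFixedThreadPool(MAX_CONCURRENT_MOVES);

  // Queue the move tasks; at most MAX_CONCURRENT_MOVES run at any time,
  // the rest wait in the executor's queue.
  public void scheduleMoves(List<Runnable> blockMoveTasks) {
    for (Runnable task : blockMoveTasks) {
      movers.submit(task);
    }
  }

  public void shutdown() {
    movers.shutdown();
  }
}
```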

So, I feel this branch is ready to be merged into trunk. Please provide your feedback. If there are no objections, I will proceed with the vote.

Regards,
Uma & Rakesh

