hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rakesh R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-13025) [SPS]: Implement a mechanism to scan the files for external SPS
Date Mon, 22 Jan 2018 13:00:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-13025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334224#comment-16334224

Rakesh R commented on HDFS-13025:

Thanks [~umamaheswararao] for the quick turnaround. {{TestExternalStoragePolicySatisfier}}
unit testing idea is awesome. Added few comments, please take care.

# Please add few details about the reason to ignore like, {{@Ignore("reason why this test
is skipping")}}
# Add javadoc for {{boolean scanCompleted}} BlockStorageMovementNeeded#addAll() and BlockStorageMovementNeeded#add()
methods. Aslo, please remove unwanted {{@param startId}} from BlockStorageMovementNeeded#add()
# Can we avoid duplicate codes. How about create a method and reuse like below, 
  // add javadoc
  public synchronized void addAll(long startId,
      List<ItemInfo> itemInfoList, boolean scanCompleted) {
    add(startId, scanCompleted);

  // add javadoc
  public synchronized void add(ItemInfo itemInfo, boolean scanCompleted) {
    add(itemInfo.getStartId(), scanCompleted);

  private void add(long startId, boolean scanCompleted) {
    DirPendingWorkInfo pendingWork =
    if (pendingWork == null) {
      pendingWork = new DirPendingWorkInfo();
      pendingWorkForDirectory.put(startId, pendingWork);
    if (scanCompleted) {
# Below {{service.addAllFileIdsToProcess}} call is going to {{BlockStorageMovementNeeded#addAll}}
then add an entry to {{pendingWorkForDirectory.put(startId, pendingWork);}}. If the InodeId
is a file, should we need to add into the {{pendingWorkForDirectory}} and mark as scanned?
         ...// other logic
		 ...// other logic
         service.addAllFileIdsToProcess(startInode.getId(), currentBatch, true);
# Please add annotation {{@InterfaceAudience.Private}} to IntraSPSNameNodeFileIdCollector,
ExternalSPSFileIDCollector & other missing classes, if any.
# If {{dfs}} is not created then skip #processPath and #scanAndCollectFileIds to avoid NPE.
I remember we discussed about retries, probably we could add null check {{dfs != null}} before
using dfs. // TODO about retry mechanism now as a reminder and later we could implement this
post branch merge.

LOG.error("Unable to get the filesystem. Make sure Namenode running and "
          + "configured namenode address is correct.", e);
# Please add debug log message here saying, no children for this directory.
ExternalSPSFileIDCollector .java
85	      if (children == null) {
86	        return;
87	      }

> [SPS]: Implement a mechanism to scan the files for external SPS
> ---------------------------------------------------------------
>                 Key: HDFS-13025
>                 URL: https://issues.apache.org/jira/browse/HDFS-13025
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>            Priority: Major
>         Attachments: HDFS-13025-HDFS-10285-0.patch, HDFS-13025-HDFS-10285-1.patch
> HDFS-12911 modularization is introducing FileIDCollector interface for scanning the
files. That will help us to plugin different ways of scanning mechanisms if needed.
> For Internal SPS, we have INode based scanning. For external SPS, we should go via client
API scanning as we can not access NN internal structures. 
> This is the task to implement the scanning plugin for external SPS.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

View raw message