hadoop-hdfs-issues mailing list archives

From "Yiqun Lin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-13250) RBF: Router to manage requests across multiple subclusters
Date Fri, 16 Mar 2018 08:21:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-13250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401584#comment-16401584 ]

Yiqun Lin commented on HDFS-13250:

Thanks [~elgoiri] for updating the patch!

I didn't quite get the comment about getFileInfoAll(), we first check for the number of directories
and then we check for the files separately, do you mean in 1165 to check if it's a file?
I was just thinking about the case in {{getFileInfoAll}} where not all of them are directories. If
some of them are directories and others are files, the current logic will return null. But
it could actually still return the first file. The logic can be changed to the following:
  private HdfsFileStatus getFileInfoAll(final List<RemoteLocation> locations,
      final RemoteMethod method) throws IOException {

    // Get the file info from everybody
    Map<RemoteLocation, HdfsFileStatus> results =
        rpcClient.invokeConcurrent(locations, method, HdfsFileStatus.class);

    // Check how many subclusters have the file and how many are directories
    // (the counting loop is not shown in the comment; assumed to look like this)
    int numDirs = 0;
    for (RemoteLocation loc : locations) {
      HdfsFileStatus fileStatus = results.get(loc);
      if (fileStatus != null && fileStatus.isDirectory()) {
        numDirs++;
      }
    }

    // If none is a directory or all of them are directories, return the first
    if (numDirs == 0 || numDirs == locations.size()) {
      for (RemoteLocation loc : locations) {
        HdfsFileStatus fileStatus = results.get(loc);
        if (fileStatus != null) {
          return fileStatus;
        }
      }
    } else {
      // If some are directories and others are files, return the first file
      for (RemoteLocation loc : locations) {
        HdfsFileStatus fileStatus = results.get(loc);
        // Skip subclusters that do not have the path at all
        if (fileStatus != null && fileStatus.isFile()) {
          return fileStatus;
        }
      }
    }
    return null;
  }
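
To make the proposed selection rule easy to try outside the Router, here is a minimal, self-contained sketch. It is not the patch code: {{Status}} is a hypothetical stand-in for HdfsFileStatus, subclusters are plain strings instead of RemoteLocation, and a missing map entry models a subcluster that does not have the path. The class and method names are invented for illustration only.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FileInfoSelectionSketch {

  /** Hypothetical stand-in for HdfsFileStatus: a name plus a directory flag. */
  static final class Status {
    final String name;
    final boolean directory;

    Status(String name, boolean directory) {
      this.name = name;
      this.directory = directory;
    }

    boolean isDirectory() { return directory; }
    boolean isFile() { return !directory; }
  }

  /**
   * Picks one status from the per-subcluster results, in location order.
   * A missing map entry models a subcluster that does not have the path.
   */
  static Status pickStatus(List<String> locations, Map<String, Status> results) {
    // Count how many subclusters report a directory.
    int numDirs = 0;
    for (String loc : locations) {
      Status s = results.get(loc);
      if (s != null && s.isDirectory()) {
        numDirs++;
      }
    }

    // "Mixed" means some, but not all, locations reported a directory.
    boolean mixed = numDirs > 0 && numDirs < locations.size();
    for (String loc : locations) {
      Status s = results.get(loc);
      if (s == null) {
        continue; // this subcluster does not have the path
      }
      // Not mixed: return the first result. Mixed: return the first file.
      if (!mixed || s.isFile()) {
        return s;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    List<String> locations = Arrays.asList("ns0", "ns1", "ns2");
    Map<String, Status> results = new HashMap<>();
    results.put("ns0", new Status("dir-in-ns0", true));
    results.put("ns1", new Status("file-in-ns1", false));
    // ns2 has no entry: the path does not exist there.

    Status picked = pickStatus(locations, results);
    System.out.println(picked == null ? "null" : picked.name);
  }
}
```

One thing the sketch makes visible: because numDirs is compared against locations.size() rather than against the number of non-null results, a directory that exists in only some subclusters also takes the mixed branch and yields null. Whether that is the desired behavior is a question for the patch, not something this sketch settles.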

Others look good to me.

> RBF: Router to manage requests across multiple subclusters
> ----------------------------------------------------------
>                 Key: HDFS-13250
>                 URL: https://issues.apache.org/jira/browse/HDFS-13250
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Íñigo Goiri
>            Assignee: Íñigo Goiri
>            Priority: Major
>         Attachments: HDFS-13250.000.patch, HDFS-13250.001.patch, HDFS-13250.002.patch
> HDFS-13124 introduces the concept of mount points spanning multiple subclusters. The
> Router should distribute the requests across these subclusters.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org
