hadoop-hdfs-issues mailing list archives

From "Tsz Wo Nicholas Sze (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9441) Do not construct path string when choosing block placement targets
Date Mon, 07 Dec 2015 22:38:11 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15045904#comment-15045904 ]

Tsz Wo Nicholas Sze commented on HDFS-9441:
-------------------------------------------

> ... relying on BlockCollection.toString() to return a path, will cement toString() as
> a formal API for placement. ...

We are not going to change toString(), which currently returns the local name for an INode, to
return the full path.  Block placement implementations need to call getName() if they want
the full path of a BlockCollection.

A better API would be to change the type of src to a new interface (let's call it GetName for
this discussion).
{code}
  interface GetName {
    String getName();
  }
{code}
Then we can pass either a String or a BlockCollection by wrapping it in an anonymous class object, i.e.
- String
{code}
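  // Placeholder value; src must be final (or effectively final) and
  // assigned before the anonymous class can capture it.
  final String src = "/some/path";
  GetName name = new GetName() {
    @Override
    public String getName() {
      return src;
    }
  };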
{code}
- BlockCollection
{code}
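  // bc is assumed to be an already-initialized BlockCollection (e.g. a final
  // method parameter); getName() is only evaluated if a policy asks for it.
  final BlockCollection bc;
  GetName name = new GetName() {
    @Override
    public String getName() {
      return bc.getName();
    }
  };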
{code}
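
To make the intent of the wrappers above concrete, here is a minimal self-contained sketch. It is not code from the patch: FakeBlockCollection, wrap(), defaultPolicy() and pathAwarePolicy() are hypothetical stand-ins invented for illustration, and the counter exists only to show when the path string is actually built.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class GetNameSketch {
  /** The interface proposed in this comment. */
  interface GetName {
    String getName();
  }

  /** Hypothetical stand-in for BlockCollection; counts how often the path is built. */
  static final class FakeBlockCollection {
    private final String[] components;
    final AtomicInteger pathBuilds = new AtomicInteger();

    FakeBlockCollection(String... components) {
      this.components = components;
    }

    /** Builds the full path on every call, mimicking an expensive getName(). */
    String getName() {
      pathBuilds.incrementAndGet();
      return "/" + String.join("/", components);
    }
  }

  /** Wraps an already-computed path. */
  static GetName wrap(final String src) {
    return new GetName() {
      @Override
      public String getName() {
        return src;
      }
    };
  }

  /** Wraps a block collection; the path is built only when getName() is called. */
  static GetName wrap(final FakeBlockCollection bc) {
    return new GetName() {
      @Override
      public String getName() {
        return bc.getName();
      }
    };
  }

  /** Hypothetical policy that never looks at the path. */
  static String defaultPolicy(GetName src) {
    return "DEFAULT";
  }

  /** Hypothetical policy that does need the full path. */
  static String pathAwarePolicy(GetName src) {
    return src.getName().startsWith("/hot/") ? "SSD" : "DISK";
  }

  public static void main(String[] args) {
    FakeBlockCollection bc = new FakeBlockCollection("hot", "part-0");
    GetName name = wrap(bc);
    System.out.println(defaultPolicy(name) + " builds=" + bc.pathBuilds.get());
    System.out.println(pathAwarePolicy(name) + " builds=" + bc.pathBuilds.get());
    System.out.println(pathAwarePolicy(wrap("/cold/part-1")));
  }
}
```

In this sketch the policy that ignores its argument never triggers path construction, while a policy that needs the path pays for it only at the point of use.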

However, this seems like overdesign.

> Do not construct path string when choosing block placement targets
> ------------------------------------------------------------------
>
>                 Key: HDFS-9441
>                 URL: https://issues.apache.org/jira/browse/HDFS-9441
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Tsz Wo Nicholas Sze
>            Priority: Minor
>         Attachments: h9441_20151118.patch, h9441_20151119.patch
>
>
> - INodeFile.getName() is expensive since it involves quite a few string operations.
> The method is called in both ReplicationWork and ErasureCodingWork but the default BlockPlacementPolicy
> does not use the returned string.  We should simply pass BlockCollection to reduce unnecessary
> computation when using the default BlockPlacementPolicy.
> - Another improvement: the return type of FSNamesystem.getBlockCollection should be changed
> to INodeFile since it always returns an INodeFile object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
