hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-2002) Eliminate redundant searches in the namespace directory tree.
Date Sat, 06 Oct 2007 01:54:51 GMT
Eliminate redundant searches in the namespace directory tree.
-------------------------------------------------------------

                 Key: HADOOP-2002
                 URL: https://issues.apache.org/jira/browse/HADOOP-2002
             Project: Hadoop
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.13.0
            Reporter: Konstantin Shvachko
             Fix For: 0.16.0


There is no need to look up the same INode multiple times within a single name-node operation.
For example, in FSNamesystem.exists():
{code}
  public boolean exists(String src) {
    if (dir.getFileBlocks(src) != null || dir.isDir(src)) {
      return true;
    } else {
      return false;
    }
  }
{code}
Both getFileBlocks() and isDir() call rootDir.getNode(src) internally, which causes two separate
lookups in the directory tree where one is enough.
Why not check whether the inode is a directory and whether it has blocks at the same
time?
Other methods do the same thing:
- completeFile() - calls getINode at least 3 times in different places.
- getAdditionalBlock() - 2 getINode calls.
- startFile() - I counted 5 calls and may have missed some.
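
A minimal sketch of the single-lookup form of exists() follows. It is an illustration only, not the actual Hadoop code: SketchINode and SketchNamespace are hypothetical stand-ins, a flat map replaces the directory tree, and the lookups counter exists only to make the number of searches visible.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for an inode; not the real Hadoop INode.
class SketchINode {
    private final boolean directory;
    private final long[] blocks; // null for directories in this sketch

    SketchINode(boolean directory, long[] blocks) {
        this.directory = directory;
        this.blocks = blocks;
    }

    boolean isDir() { return directory; }
    long[] getBlocks() { return blocks; }
}

// Hypothetical stand-in for the namespace; a flat map replaces the directory tree.
class SketchNamespace {
    private final Map<String, SketchINode> nodes = new HashMap<>();
    int lookups = 0; // counts searches so the saving is visible

    void put(String src, SketchINode node) { nodes.put(src, node); }

    // Assumed to play the role of rootDir.getNode(src): one search per call.
    SketchINode getNode(String src) {
        lookups++;
        return nodes.get(src);
    }

    // One lookup, then both checks on the node that was found.
    synchronized boolean exists(String src) {
        SketchINode node = getNode(src);
        return node != null && (node.isDir() || node.getBlocks() != null);
    }
}
```

In this sketch every call to exists() searches exactly once, whether the path is a file, a directory, or missing.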

To prevent this, all methods below the top level should be defined in terms of INode
parameters rather than path names.
E.g., all FSDirectory methods should take an INode as a parameter, not a String.
We should be careful, though, not to use an INode across separate synchronized sections.
Once the lock is released, the INode should be looked up by path again.
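
The proposal could look roughly like the sketch below. All names here (FileNode, SketchDirectory, SketchNameSystem, addBlockAndComplete, isClosed) are hypothetical and heavily simplified; they are assumptions for illustration, not the real FSDirectory or FSNamesystem API. The top-level operation resolves the path once while holding the lock and passes the node down; a later operation, in a separate synchronized section, resolves the path again instead of reusing the old reference.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical file inode with just enough state for the sketch.
class FileNode {
    boolean underConstruction = true;
    int blockCount = 0;
}

// Hypothetical directory layer: only getNode() takes a path.
class SketchDirectory {
    private final Map<String, FileNode> tree = new HashMap<>();
    int lookups = 0; // counts path resolutions

    void addFile(String src) { tree.put(src, new FileNode()); }
    void remove(String src) { tree.remove(src); }

    FileNode getNode(String src) {
        lookups++;
        return tree.get(src);
    }

    // Lower-level methods take the already resolved node, not the path.
    void addBlock(FileNode node) { node.blockCount++; }
    void close(FileNode node) { node.underConstruction = false; }
}

// Hypothetical top level: the only place where paths are resolved.
class SketchNameSystem {
    final SketchDirectory dir = new SketchDirectory();

    // One synchronized section, one lookup, node passed to every helper.
    synchronized boolean addBlockAndComplete(String src) {
        FileNode node = dir.getNode(src);
        if (node == null || !node.underConstruction) {
            return false;
        }
        dir.addBlock(node);
        dir.close(node);
        return true;
    }

    // A separate synchronized section: the node is resolved by path again,
    // because the file may have been removed since the lock was released.
    synchronized boolean isClosed(String src) {
        FileNode node = dir.getNode(src);
        return node != null && !node.underConstruction;
    }
}
```

The node reference never leaves the synchronized method that resolved it, so a file removed between two operations is noticed by the second lookup rather than being accessed through a stale INode.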

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

