hadoop-common-dev mailing list archives

From "Vishwajeet Dusane (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-12876) [Azure Data Lake] Support for process level FileStatus cache to optimize frequent GetFileStatus operations
Date Thu, 03 Mar 2016 11:24:18 GMT
Vishwajeet Dusane created HADOOP-12876:
------------------------------------------

             Summary: [Azure Data Lake] Support for process level FileStatus cache to optimize
frequent GetFileStatus operations
                 Key: HADOOP-12876
                 URL: https://issues.apache.org/jira/browse/HADOOP-12876
             Project: Hadoop Common
          Issue Type: New Feature
          Components: fs, fs/azure, tools
            Reporter: Vishwajeet Dusane
            Assignee: Vishwajeet Dusane


Add support for caching GetFileStatus and ListStatus responses locally for a limited period of time.
A short-lived local cache would reduce the number of remote GetFileStatus calls.
One example where such a cache would be useful is terasort: a ListStatus on the input
directory is followed by a GetFileStatus operation on each file within that directory. For a
directory with 2048 input files, caching the FileStatus instances from the ListStatus response
would save 2048 GetFileStatus calls during startup.
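
The idea above can be illustrated with a minimal sketch. This is not the proposed patch: the
class name StatusCache, its methods, the injected clock, and the use of plain String paths are
all assumptions made for illustration, and the sketch deliberately avoids any Hadoop types so
that it stays self-contained. It shows a process-level map with a time-to-live that is seeded
from a directory listing and falls back to a remote lookup only on a miss or an expired entry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/**
 * Hypothetical process-level status cache with a fixed time-to-live.
 * S stands in for a FileStatus-like type; paths are plain strings here.
 */
final class StatusCache<S> {
    /** A cached status plus the time at which it stops being valid. */
    private static final class Entry<S> {
        final S status;
        final long expiresAtMillis;

        Entry(S status, long expiresAtMillis) {
            this.status = status;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry<S>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so expiry can be tested

    StatusCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Seeds the cache from a listing result (path -> status). */
    void putAll(Map<String, S> listed) {
        long expiresAt = clock.getAsLong() + ttlMillis;
        listed.forEach((path, status) ->
            entries.put(path, new Entry<>(status, expiresAt)));
    }

    /**
     * Returns the cached status if it is still fresh; otherwise calls the
     * supplied remote lookup, caches its result, and returns it.
     */
    S get(String path, Function<String, S> remoteLookup) {
        long now = clock.getAsLong();
        Entry<S> cached = entries.get(path);
        if (cached != null && now < cached.expiresAtMillis) {
            return cached.status;
        }
        S fresh = remoteLookup.apply(path);
        entries.put(path, new Entry<>(fresh, now + ttlMillis));
        return fresh;
    }

    /** Drops an entry, e.g. after a local write, rename, or delete. */
    void invalidate(String path) {
        entries.remove(path);
    }
}
```

In the terasort case, a caller would pass the ListStatus result to putAll once, after which the
per-file get calls for that directory are served locally until the entries expire. A short TTL
limits how long a stale status can be returned when another process changes the file.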



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
