hadoop-common-dev mailing list archives

From Eric Baldeschwieler <eri...@yahoo-inc.com>
Subject Re: [jira] Created: (HADOOP-713) dfs list operation is too expensive
Date Wed, 15 Nov 2006 03:09:32 GMT
So let's display nothing for now and revisit this once we have a  
cleaner CRC story.


On Nov 14, 2006, at 10:55 AM, Hairong Kuang wrote:

> Setting the size of a directory to be the # of files is a good idea. But the
> problem is that the dfs name node has no idea of checksum files, so the number
> of files it counts includes the checksum files. What's displayed at the client
> side has the checksum files filtered out, so the # of files does not match
> what's really displayed at the client side.
>
> Hairong
>
> -----Original Message-----
> From: Arkady Borkovsky [mailto:arkady@yahoo-inc.com]
> Sent: Monday, November 13, 2006 5:07 PM
> To: hadoop-dev@lucene.apache.org
> Subject: Re: [jira] Created: (HADOOP-713) dfs list operation is too
> expensive
>
> When listing a directory, for directory entries it may be more useful to
> display the number of files in a directory, rather than the number of bytes
> used by all the files in the directory and its subdirectories.
> This is a subjective opinion -- comments?
>
> (Currently, the value displayed for a subdirectory is "0")
>
> On Nov 13, 2006, at 3:25 PM, Hairong Kuang (JIRA) wrote:
>
>> dfs list operation is too expensive
>> -----------------------------------
>>
>>                  Key: HADOOP-713
>>                  URL: http://issues.apache.org/jira/browse/HADOOP-713
>>              Project: Hadoop
>>           Issue Type: Improvement
>>           Components: dfs
>>     Affects Versions: 0.8.0
>>             Reporter: Hairong Kuang
>>
>>
>> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo
>> of a directory contains a field called contentsLen, indicating its
>> size, which is computed on the namenode side by recursively going
>> through its subdirs. While this happens, the whole dfs directory tree is
>> locked.
>>
>> The list operation is used a lot by DFSClient for listing a directory,
>> getting a file's size and # of replicas, and getting the size of dfs.
>> Only the last operation needs the field contentsLen to be computed.
>>
>> To reduce its cost, we can add a flag to the list request. contentsLen
>> is computed only if the flag is set. By default, the flag is false.
>>
>
>
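The flag proposed in the HADOOP-713 description can be sketched roughly as below. This is only an illustration over a toy in-memory tree, not the DFS code: Node, FileInfo, ListSketch, the computeContentsLen parameter, and the -1 "not computed" value are all hypothetical names and choices made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an entry in the namenode's directory tree.
class Node {
    final String name;
    final boolean isDir;
    final long length; // file length in bytes; unused for directories
    final List<Node> children = new ArrayList<>();

    Node(String name, boolean isDir, long length) {
        this.name = name;
        this.isDir = isDir;
        this.length = length;
    }

    Node add(Node child) {
        children.add(child);
        return this;
    }

    // The expensive part: a recursive walk over the whole subtree.
    long contentsLen() {
        if (!isDir) {
            return length;
        }
        long total = 0;
        for (Node child : children) {
            total += child.contentsLen();
        }
        return total;
    }
}

// Hypothetical counterpart of the per-entry result returned by a list request.
class FileInfo {
    final String name;
    final boolean isDir;
    final long contentsLen; // -1 means "not computed" (an assumption of this sketch)

    FileInfo(String name, boolean isDir, long contentsLen) {
        this.name = name;
        this.isDir = isDir;
        this.contentsLen = contentsLen;
    }
}

class ListSketch {
    // Lists the direct children of dir. Directory sizes are computed only
    // when computeContentsLen is true; file lengths are always cheap to return.
    static List<FileInfo> list(Node dir, boolean computeContentsLen) {
        List<FileInfo> out = new ArrayList<>();
        for (Node child : dir.children) {
            long len;
            if (!child.isDir) {
                len = child.length;
            } else if (computeContentsLen) {
                len = child.contentsLen();
            } else {
                len = -1; // skip the recursive walk entirely
            }
            out.add(new FileInfo(child.name, child.isDir, len));
        }
        return out;
    }

    public static void main(String[] args) {
        Node root = new Node("root", true, 0)
            .add(new Node("a.txt", false, 10))
            .add(new Node("sub", true, 0)
                .add(new Node("b.txt", false, 5))
                .add(new Node("c.txt", false, 7)));
        for (FileInfo info : list(root, false)) {
            System.out.println(info.name + " " + info.contentsLen);
        }
    }
}
```

With the flag left at its default of false, a plain listing never triggers the recursive walk, so the time spent holding the tree lock would no longer depend on the size of the subtrees being listed.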


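The count mismatch Hairong describes can be shown with a small example. It assumes checksum files are named "." + file name + ".crc"; that naming rule and the ChecksumCountSketch class are assumptions made for illustration, not taken from this thread.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ChecksumCountSketch {
    // Assumed naming convention for checksum files: ".<name>.crc".
    static boolean isChecksumFile(String name) {
        return name.startsWith(".") && name.endsWith(".crc");
    }

    // What a name node unaware of checksum files would count.
    static int rawCount(List<String> names) {
        return names.size();
    }

    // What the client shows after filtering out checksum files.
    static List<String> clientView(List<String> names) {
        return names.stream()
            .filter(n -> !isChecksumFile(n))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "part-0", ".part-0.crc", "part-1", ".part-1.crc");
        System.out.println("name node count: " + rawCount(names));
        System.out.println("client listing:  " + clientView(names));
    }
}
```

Under that assumption, a directory holding two data files and their two checksum files would be reported with a count of four while the client lists only two entries, which is the discrepancy that motivates displaying nothing for now.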