hadoop-common-dev mailing list archives

From Konstantin Shvachko <...@yahoo-inc.com>
Subject Re: C API for Hadoop DFS
Date Fri, 28 Apr 2006 17:35:20 GMT

>> I was thinking that we really don't require the mCreationTime field in the
>> dfsFileInfo struct and instead have mModificationTime. For now,
>> mModificationTime is really the time at which the file got created.
> We can always grow the API later, but it's hard to ever shrink it.  So 
> I'd go with the minimum for now and only include a single time.  I'd 
> suspect application code more often requires a modification time, so 
> let's call it mModificationTime for now.  Does that make sense?
-1 on replacing mCreationTime with mModificationTime.
Supporting modification and access times is hard. Each time a file is
modified in a directory, the directory's (and all of its parents')
modification/access times should be updated as well, which becomes a
bottleneck with concurrent access to the namespace.
There should be a very good reason for supporting modification times.
I do not see one.
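
A minimal sketch of the cost, assuming a hypothetical in-memory namespace
tree. The Node type and touch_with_parents() are made up for illustration
and are not actual DFS code. A single file modification turns into one write
per path component, and every concurrent writer ends up touching the same
nodes near the root:

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical namespace node; not the real namenode data structure. */
typedef struct Node {
    struct Node *parent;  /* NULL for the root directory */
    time_t       mtime;   /* last modification time */
} Node;

/*
 * Set the modification time on a node and on every ancestor up to the root.
 * Returns the number of nodes written, which equals the depth of the path.
 * In a concurrent namespace each of these writes would need synchronization,
 * and the root is written on every call.
 */
static int touch_with_parents(Node *n, time_t now)
{
    int writes = 0;
    for (; n != NULL; n = n->parent) {
        n->mtime = now;
        writes++;
    }
    return writes;
}
```

For a file three directories below the root, one call writes four nodes:
the file, its two intermediate directories, and the root.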

>> We can have mBlockSize too in the dfsFileInfo struct. This is the value
>> returned by the getBlockSize API considering the current impl of DFS.
>> But I observed from the mail exchanges that blocksize could be variable
>> in the future.
> The current implementation internally supports variable block sizes 
> within a file, but, for performance, we'll probably move towards a 
> fixed block size for an entire file system.  Again, I think we should 
> aim for the minimal API.  So I'd be happy with just a global per-FS 
> blocksize in the API.  We can always later add per-file blocksizes if 
> we need these, and then interpret the FS blocksize as a default.  
> Application uses of blocksize will be to optimize computations, so if, 
> e.g., a legacy application someday uses the global FS blocksize when 
> per-file blocksizes are supported, then it will still run, just 
> perhaps more slowly when operating on files of non-default block size.
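
A rough sketch of the fallback described in the quoted text, under stated
assumptions: the dfsFS type, the field layout of dfsFileInfo, and both
helper functions are hypothetical and are not the proposed C API. A per-file
block size, if one is ever recorded, takes precedence; otherwise the global
FS value acts as the default:

```c
#include <stddef.h>

/* Hypothetical handle carrying the global per-FS block size. */
typedef struct {
    long long defaultBlockSize;  /* bytes */
} dfsFS;

/* Hypothetical layout; only the struct name comes from the thread. */
typedef struct {
    long long mSize;       /* file length in bytes */
    long long mBlockSize;  /* per-file block size; 0 means "not recorded" */
} dfsFileInfo;

/* Use the per-file block size when present, else the FS-wide default. */
static long long effectiveBlockSize(const dfsFS *fs, const dfsFileInfo *info)
{
    if (info != NULL && info->mBlockSize > 0)
        return info->mBlockSize;
    return fs->defaultBlockSize;
}

/* Number of blocks needed to hold fileSize bytes (ceiling division). */
static long long blockCount(long long fileSize, long long blockSize)
{
    return (fileSize + blockSize - 1) / blockSize;
}
```

A legacy caller that only ever reads defaultBlockSize keeps working under
this scheme; it just partitions its work less well for files whose block
size differs from the default.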

