hadoop-common-dev mailing list archives

From "Milind Bhandarkar (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-519) HDFS File API should be extended to include positional read
Date Mon, 18 Sep 2006 22:28:23 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-519?page=comments#action_12435595 ] 
            
Milind Bhandarkar commented on HADOOP-519:
------------------------------------------

>In DFSClient, you've duplicated a lot of code from blockSeekTo in fetchBlockByteRange.
>Can you perhaps instead add one or two more methods that capture this common code?

That's what I plan to do in a later patch, when I fix the read implementation and implement
pipelining in writes.

> The javadoc for read(long,byte[], int,int) should say "read up to" or "attempt to read",
>since it may not read all of the bytes (that's what readFully is for). 

Yes, I will make that modification.
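
To illustrate the distinction the reviewer draws, here is a minimal sketch of how readFully can be layered on top of a positional read that only promises "up to" length bytes. The names PositionalSource, ReadFullyHelper and ChunkedArraySource are hypothetical and are not part of the patch; the toy source deliberately returns at most 3 bytes per call so that short reads actually occur.

```java
import java.io.EOFException;
import java.io.IOException;

// Hypothetical stand-in for the proposed interface (illustration only).
interface PositionalSource {
    // Reads UP TO length bytes starting at position.
    // Returns the number of bytes read, or -1 if position is at or past the end.
    int read(long position, byte[] buffer, int offset, int length) throws IOException;
}

// Hypothetical helper: loops over short reads until the request is satisfied.
final class ReadFullyHelper {
    static void readFully(PositionalSource in, long position,
                          byte[] buffer, int offset, int length) throws IOException {
        int done = 0;
        while (done < length) {
            int n = in.read(position + done, buffer, offset + done, length - done);
            if (n < 0) {
                throw new EOFException("End of data reached after " + done + " bytes");
            }
            done += n;
        }
    }
}

// Toy in-memory source that never returns more than 3 bytes per call.
final class ChunkedArraySource implements PositionalSource {
    private final byte[] data;

    ChunkedArraySource(byte[] data) {
        this.data = data;
    }

    @Override
    public int read(long position, byte[] buffer, int offset, int length) {
        if (position >= data.length) {
            return -1;
        }
        int available = data.length - (int) position;
        int n = Math.min(3, Math.min(length, available));
        System.arraycopy(data, (int) position, buffer, offset, n);
        return n;
    }
}
```

With this toy source, a single read for 8 bytes returns only 3, whereas readFully either fills all 8 bytes or throws EOFException, which is why the javadoc for read should say "read up to".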

>The javadoc comments on the new FSInputStream methods do not add anything useful to
>what would be inherited from the interface, and what they do add makes them inappropriate
>for inheritance by subclasses. So these should be removed.

Do you mean that just the comments should be removed, or the methods as well? The methods cannot
be supplied by making PositionedReadable an abstract class, because FSDataInputStream already
extends an abstract class.

> HDFS File API should be extended to include positional read
> -----------------------------------------------------------
>
>                 Key: HADOOP-519
>                 URL: http://issues.apache.org/jira/browse/HADOOP-519
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.6.0
>         Environment: All
>            Reporter: Milind Bhandarkar
>         Assigned To: Milind Bhandarkar
>             Fix For: 0.7.0
>
>         Attachments: pread.patch
>
>
> HDFS input streams should support positional read. Positional read (such as the pread
syscall on Linux) allows reading from a specified offset without affecting the current file
offset. Since the underlying file state is not touched, pread can be used efficiently in multi-threaded
programs.
> Here is how I plan to implement it.
> Provide PositionedReadable interface, with the following methods:
> int read(long position, byte[] buffer, int offset, int length);
> void readFully(long position, byte[] buffer, int offset, int length);
> void readFully(long position, byte[] buffer);
> Abstract class FSInputStream would provide a default implementation of the above methods
using the getPos(), seek() and read() methods. The default implementation is inefficient in multi-threaded
programs since it locks the object while seeking, reading, and restoring the old position.
> DFSClient.DFSInputStream, which extends FSInputStream, will provide an efficient non-synchronized
implementation of the above calls.
> In addition, FSDataInputStream, which is a wrapper around FSInputStream, will provide
wrapper methods for the above read methods as well.
> Patch forthcoming early next week.
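
The default implementation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in pread.patch: SeekableStreamSketch is a hypothetical, simplified stand-in for FSInputStream, and ByteArrayStreamSketch is a toy in-memory subclass used only to exercise it.

```java
import java.io.IOException;

// Hypothetical, simplified stand-in for an abstract seekable input stream.
abstract class SeekableStreamSketch {
    abstract long getPos() throws IOException;
    abstract void seek(long pos) throws IOException;
    abstract int read(byte[] buffer, int offset, int length) throws IOException;

    // Default positional read built from getPos(), seek() and read().
    // The object is locked for the whole seek/read/restore sequence, which is
    // what makes this version a bottleneck when many threads share one stream.
    int read(long position, byte[] buffer, int offset, int length) throws IOException {
        synchronized (this) {
            long oldPos = getPos();
            try {
                seek(position);
                return read(buffer, offset, length);
            } finally {
                // Restore the original offset even if the read fails.
                seek(oldPos);
            }
        }
    }
}

// Toy in-memory stream used only to demonstrate the default method.
final class ByteArrayStreamSketch extends SeekableStreamSketch {
    private final byte[] data;
    private int pos;

    ByteArrayStreamSketch(byte[] data) {
        this.data = data;
    }

    @Override
    long getPos() {
        return pos;
    }

    @Override
    void seek(long newPos) throws IOException {
        if (newPos < 0 || newPos > data.length) {
            throw new IOException("Seek out of range: " + newPos);
        }
        pos = (int) newPos;
    }

    @Override
    int read(byte[] buffer, int offset, int length) {
        if (pos >= data.length) {
            return -1;
        }
        int n = Math.min(length, data.length - pos);
        System.arraycopy(data, pos, buffer, offset, n);
        pos += n;
        return n;
    }
}
```

After a positional read, getPos() reports the same offset as before the call. In a real stream the sequential read() and seek() would have to take the same lock for this to be safe across threads; a stream that can fetch a byte range without touching the shared offset, as proposed for DFSClient.DFSInputStream, avoids the lock entirely.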

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
