hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-519) HDFS File API should be extended to include positional read
Date Tue, 19 Sep 2006 00:15:23 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-519?page=comments#action_12435614 ] 
            
Doug Cutting commented on HADOOP-519:
-------------------------------------

> Because the normal read method has some other deficiencies as well.

I still don't see the point in having two identical copies of these deficiencies.

Is there a follow-on issue that you have in mind that will redesign the block reading protocol?

> HDFS File API should be extended to include positional read
> -----------------------------------------------------------
>
>                 Key: HADOOP-519
>                 URL: http://issues.apache.org/jira/browse/HADOOP-519
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.6.0
>         Environment: All
>            Reporter: Milind Bhandarkar
>         Assigned To: Milind Bhandarkar
>             Fix For: 0.7.0
>
>         Attachments: pread.patch
>
>
> HDFS input streams should support positional read. Positional read (such as the pread
> syscall on Linux) allows reading from a specified offset without affecting the current file
> offset. Since the underlying file state is not touched, pread can be used efficiently in
> multi-threaded programs.
> Here is how I plan to implement it.
> Provide PositionedReadable interface, with the following methods:
> int read(long position, byte[] buffer, int offset, int length);
> void readFully(long position, byte[] buffer, int offset, int length);
> void readFully(long position, byte[] buffer);
> Abstract class FSInputStream would provide a default implementation of the above methods
> using the getPos(), seek() and read() methods. The default implementation is inefficient in
> multi-threaded programs since it locks the object while seeking, reading, and restoring
> the old position.
> DFSClient.DFSInputStream, which extends FSInputStream, will provide an efficient
> non-synchronized implementation of the above calls.
> In addition, FSDataInputStream, which is a wrapper around FSInputStream, will provide
> wrapper methods for the above read methods as well.
> Patch forthcoming early next week.
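
For readers following the thread, the plan quoted above can be sketched roughly as below. This is only an illustration under stated assumptions, not the contents of pread.patch: the three PositionedReadable signatures come from the issue description, while the class names SeekableInput (standing in for FSInputStream) and ByteArrayInput (a toy in-memory stream), the IOException declarations, and the EOFException on a short readFully are all assumed for the example.

```java
import java.io.EOFException;
import java.io.IOException;

// Method signatures as listed in the issue; the "throws" clauses are assumed.
interface PositionedReadable {
    int read(long position, byte[] buffer, int offset, int length) throws IOException;
    void readFully(long position, byte[] buffer, int offset, int length) throws IOException;
    void readFully(long position, byte[] buffer) throws IOException;
}

// Hypothetical stand-in for FSInputStream: a default positional read
// built only from getPos(), seek() and read().
abstract class SeekableInput implements PositionedReadable {
    public abstract long getPos() throws IOException;
    public abstract void seek(long pos) throws IOException;
    public abstract int read(byte[] b, int off, int len) throws IOException;

    // Holds the object lock while seeking, reading, and restoring the old
    // position, which is why this default serializes concurrent callers.
    @Override
    public synchronized int read(long position, byte[] buffer, int offset, int length)
            throws IOException {
        long oldPos = getPos();
        try {
            seek(position);
            return read(buffer, offset, length);
        } finally {
            seek(oldPos); // restore the stream's own file offset
        }
    }

    @Override
    public void readFully(long position, byte[] buffer, int offset, int length)
            throws IOException {
        int done = 0;
        while (done < length) {
            int n = read(position + done, buffer, offset + done, length - done);
            if (n < 0) {
                throw new EOFException("End of stream before reading " + length + " bytes");
            }
            done += n;
        }
    }

    @Override
    public void readFully(long position, byte[] buffer) throws IOException {
        readFully(position, buffer, 0, buffer.length);
    }
}

// Toy in-memory stream, used only to exercise the default implementation.
final class ByteArrayInput extends SeekableInput {
    private final byte[] data;
    private long pos;

    ByteArrayInput(byte[] data) {
        this.data = data;
    }

    @Override
    public long getPos() {
        return pos;
    }

    @Override
    public void seek(long newPos) {
        pos = newPos;
    }

    @Override
    public int read(byte[] b, int off, int len) {
        if (pos >= data.length) {
            return -1; // end of stream
        }
        int n = (int) Math.min(len, data.length - pos);
        System.arraycopy(data, (int) pos, b, off, n);
        pos += n;
        return n;
    }
}
```

After a positional read on such a stream, getPos() reports the same value as before the call. An efficient version along the lines proposed for DFSClient.DFSInputStream would avoid the shared lock by not touching the stream's position at all; that part depends on the block reading protocol and is not sketched here.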

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
