hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4157) libhdfs: hdfsTell could be implemented smarter than it is
Date Wed, 07 Nov 2012 22:26:13 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492763#comment-13492763 ]

Todd Lipcon commented on HDFS-4157:
-----------------------------------

bq. In libhdfs, hdfsTell currently makes an RPC to the DataNode to determine the position of the stream

How so? The implementation of {{getPos()}} just returns the cached position in DFSInputStream.
                
> libhdfs: hdfsTell could be implemented smarter than it is
> ---------------------------------------------------------
>
>                 Key: HDFS-4157
>                 URL: https://issues.apache.org/jira/browse/HDFS-4157
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: libhdfs
>    Affects Versions: 2.0.3-alpha
>            Reporter: Colin Patrick McCabe
>            Priority: Minor
>
> In libhdfs, {{hdfsTell}} currently makes an RPC to the {{DataNode}} to determine the position of the stream.  However, we could easily cache this information, since libhdfs controls access to the stream.  This would avoid the double overhead of JNI and the RPC itself.
> This would be very helpful for {{fuse_dfs}}, since that program calls {{hdfsTell}} before every {{write}} or {{read}} operation.  This can add up to quite a lot of overhead, since writes may be as small as 4 KB (depending on FUSE configuration, kernel version, etc.).
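
For illustration only, here is a minimal sketch of the caching idea described above: keep the offset on the native side, update it on every read and seek, and answer "tell" from that cached value. The type {{cached_stream}} and the functions {{cs_open}}, {{cs_read}}, {{cs_seek}}, and {{cs_tell}} are hypothetical names invented for this sketch; they are not part of libhdfs, and an in-memory buffer stands in for the real stream. The {{slow_calls}} counter marks where a real wrapper would presumably cross into JNI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a stream handle; NOT the real libhdfs hdfsFile. */
typedef struct {
    const char *data;       /* in-memory bytes standing in for the file */
    size_t      len;        /* total length of the stand-in file */
    int64_t     pos;        /* cached offset, kept current by read/seek */
    unsigned    slow_calls; /* counts simulated expensive (JNI-like) calls */
} cached_stream;

static void cs_open(cached_stream *s, const char *data, size_t len)
{
    assert(s != NULL && data != NULL);
    s->data = data;
    s->len = len;
    s->pos = 0;
    s->slow_calls = 0;
}

/* Read up to n bytes; the cached offset advances by the bytes returned. */
static ptrdiff_t cs_read(cached_stream *s, void *buf, size_t n)
{
    assert(s != NULL && buf != NULL);
    size_t remaining = s->len - (size_t)s->pos;
    size_t take = n < remaining ? n : remaining;
    memcpy(buf, s->data + s->pos, take);
    s->pos += (int64_t)take;
    s->slow_calls++;        /* a real wrapper would do the expensive call here */
    return (ptrdiff_t)take;
}

/* Seek to an absolute offset; the cache is updated only on success. */
static int cs_seek(cached_stream *s, int64_t off)
{
    assert(s != NULL);
    if (off < 0 || (size_t)off > s->len)
        return -1;
    s->pos = off;
    s->slow_calls++;        /* seeking still has to reach the underlying stream */
    return 0;
}

/* Tell: answered entirely from the cached offset, with no expensive call. */
static int64_t cs_tell(const cached_stream *s)
{
    assert(s != NULL);
    return s->pos;
}
```

With this shape, a caller that asks for the position before every small read or write pays nothing extra for the query, because only reads and seeks touch the underlying stream. The sketch relies on the wrapper being the only path to the stream, which is the same assumption the description makes about libhdfs.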

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
