hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4157) libhdfs: hdfsTell could be implemented smarter than it is
Date Wed, 07 Nov 2012 22:44:13 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492779#comment-13492779 ]

Colin Patrick McCabe commented on HDFS-4157:
--------------------------------------------

Unfortunately, making a JNI call carries pretty heavy overhead, especially when the Java
method is looked up dynamically, as we do.  This is actually one thing that libwebhdfs
does better than libhdfs.
                
> libhdfs: hdfsTell could be implemented smarter than it is
> ---------------------------------------------------------
>
>                 Key: HDFS-4157
>                 URL: https://issues.apache.org/jira/browse/HDFS-4157
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: libhdfs
>    Affects Versions: 2.0.3-alpha
>            Reporter: Colin Patrick McCabe
>            Priority: Minor
>
> In libhdfs, {{hdfsTell}} currently makes an RPC to the {{DataNode}} to determine the
> position of the stream.  However, we could easily cache this information, since libhdfs
> controls access to the stream.  This would avoid the double overhead of JNI and the RPC itself.
> This would be very helpful for {{fuse_dfs}}, since that program calls {{hdfsTell}} before
> every {{write}} or {{read}} operation.  That can add up to quite a lot of overhead, since writes
> may be as small as 4 KB (depending on FUSE configuration, kernel version, etc.).
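
The caching idea in the description can be illustrated with a minimal C sketch. It is
not the libhdfs implementation: the struct cached_stream and the stream_* functions are
hypothetical names, and the "remote" file is simulated with an in-memory buffer so the
example is self-contained.  The point is only that a wrapper which performs every read
and seek itself can track the offset locally and answer a tell without calling out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a libhdfs file handle; not the real hdfsFile. */
struct cached_stream {
    const uint8_t *data;    /* simulated remote file contents */
    size_t len;             /* size of the simulated file */
    int64_t pos;            /* locally cached stream position */
    unsigned remote_calls;  /* counts simulated JNI/remote round trips */
};

static void stream_open(struct cached_stream *s, const uint8_t *data, size_t len)
{
    assert(s != NULL);
    s->data = data;
    s->len = len;
    s->pos = 0;
    s->remote_calls = 0;
}

/* Read up to n bytes; the wrapper advances the cached position itself. */
static int64_t stream_read(struct cached_stream *s, void *buf, size_t n)
{
    assert(s != NULL && buf != NULL);
    s->remote_calls++;  /* the read itself still crosses into the backend */
    size_t avail = s->len - (size_t)s->pos;
    if (n > avail)
        n = avail;
    memcpy(buf, s->data + s->pos, n);
    s->pos += (int64_t)n;
    return (int64_t)n;
}

/* Seek to an absolute offset and record the new position in the cache. */
static int stream_seek(struct cached_stream *s, int64_t off)
{
    assert(s != NULL);
    if (off < 0 || (size_t)off > s->len)
        return -1;
    s->remote_calls++;
    s->pos = off;
    return 0;
}

/* Tell is answered from the cache, so it costs no backend call at all. */
static int64_t stream_tell(const struct cached_stream *s)
{
    assert(s != NULL);
    return s->pos;
}
```

With this layout, a caller that asks for the position before every read or write (as
the description says fuse_dfs does) pays only for a struct field access on each tell.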

