hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5191) revisit zero-copy API in FSDataInputStream to make it more intuitive
Date Wed, 18 Sep 2013 23:50:53 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771412#comment-13771412 ]

Colin Patrick McCabe commented on HDFS-5191:

My feeling is that we should support passing a null {{ByteBufferFactory}} to mean "don't create
fallback {{ByteBuffers}}."  Using a {{RejectingByteBufferFactory}} is another reasonable choice,
but it would require more typing for some people.
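A minimal sketch of the two options, assuming a hypothetical single-method {{ByteBufferFactory}} interface (the interface shape, {{FallbackSketch}}, and {{fallbackBuffer}} are made up for illustration and are not the signatures from the patch):

```java
import java.nio.ByteBuffer;

// Hypothetical interface, for illustration only.
interface ByteBufferFactory {
    ByteBuffer create(int length);
}

// Option 2: a factory that refuses to create fallback buffers.
final class RejectingByteBufferFactory implements ByteBufferFactory {
    @Override
    public ByteBuffer create(int length) {
        throw new UnsupportedOperationException("fallback buffers are disabled");
    }
}

final class FallbackSketch {
    // Option 1: a null factory means "don't create fallback ByteBuffers".
    // Returns null in that case so the caller can report that zero-copy
    // was not possible; otherwise delegates to the factory.
    static ByteBuffer fallbackBuffer(ByteBufferFactory factory, int length) {
        if (factory == null) {
            return null;
        }
        return factory.create(length);
    }
}
```

With the null convention the caller writes nothing extra; with the rejecting factory the caller has to construct and pass an object, but gets an exception instead of a null to check.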

I'll add an overload for the empty EnumSet case.

ByteBufferPool is a better name, I agree.

I suppose, given that {{FSDataInputStream}} is in {{org.apache.hadoop.io}}, ByteBufferPool/Factory
should be as well.

{{ByteBufferPool}} implementations don't need thread-safety unless multiple read calls are
going to be made in parallel using the same pool.  I'll add that information to the JavaDoc.
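To illustrate the thread-safety point, here is a sketch of a pool that is fine for a single reader but not for parallel reads. {{SingleThreadedBufferPool}} and its method signatures are assumptions for this example, not the interface from the patch:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical pool, for illustration only. ArrayDeque is not synchronized,
// so this is only safe when one thread at a time uses the pool. Sharing it
// between parallel read calls would need external locking or a concurrent
// collection.
final class SingleThreadedBufferPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();

    ByteBuffer getBuffer(int length) {
        ByteBuffer buf = free.pollFirst();
        if (buf == null || buf.capacity() < length) {
            // Nothing pooled, or the pooled buffer is too small: allocate.
            buf = ByteBuffer.allocateDirect(length);
        }
        buf.clear();
        buf.limit(length);
        return buf;
    }

    void putBuffer(ByteBuffer buf) {
        free.addFirst(buf);
    }
}
```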

I agree that the "fallback fallback" path still needs to be done.  The problem is that there
isn't a very efficient way to do it, since we'd have to read into a byte array and then copy
to the direct byte buffer.  We could do better if we could ask the ByteBufferPool for a
non-direct (i.e., array-backed) buffer.  Will this "fallback fallback" case be common enough
to motivate this kind of API?
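For reference, the inefficient path amounts to something like the following sketch. {{CopyingFallback}} and {{readViaArray}} are hypothetical names, and a plain {{java.io.InputStream}} stands in for a stream that only supports byte-array reads:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

final class CopyingFallback {
    // Reads up to dst.remaining() bytes from 'in' through a temporary heap
    // array and then copies them into dst (which may be a direct buffer).
    // Returns the number of bytes read, or -1 at end of stream.
    static int readViaArray(InputStream in, ByteBuffer dst) throws IOException {
        byte[] tmp = new byte[dst.remaining()];
        int n = in.read(tmp, 0, tmp.length);
        if (n > 0) {
            dst.put(tmp, 0, n); // the extra copy we would like to avoid
        }
        return n;
    }
}
```

If the pool could hand back an array-backed buffer, the read could target {{ByteBuffer#array}} directly and skip the temporary array.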

The disadvantage of handing out array-backed buffers is that our read function would then
sometimes return direct byte buffers and sometimes not.  That could lead to code working on
local filesystems and then failing on HDFS (if it tried to call ByteBuffer#array).
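A sketch of what portable caller code would have to look like, using only standard {{java.nio.ByteBuffer}} calls ({{BufferBytes}} and {{toBytes}} are hypothetical names for this example):

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class BufferBytes {
    // Copies the remaining bytes of buf into a new array without assuming
    // the buffer is array-backed. The buffer's position is left unchanged.
    static byte[] toBytes(ByteBuffer buf) {
        if (buf.hasArray()) {
            // Array-backed buffer: array() is safe to call here.
            int start = buf.arrayOffset() + buf.position();
            return Arrays.copyOfRange(buf.array(), start, start + buf.remaining());
        }
        // Direct (or read-only) buffer: array() would throw, so copy out
        // through a duplicate to avoid moving the caller's position.
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out);
        return out;
    }
}
```

Code that skips the {{hasArray}} check is the kind that would pass on one filesystem and fail on another.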

> revisit zero-copy API in FSDataInputStream to make it more intuitive
> --------------------------------------------------------------------
>                 Key: HDFS-5191
>                 URL: https://issues.apache.org/jira/browse/HDFS-5191
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client, libhdfs
>    Affects Versions: HDFS-4949
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-5191-caching.001.patch
>
> As per the discussion on HDFS-4953, we should revisit the zero-copy API to make it more intuitive for new users.

