hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5191) revisit zero-copy API in FSDataInputStream to make it more intuitive
Date Wed, 18 Sep 2013 23:08:51 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771369#comment-13771369
] 

Chris Nauroth commented on HDFS-5191:
-------------------------------------

I reviewed the current API and caught up on all prior discussion here and in HDFS-4953.  [~cmccabe],
thank you for incorporating the feedback.  I'm going to focus on the interface here.  I'll
review the implementation details soon too.  I like to review APIs by writing a little code
against them.  This is what I came up with (imports trimmed for brevity):

{code}
class Main extends Configured implements Tool {
  private static final int BUFFER_MAX_LENGTH = 4 * 1024 * 1024; // 4 MB
  private static final Log LOG = LogFactory.getLog(Main.class);

  @Override
  public int run(String[] args) {
    FileSystem fs = null;
    FSDataInputStream is = null;

    try {
      fs = FileSystem.get(this.getConf());
      is = fs.open(new Path(args[0]));
      ByteBufferFactory bbf = new SimpleByteBufferFactory();
      EnumSet<ReadOption> readOpts = EnumSet.noneOf(ReadOption.class);
      for (;;) {
        ByteBuffer bb = is.read(bbf, BUFFER_MAX_LENGTH, readOpts);
        if (bb == null) break; // EOF
        // Use data from the buffer here.
        bbf.putBuffer(bb);
      }
    } catch (IOException e) {
      LOG.error("Unexpected IOException", e);
    } finally {
      IOUtils.cleanup(LOG, is, fs);
    }

    return 0;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new Main(), args));
  }
}
{code}

Can someone check that this code (written by someone looking at the API for the first time)
is correct?  If so, then that's a good sign that we're on the right track.  :-)  I think this
code sample shows that most of the earlier feedback has been addressed.  Specifically:

# It's a single interface for client read code for both the cached and uncached case.  I didn't
need to check flags or catch a special exception or downcast to an HDFS class to handle copying
vs. zero-copy.
# There is generally less code for the client to write, because there are fewer things that
the client needs to control.  (i.e. no {{setAllowShortReads}})
# I did not need to preallocate a fallback buffer that wouldn't be used in the mmap'd case.
# I did not need to catch exceptions for main flow control.
# The {{ByteBufferFactory}} interface would allow the client to control ownership of direct
buffers for fallback (and the {{SimpleByteBufferFactory}} ships a default implementation that
does this).
# There is no sharing of mutable state between implicitly connected objects (streams and cursors).
# It looks close to the kind of code sample [~owen.omalley] wanted to achieve in the comments
of HDFS-4953.
# There had been discussion of returning array of {{ByteBuffer}} vs. returning a single {{ByteBuffer}}
with possibility of short read.  My preference is for the latter.  The former would have required
me to write another nested loop.  I need to code for short read anyway in case the final read
before EOF doesn't match my desired read length.

I think the last remaining open question is around the need for a client to check if zero-copy
is available and potentially run a different code path.  One way we could achieve that now
is by adding a {{RejectingByteBufferFactory}} that always throws, and clients that want that
behavior can use it instead of {{SimpleByteBufferFactory}}.  Does anyone have a concrete use
case for this?  Without a concrete use case, it's hard to say if the interface is sufficient.

Here are some suggestions, mostly minor adjustments on top of the current patch:

# Shall we add overloads of {{FSDataInputStream#read}} that don't require the caller to pass
{{ByteBufferFactory}} (assumes caller wants a new {{SimpleByteBufferFactory}}) and don't require
the caller to pass read opts (assumes none)?
# Can we rename {{ByteBufferFactory}} to {{ByteBufferPool}}?  [~vicaya] had made a similar
suggestion.  "Factory" implies per-call instance creation and "pool" communicates that it
needs to control the lifecycle of instances.
# Can we move those classes to the .io sub-package?  They aren't coupled to anything in .fs.
# We need to fully document the expected contract of {{ByteBufferPool}} for implementers.
 Some factors to consider:
## Thread-safety - Is it required for the implementation to guarantee thread-safety internally?
 (I assume thread-safety is only required if the caller intends to use the same one from multiple
threads.  Whatever the answer, let's document it.)
## Null - Is null a legal return value?  Is it possible for callers to pass null arguments?
 (Ideally, null is forbidden, but whatever the answer, let's document it.)
## Bounds-checking - What should implementers do if given a negative length?  (Runtime exception?)
# When the {{FSDataInputStream}} is not {{HasEnhancedByteBufferAccess}}, we try to fallback
to a copying read.  However, this failed when I tested my sample code against a local file
system, because it ultimately still called {{FSDataInputStream#read(ByteBuffer)}}, which throws
{{UnsupportedOperationException}} in the base class.  We'll need to fix this to achieve the
goal of clients writing a single code path that works for any {{FileSystem}}.
                
> revisit zero-copy API in FSDataInputStream to make it more intuitive
> --------------------------------------------------------------------
>
>                 Key: HDFS-5191
>                 URL: https://issues.apache.org/jira/browse/HDFS-5191
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client, libhdfs
>    Affects Versions: HDFS-4949
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-5191-caching.001.patch
>
>
> As per the discussion on HDFS-4953, we should revisit the zero-copy API to make it more
intuitive for new users.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message