hadoop-hdfs-issues mailing list archives

From "James Clampffer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8765) Implement local block reader in libhdfspp
Date Tue, 18 Aug 2015 22:55:45 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702143#comment-14702143 ]

James Clampffer commented on HDFS-8765:

Hi Li,
The short circuit interface will be zero copy, at least in terms of userspace copies.  You'll
provide a buffer and data will be read directly into it through a pread call or similar.
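
To illustrate the calling pattern described above, here is a minimal sketch of reading
into a caller-owned buffer with POSIX pread(2). It is not the libhdfspp interface, which
this comment does not specify; the names ReadFully and ReadRange are hypothetical, and a
real short circuit reader would obtain the block's file descriptor from the DataNode
rather than calling open() on a path.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>

#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper, not part of libhdfspp: fill a caller-owned buffer
// from a local file descriptor using pread(2), starting at `offset`.
// Returns the number of bytes read (less than `len` only at end of file),
// or -1 with errno set on failure. No intermediate userspace buffer is used.
ssize_t ReadFully(int fd, void* buf, size_t len, off_t offset) {
  assert(buf != nullptr || len == 0);
  char* out = static_cast<char*>(buf);
  size_t total = 0;
  while (total < len) {
    ssize_t n = pread(fd, out + total, len - total,
                      offset + static_cast<off_t>(total));
    if (n < 0) {
      if (errno == EINTR) continue;  // interrupted by a signal, retry
      return -1;
    }
    if (n == 0) break;  // end of file
    total += static_cast<size_t>(n);
  }
  return static_cast<ssize_t>(total);
}

// Usage sketch (also hypothetical): read up to `len` bytes at `offset` from a
// local file into a std::string owned by the caller. Returns false on error.
bool ReadRange(const char* path, off_t offset, size_t len, std::string* out) {
  int fd = open(path, O_RDONLY);
  if (fd < 0) return false;
  out->resize(len);
  ssize_t n = ReadFully(fd, &(*out)[0], len, offset);
  close(fd);
  if (n < 0) return false;
  out->resize(static_cast<size_t>(n));  // trim if the file ended early
  return true;
}
```

Because pread takes an explicit offset, it does not move the descriptor's file position,
so several readers can share one descriptor without coordinating seeks.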

I'm not planning on supporting the HDFS centralized cache in my initial implementation for
two reasons.  The first is simplicity; I'd like to get this up and running as soon as possible.
The second is that short circuit reads will automatically benefit from the local machine's
page cache.  On modern operating systems the page cache works very well, and because this is
implemented in C++ we don't have to worry about pinning memory and some of the issues that
come with the JVM heap.

After I get the first iteration finished up, I'd be really interested in seeing some benchmarks
for your use case to see if explicit cache management would help things out.  It's certainly
something I've thought about adding later on.

> Implement local block reader in libhdfspp
> -----------------------------------------
>                 Key: HDFS-8765
>                 URL: https://issues.apache.org/jira/browse/HDFS-8765
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: James Clampffer
> Implement a block reader that uses the HDFS short circuit protocol to read colocated
> data as efficiently as possible.  Implementation will be based on BlockReaderLocal.java +
> the associated JNI bindings.

This message was sent by Atlassian JIRA
