hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap
Date Thu, 15 Aug 2013 23:00:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741623#comment-13741623 ]

Andrew Wang commented on HDFS-4953:
-----------------------------------

Great, I think we're very close.

bq.     * comapre the genStamp here.  That way, we will not return a shorter 
Typo: "comapre" should be "compare".
{code}
    return new ClientMmapManager(conf.getInt(DFS_CLIENT_MMAP_CACHE_SIZE,
                                  DFS_CLIENT_MMAP_CACHE_SIZE_DEFAULT),
                      conf.getLong(DFS_CLIENT_MMAP_CACHE_TIMEOUT_MS,
                                  DFS_CLIENT_MMAP_CACHE_TIMEOUT_MS_DEFAULT));
{code}
Weird spacing: the continuation lines are indented inconsistently.

* I think the CacheCleaner delay and period could still be improved. An initial delay equal to
the timeout still makes sense to me, and maybe we could introduce a conf parameter that controls
the period. I doubt that this will be twiddled all that much, so let's set good defaults. A more
aggressive default period of {{timeout/4}} might make more sense.
* javac warnings still need to be addressed (looks like via the ignore file)
* I'd like to see yet more class javadoc for ZeroCopyCursor (man page level). I bet many HDFS
users will be eager to use this more efficient interface but won't necessarily understand
e.g. ownership of the fallback buffer, and how to handle short vs. not short reads. It's also
important since there isn't a design doc. We can defer this to a follow-on JIRA though if
you prefer.
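
To make the CacheCleaner suggestion in the first bullet concrete, here is a minimal sketch of the scheduling I have in mind. It is not code from the patch: the class name {{CacheCleanerSchedule}}, its methods, and the divisor constant are all hypothetical, and it assumes the cleaner is a plain {{Runnable}} driven by a {{ScheduledExecutorService}}.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not part of the HDFS-4953 patch. */
final class CacheCleanerSchedule {
  // Assumed default: run the cleaner four times per timeout interval.
  static final long PERIOD_DIVISOR = 4;

  private CacheCleanerSchedule() {}

  /** Nothing can expire before one full timeout has elapsed, so wait that long. */
  static long initialDelayMs(long timeoutMs) {
    return timeoutMs;
  }

  /** Period of timeout/4, clamped to 1 ms because the executor rejects a period <= 0. */
  static long periodMs(long timeoutMs) {
    return Math.max(1L, timeoutMs / PERIOD_DIVISOR);
  }

  /** Schedules the cleaner with the delay and period computed above. */
  static ScheduledFuture<?> schedule(ScheduledExecutorService executor,
                                     Runnable cleaner,
                                     long timeoutMs) {
    return executor.scheduleAtFixedRate(cleaner,
        initialDelayMs(timeoutMs),
        periodMs(timeoutMs),
        TimeUnit.MILLISECONDS);
  }
}
```

With a period of {{timeout/4}}, an entry that expires just after a cleaner pass waits at most one more period, so it should be evicted within roughly 1.25x the timeout (ignoring scheduling jitter). A conf parameter could replace the hard-coded divisor if we decide it is worth exposing.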

                
> enable HDFS local reads via mmap
> --------------------------------
>
>                 Key: HDFS-4953
>                 URL: https://issues.apache.org/jira/browse/HDFS-4953
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 2.3.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, HDFS-4953.003.patch,
HDFS-4953.004.patch, HDFS-4953.005.patch, HDFS-4953.006.patch, HDFS-4953.007.patch
>
>
> Currently, the short-circuit local read pathway allows HDFS clients to access files directly
without going through the DataNode.  However, all of these reads involve a copy at the operating
system level, since they rely on the read() / pread() / etc family of kernel interfaces.
> We would like to enable HDFS to read local files via mmap.  This would enable truly zero-copy
reads.
> In the initial implementation, zero-copy reads will only be performed when checksums
are disabled.  Later, we can use the DataNode's cache awareness to only perform zero-copy
reads when we know that the checksum has already been verified.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
