hadoop-hdfs-issues mailing list archives

From "Jay Booth (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-516) Low Latency distributed reads
Date Fri, 31 Jul 2009 21:14:14 GMT

    [ https://issues.apache.org/jira/browse/HDFS-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737713#action_12737713 ]

Jay Booth commented on HDFS-516:

Here's an architectural overview and a general request for comments on the matter. I'll
be away and busy for the next few days but should be able to get back to this in the middle of
next week.

The basic workflow: I created a RadFileSystem (RandomAccessDistributed FS) which wraps DistributedFileSystem
and delegates to it for everything except getFSDataInputStream.  That method returns a custom
FSDataInputStream which wraps a CachingByteService, which itself wraps a RadFSByteService.
 The caching byte services share a cache managed by the RadFSClient class (this could
perhaps be factored away and moved into RadFileSystem instead).  They try to hit the cache, and
on a miss they call the underlying RadFSClientByteService to fetch the requested page plus
a few pages of lookahead.  The RadFSClientByteService calls the namenode to get the appropriate
block locations (todo: cache these effectively) and then calls RadNode, which is embedded
in DataNode via ServicePlugin and maintains an IPCServer and a set of FileChannels to the
local blocks.  On repeated requests for the same data, the RadFSClient tends to favor going
to the same host, figuring that hitting the DataNode's OS cache for the given bytes
reduces latency more than avoiding a rack hop would (untested assumption).
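To make the cache-then-delegate layering concrete, here is a minimal sketch of the read path described above. All names here (ByteService, readPage, the page size, the lookahead count) are illustrative stand-ins, not the actual API in radfs.patch; the real CachingByteService sits under an FSDataInputStream and the real remote service talks to the NameNode and RadNode.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical page-oriented read interface; names are illustrative only.
interface ByteService {
    byte[] readPage(String path, long pageIndex);
}

// Stand-in for RadFSClientByteService: in the real patch this would ask
// the NameNode for block locations and then hit RadNode's IPC server.
class RemoteByteService implements ByteService {
    int calls = 0; // counts simulated round trips, for demonstration

    public byte[] readPage(String path, long pageIndex) {
        calls++;
        return new byte[4096]; // pretend we fetched one page over the wire
    }
}

// Stand-in for CachingByteService: check the shared LRU cache first; on a
// miss, fetch the requested page plus a few pages of lookahead.
class CachingByteService implements ByteService {
    private static final int LOOKAHEAD = 3;
    private final ByteService delegate;
    private final Map<String, byte[]> cache;

    CachingByteService(ByteService delegate, final int capacityInPages) {
        this.delegate = delegate;
        // Access-ordered LinkedHashMap gives simple LRU eviction.
        this.cache = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> e) {
                return size() > capacityInPages;
            }
        };
    }

    public byte[] readPage(String path, long pageIndex) {
        String key = path + "@" + pageIndex;
        byte[] page = cache.get(key);
        if (page == null) {
            // Miss: fetch the requested page plus LOOKAHEAD pages ahead.
            for (long i = pageIndex; i <= pageIndex + LOOKAHEAD; i++) {
                String k = path + "@" + i;
                if (!cache.containsKey(k)) {
                    cache.put(k, delegate.readPage(path, i));
                }
            }
            page = cache.get(key);
        }
        return page;
    }
}
```

A sequential or back-and-forth reader then pays one remote round trip per lookahead window rather than one per page, which is the whole point of simulating OS paging on the client side.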

The intended use case is pretty different from MapReduce, so I think this should be a contrib
module that has to be explicitly invoked by clients.  It really underperforms DFS in terms
of streaming but should (I haven't tested extensively outside of localhost) significantly outperform
it in terms of random reads.  For files with 'hot paths', such as lucene indices or
binary search over a normal file, the cache hit percentage is likely to be pretty high, so it
should perform pretty well.  Currently, it makes a fresh request to the NameNode for every
read, which is inefficient but more likely to be correct.  Going forward, I'd like to tighten
this up, make sure it plays nicely with append, and get it into a future Hadoop release.
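The 'hot path' claim for binary search can be checked with a back-of-envelope simulation (this is my illustration, not part of the patch): over a sorted file, the first probe of every search lands on the same middle page, the second on one of two pages, the third on one of four, so a tiny client-side cache absorbs the top of every search.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class HotPathSim {
    // Distinct page indexes touched by the first `probesPerSearch` probes
    // of `searches` binary searches over a file of `numPages` pages, with
    // uniformly random targets.
    static Set<Long> probePages(int searches, long numPages,
                                int probesPerSearch, long seed) {
        Random rnd = new Random(seed);
        Set<Long> touched = new HashSet<>();
        for (int s = 0; s < searches; s++) {
            long lo = 0, hi = numPages - 1;
            long target = (long) (rnd.nextDouble() * numPages);
            for (int p = 0; p < probesPerSearch && lo <= hi; p++) {
                long mid = (lo + hi) / 2;
                touched.add(mid);
                if (mid < target) lo = mid + 1; else hi = mid - 1;
            }
        }
        return touched;
    }

    public static void main(String[] args) {
        // ~4 GB file at 4 KB pages: the first three probes of every search
        // can only ever land on 1 + 2 + 4 = 7 distinct pages.
        System.out.println(probePages(10_000, 1L << 20, 3, 42).size());
    }
}
```

So for a million-page file, the first three levels of every binary search touch at most seven distinct pages, which is why the 1/2, 1/4, 1/8 marks stay hot in the cache.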

> Low Latency distributed reads
> -----------------------------
>                 Key: HDFS-516
>                 URL: https://issues.apache.org/jira/browse/HDFS-516
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Jay Booth
>            Priority: Minor
>         Attachments: radfs.patch
>   Original Estimate: 168h
>  Remaining Estimate: 168h
> I created a method for low latency random reads using NIO on the server side and simulated
> OS paging with LRU caching and lookahead on the client side.  Some applications could include
> lucene searching (term->doc and doc->offset mappings are likely to be in local cache, thus
> much faster than nutch's current FsDirectory impl) and binary search through record files
> (bytes at the 1/2, 1/4, 1/8 marks are likely to be cached).

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
