hadoop-hdfs-issues mailing list archives

From "Jay Booth (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HDFS-516) Low Latency distributed reads
Date Fri, 31 Jul 2009 20:58:14 GMT

     [ https://issues.apache.org/jira/browse/HDFS-516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Booth updated HDFS-516:
---------------------------

    Status: Patch Available  (was: Open)

Here's the initial patch.  It is wrapped in fairly thorough byte-consistency and integration
tests, but no actual applications have been built on it yet.  I started on a SequenceFileSearcher
that implements binary search, but handling the sync points proved tricky, so it isn't working
yet (it is commented out, along with its failing test).  I'm hoping to get that working in
the next couple of weeks, when I have some time, as an example.  I also plan to run a performance
comparison between Nutch's FsDirectory on DistributedFileSystem and on RadFileSystem to see
whether this shows the gains I expect.  My only change to core HDFS was to add getFile() to
FSDatasetInterface so that the RadNode (plugged into the datanode via ServicePlugin) can open
FileChannels.
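
As a rough illustration of why the RadNode wants FileChannels, here is a minimal sketch of a
positional read with NIO.  It is not code from radfs.patch: the class and method names are
made up for this example, and because the signature of getFile() is not shown in this message,
the sketch simply takes a java.nio.file.Path to a local file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of the patch: serves a random read from a local file.
public class PositionalReadSketch {

    /** Reads up to {@code length} bytes starting at {@code position}; returns fewer at EOF. */
    public static byte[] readAt(Path blockFile, long position, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(blockFile, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(length);
            long pos = position;
            // read(ByteBuffer, long) reads at an absolute offset and does not move
            // the channel's own position, so no seek is needed.
            while (buf.hasRemaining()) {
                int n = channel.read(buf, pos);
                if (n < 0) {
                    break; // end of file reached before the buffer filled
                }
                pos += n;
            }
            buf.flip();
            byte[] out = new byte[buf.remaining()];
            buf.get(out);
            return out;
        }
    }
}
```

A real server would presumably keep channels open across requests rather than reopening the
file on every read; the sketch opens one per call only to stay short.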

> Low Latency distributed reads
> -----------------------------
>
>                 Key: HDFS-516
>                 URL: https://issues.apache.org/jira/browse/HDFS-516
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Jay Booth
>            Priority: Minor
>         Attachments: radfs.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I created a method for low-latency random reads using NIO on the server side and simulated
> OS paging with LRU caching and lookahead on the client side.  Some applications could include
> Lucene searching (term->doc and doc->offset mappings are likely to be in the local cache,
> and thus much faster than with Nutch's current FsDirectory impl) and binary search through
> record files (bytes at the 1/2, 1/4, 1/8 marks are likely to be cached).
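
The client-side idea described above (LRU caching of pages plus lookahead) can be sketched
roughly as follows.  This is only an illustration under assumptions, not the patch's
implementation: the PageFetcher interface, the fixed page size, and all class and field names
are hypothetical, and the fetcher stands in for whatever remote read the client performs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a client-side page cache with LRU eviction and lookahead.
public class LruPageCacheSketch {

    /** Hypothetical stand-in for a remote read of one fixed-size page. */
    public interface PageFetcher {
        byte[] fetch(long pageIndex);
    }

    private final int pageSize;
    private final int lookahead;
    private final PageFetcher fetcher;
    private final LinkedHashMap<Long, byte[]> pages;

    /** Number of pages fetched so far, exposed so cache hits can be observed. */
    public int fetchCount = 0;

    public LruPageCacheSketch(int pageSize, int maxPages, int lookahead, PageFetcher fetcher) {
        this.pageSize = pageSize;
        this.lookahead = lookahead;
        this.fetcher = fetcher;
        // accessOrder=true makes iteration order least-recently-used first,
        // so removeEldestEntry evicts the LRU page once the cache is full.
        this.pages = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > maxPages;
            }
        };
    }

    /** Returns the byte at {@code offset}, fetching its page (and lookahead pages) on a miss. */
    public byte readByte(long offset) {
        long pageIndex = offset / pageSize;
        byte[] page = pages.get(pageIndex);
        if (page == null) {
            page = load(pageIndex);
            // Lookahead: also pull the next few pages, anticipating sequential access.
            for (int i = 1; i <= lookahead; i++) {
                if (!pages.containsKey(pageIndex + i)) {
                    load(pageIndex + i);
                }
            }
        }
        return page[(int) (offset % pageSize)];
    }

    private byte[] load(long pageIndex) {
        byte[] page = fetcher.fetch(pageIndex);
        fetchCount++;
        pages.put(pageIndex, page);
        return page;
    }
}
```

With a cache like this, repeated binary searches over the same file keep touching the same
pages near the 1/2, 1/4 and 1/8 marks, so those pages tend to stay resident and only the last
few probes of each search need a remote fetch.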

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

