hadoop-hdfs-issues mailing list archives

From "Jay Booth (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-516) Low Latency distributed reads
Date Wed, 16 Sep 2009 00:58:57 GMT

    [ https://issues.apache.org/jira/browse/HDFS-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755823#action_12755823 ]

Jay Booth commented on HDFS-516:

Yeah, I was puzzled by the performance too.  I dug through the DFS code and I'm saving a bit
on new socket and object creation, maybe a couple instructions here and there, but that shouldn't
add up to 100 seconds for a gigabyte (approx 20 blocks).  I'm calling read() a bajillion times
in a row, so it's conceivable (although unlikely) that I'm pegging the CPU and that's the limiting
factor.

I'm busy for a couple days but will get back to you with some figures from netstat, top and
whatever else I can think of, along with another streaming case that uses read(b, off, len)
to see if that changes things.  I'll do a little more digging into DFS as well to see if I
can isolate the cause.  I definitely did run them several times on the same machine, and once
more on a different cluster with similar results, so it wasn't simply bad luck with rack
placement on EC2 (well, maybe, but that's unlikely).
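For reference, the difference between the two streaming cases mentioned above is the call pattern, not the data moved: single-byte read() issues one method call per byte, while read(b, off, len) moves the same bytes in a handful of bulk calls. A minimal sketch (not the patch's actual code; method names here are made up for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadStyles {
    // Drain a stream one byte at a time: one method call per byte,
    // which can dominate CPU time on a fast local source.
    static long drainPerByte(InputStream in) throws IOException {
        long n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    // Drain a stream with bulk read(b, off, len): the same bytes,
    // but roughly (length / 8192) calls instead of one per byte.
    static long drainBulk(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        long n = 0;
        int r;
        while ((r = in.read(buf, 0, buf.length)) != -1) {
            n += r;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1 << 20]; // 1 MB of zeroes, stand-in for block data
        long a = drainPerByte(new ByteArrayInputStream(data));
        long b = drainBulk(new ByteArrayInputStream(data));
        System.out.println("per-byte read " + a + " bytes, bulk read " + b + " bytes");
    }
}
```

If the per-byte loop is the bottleneck, switching the benchmark to the bulk variant should show it directly in the numbers.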

Will report back when I have more numbers.  After I get those, my roadmap for this is to add
checksum support and better DatanodeInfo caching.  User groups would come after that.

> Low Latency distributed reads
> -----------------------------
>                 Key: HDFS-516
>                 URL: https://issues.apache.org/jira/browse/HDFS-516
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Jay Booth
>            Priority: Minor
>         Attachments: hdfs-516-20090912.patch
>   Original Estimate: 168h
>  Remaining Estimate: 168h
> I created a method for low latency random reads using NIO on the server side and simulated
> OS paging with LRU caching and lookahead on the client side.  Some applications could include
> Lucene searching (term->doc and doc->offset mappings are likely to be in local cache, thus
> much faster than Nutch's current FsDirectory impl) and binary search through record files
> (bytes at the 1/2, 1/4, 1/8 marks are likely to be cached).
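The client-side "simulated OS paging with LRU caching" the description mentions can be sketched with the standard LinkedHashMap access-order trick. This is a hypothetical illustration of the caching scheme, not code from the attached patch; the class and field names are made up:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch of an LRU page cache keyed by file offset. Pages touched
// by get() or put() become most-recently-used; when capacity is exceeded,
// the least-recently-used page is evicted automatically.
public class LruPageCache extends LinkedHashMap<Long, byte[]> {
    private final int maxPages;

    public LruPageCache(int maxPages) {
        // accessOrder=true makes iteration (and eviction) follow access order
        super(16, 0.75f, true);
        this.maxPages = maxPages;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
        // Called after every put(); returning true evicts the LRU entry.
        return size() > maxPages;
    }
}
```

Under this scheme, repeatedly probed pages (term->doc mappings, the 1/2, 1/4, 1/8 marks of a binary search) stay resident while cold pages cycle out, which is what makes the repeated-probe workloads above cheap after warm-up.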

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
