hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-516) Low Latency distributed reads
Date Wed, 16 Sep 2009 22:35:57 GMT

    [ https://issues.apache.org/jira/browse/HDFS-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756264#action_12756264 ]

Todd Lipcon commented on HDFS-516:

bq. Unless you want to add checksums for better comparison, I don't think it is ever essential.

I disagree here: checksums account for a reasonable share of the overhead in the
current HDFS implementation. If this is mostly seen as a testing ground for performance improvements,
we can't be comparing apples to oranges. If we don't want to implement checksums in RadFs,
then the other option for a fair comparison is to remove checksums from HDFS.

If this is meant as a testing ground for new features, then that makes sense, but I understood
this mostly as a "turbo HDFS client experimentation".
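
To make the checksum overhead discussed above concrete, here is a minimal sketch of
the extra work a checksummed read path does per chunk. It assumes CRC32 over fixed-size
chunks; the class name, method names, and the 512-byte chunk size in the usage note are
illustrative assumptions, not code taken from HDFS or RadFs.

```java
import java.util.zip.CRC32;

/** Hypothetical helper: per-chunk CRC32 checksums over a byte buffer. */
class ChunkChecksums {

    /** Computes one CRC32 value per chunk of bytesPerChecksum bytes. */
    static long[] compute(byte[] data, int bytesPerChecksum) {
        int chunks = (data.length + bytesPerChecksum - 1) / bytesPerChecksum;
        long[] sums = new long[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int off = i * bytesPerChecksum;
            // The last chunk may be shorter than bytesPerChecksum.
            int len = Math.min(bytesPerChecksum, data.length - off);
            crc.reset();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    /** Returns the index of the first chunk whose checksum mismatches, or -1 if all match. */
    static int firstCorruptChunk(byte[] data, long[] expected, int bytesPerChecksum) {
        long[] actual = compute(data, bytesPerChecksum);
        if (actual.length != expected.length) {
            throw new IllegalArgumentException("checksum count does not match data length");
        }
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] != expected[i]) {
                return i;
            }
        }
        return -1;
    }
}
```

A reader that verifies data would call something like
ChunkChecksums.firstCorruptChunk(buffer, storedSums, 512) on every read, so each byte
served is touched one extra time. A client that skips this step avoids that cost, which
is why the two are not directly comparable.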

> Low Latency distributed reads
> -----------------------------
>                 Key: HDFS-516
>                 URL: https://issues.apache.org/jira/browse/HDFS-516
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Jay Booth
>            Priority: Minor
>         Attachments: hdfs-516-20090912.patch
>   Original Estimate: 168h
>  Remaining Estimate: 168h
> I created a method for low latency random reads using NIO on the server side and simulated
OS paging with LRU caching and lookahead on the client side.  Some applications could include
Lucene searching (term->doc and doc->offset mappings are likely to be in local cache,
thus much faster than Nutch's current FsDirectory impl) and binary search through record files
(bytes at 1/2, 1/4, 1/8 marks are likely to be cached).
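
The client-side idea in the description above (fixed-size pages, LRU eviction, and
lookahead on a miss) can be sketched roughly as follows. This is not the code from the
attached patch: LruPageCache, PageFetcher, and the fetched list are hypothetical names
introduced only for illustration, and the remote read is reduced to a callback.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical client-side page cache: LRU eviction plus sequential lookahead. */
class LruPageCache {

    /** Hypothetical stand-in for a remote read of one fixed-size page. */
    interface PageFetcher {
        byte[] fetch(long pageIndex);
    }

    private final int lookahead;
    private final PageFetcher fetcher;
    private final Map<Long, byte[]> pages;

    /** Records every remote fetch, in order; kept only so the behavior is observable. */
    final List<Long> fetched = new ArrayList<>();

    LruPageCache(final int capacity, int lookahead, PageFetcher fetcher) {
        this.lookahead = lookahead;
        this.fetcher = fetcher;
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.pages = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > capacity; // evict the least recently used page
            }
        };
    }

    /** Returns the page, fetching it and the next `lookahead` pages on a miss. */
    byte[] read(long pageIndex) {
        byte[] page = pages.get(pageIndex); // a hit also refreshes recency
        if (page != null) {
            return page;
        }
        page = load(pageIndex);
        for (long i = 1; i <= lookahead; i++) {
            // containsKey does not count as an access, so recency is unchanged.
            if (!pages.containsKey(pageIndex + i)) {
                load(pageIndex + i);
            }
        }
        return page;
    }

    private byte[] load(long pageIndex) {
        byte[] data = fetcher.fetch(pageIndex);
        fetched.add(pageIndex);
        pages.put(pageIndex, data);
        return data;
    }
}
```

With a cache built as new LruPageCache(4, 1, fetcher), a read of page 0 fetches pages 0
and 1, and a following read of page 1 is served locally. In a binary search, the pages
holding the 1/2, 1/4, and 1/8 marks are touched by most lookups, so under LRU they tend
to stay resident.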

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
