hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-922) Optimize small reads and seeks
Date Tue, 30 Jan 2007 18:05:36 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12468720 ]

dhruba borthakur commented on HADOOP-922:
-----------------------------------------

I agree with your comments. The amount of data buffered by the receiving side of the TCP connection
could depend on the latency of the transfer and on the amount of memory available to the
sender and the receiver.

By default, the TCP send window size is usually 128KB and the receive window size is 4MB.
I propose changing the above patch to trigger the optimization only if the skip length
is <= 128KB.
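
As a rough illustration of the proposal (a sketch only, not the code in smallreadseek3.patch):
a small forward seek is served by reading and discarding bytes from the stream that is already
open, and anything larger, or any backward seek, falls back to re-opening the connection. The
class SmallSeekSketch, the method trySeekInPlace and the constant MAX_SKIP_BYTES are hypothetical
names, and a plain InputStream stands in for the datanode socket.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch; names and structure are assumptions, not the actual patch. */
final class SmallSeekSketch {
    /** Assumed threshold, taken from the 128KB figure proposed above. */
    static final long MAX_SKIP_BYTES = 128 * 1024;

    private final InputStream in; // stands in for the open datanode connection
    private long pos;             // current offset within the block

    SmallSeekSketch(InputStream in, long startPos) {
        this.in = in;
        this.pos = startPos;
    }

    long getPos() {
        return pos;
    }

    /** Reads one byte and advances the position, like InputStream.read(). */
    int read() throws IOException {
        int b = in.read();
        if (b >= 0) {
            pos++;
        }
        return b;
    }

    /**
     * Tries to satisfy a seek on the existing stream.
     * Returns true if the target was reached by discarding bytes;
     * returns false if the caller should re-open the connection instead
     * (backward seek, skip longer than the threshold, or end of stream).
     */
    boolean trySeekInPlace(long target) throws IOException {
        long diff = target - pos;
        if (diff < 0 || diff > MAX_SKIP_BYTES) {
            return false;
        }
        byte[] scratch = new byte[4096];
        long remaining = diff;
        while (remaining > 0) {
            int n = in.read(scratch, 0, (int) Math.min(scratch.length, remaining));
            if (n < 0) {
                return false; // stream ended before the target offset
            }
            remaining -= n;
            pos += n;
        }
        return true;
    }
}
```

A caller would invoke trySeekInPlace(target) first and only close and re-open the connection
when it returns false.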

> Optimize small reads and seeks
> ------------------------------
>
>                 Key: HADOOP-922
>                 URL: https://issues.apache.org/jira/browse/HADOOP-922
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.10.1
>            Reporter: dhruba borthakur
>         Assigned To: dhruba borthakur
>         Attachments: smallreadseek3.patch
>
>
> A seek on a DFSInputStream causes the next read to re-open the socket connection
> to the datanode and fetch the remainder of the block all over again. This is not optimal.
> A small read followed by a small positive seek could reuse the data already fetched
> from the datanode as part of the previous read.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

