hadoop-mapreduce-issues mailing list archives

From "Jothi Padmanabhan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1276) Shuffle connection logic needs correction
Date Mon, 03 May 2010 05:07:58 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863233#action_12863233 ]

Jothi Padmanabhan commented on MAPREDUCE-1276:
----------------------------------------------

bq. Why is the input stream drained at all

As per http://java.sun.com/j2se/1.5.0/docs/guide/net/http-keepalive.html, it is good practice
to read the entire response body once getInputStream returns successfully. So, if getInputStream
succeeds but the data turns out to be corrupt (say, a length mismatch), it is probably
a good idea to drain the remaining data from the socket so that the connection can be reused.
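
To illustrate what reading the rest of the response could look like, here is a minimal sketch. The class name {{DrainSketch}} and the helper {{drain}} are hypothetical and are not taken from {{Fetcher}} or from the attached patch; the sketch only assumes a plain {{java.io.InputStream}} such as the one returned by {{getInputStream}}.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only; not the Hadoop Fetcher code.
final class DrainSketch {

    /**
     * Reads and discards whatever is left in the stream.
     * Returns the number of bytes that were thrown away.
     */
    static long drain(InputStream in) throws IOException {
        byte[] buf = new byte[4096];
        long discarded = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            discarded += n;
        }
        return discarded;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a response body that was only partly consumed
        // before a length mismatch was detected.
        InputStream body = new ByteArrayInputStream(new byte[10000]);
        body.read(new byte[100]);          // pretend 100 bytes were already read
        System.out.println("discarded " + drain(body) + " bytes");
    }
}
```

In a real fetcher the stream would be closed after draining; whether draining is worth the extra reads for a large corrupt response is a separate trade-off.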

> Shuffle connection logic needs correction 
> ------------------------------------------
>
>                 Key: MAPREDUCE-1276
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1276
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 0.21.0
>            Reporter: Jothi Padmanabhan
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: patch-1276.txt
>
>
> While looking at the code with Amareshwari, we realized that {{Fetcher#copyFromHost}}
marks the connection as successful when {{url.openConnection}} returns. This is wrong. The connection
is actually established implicitly inside {{getInputStream}}; we need to split {{getInputStream}} into
{{connect}} and {{getInputStream}} to handle the connection and read timeout strategies correctly.
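
The split described above might look roughly like the following sketch. The names {{ConnectThenRead}}, {{Stage}} and {{fetch}} are hypothetical and the timeout values are arbitrary; this is not the code from patch-1276.txt. It relies only on standard {{java.net.URLConnection}} behaviour: {{openConnection}} creates the connection object without doing network I/O, {{connect}} opens the socket, and {{getInputStream}} reads the response.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical illustration only; not the Hadoop Fetcher code.
final class ConnectThenRead {

    /** The stage that failed, or DONE if the whole fetch worked. */
    enum Stage { CONNECT, READ, DONE }

    static Stage fetch(URL url, int connectTimeoutMs, int readTimeoutMs) {
        Stage stage = Stage.CONNECT;
        try {
            URLConnection conn = url.openConnection(); // object only; no network I/O yet
            conn.setConnectTimeout(connectTimeoutMs);
            conn.setReadTimeout(readTimeoutMs);
            conn.connect();                            // the socket is opened here
            stage = Stage.READ;                        // only now is the connection known to be up
            try (InputStream in = conn.getInputStream()) {
                byte[] buf = new byte[4096];
                while (in.read(buf) != -1) {
                    // discard; a real fetcher would copy the map output here
                }
            }
            return Stage.DONE;
        } catch (IOException e) {
            return stage;                              // tells the caller which stage failed
        }
    }

    public static void main(String[] args) throws IOException {
        // Grab a free port and release it, so that nothing is listening on it.
        ServerSocket probe = new ServerSocket(0);
        int port = probe.getLocalPort();
        probe.close();
        URL url = URI.create("http://127.0.0.1:" + port + "/").toURL();
        // The failure should be attributed to the connect stage.
        System.out.println(fetch(url, 1000, 1000));
    }
}
```

Separating the two calls lets a caller apply one back-off policy when the host cannot be reached and a different one when the host accepts the connection but stalls while sending data.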

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

