hadoop-hdfs-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-941) Datanode xceiver protocol should allow reuse of a connection
Date Tue, 07 Jun 2011 05:19:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045278#comment-13045278 ]

stack commented on HDFS-941:
----------------------------

I took a look at the patch.  It looks good to me.  Minor comments below.  Meantime I've patched
it into a Hadoop 0.22 and am running a load on it overnight to see if I can find problems.

What is this about?

+    <dependency org="com.google.collections" name="google-collections" rev="${google-collections.version}" conf="common->default"/>

When I go to the google-collections home page it says:

{code}
This library was renamed to Guava!
What you see here is ancient and unmaintained. Do not use it.
{code}

Nice doc. changes in BlockReader.

If you make another version of this patch, change the mentions of getEOS in comments to be
'eos' to match the change of variable name.

When you create a socket inside getBlockReader, you've added this:

{code}
+        sock.setTcpNoDelay(true);
{code}

to the socket config before connect.  Is that intentional?  (This is new with this patch.
Also, the old code used to set the timeout after making the connection -- which seems off... in
your patch you set the timeout, then connect).
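
A minimal sketch of the configure-then-connect ordering described above, not the patch's actual code. The class name, the helper `open`, and the use of `setSoTimeout` as "the timeout" are assumptions for illustration; only `setTcpNoDelay(true)` comes from the patch excerpt.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class ConfigureThenConnect {
    /**
     * Hypothetical helper: applies socket options to an unconnected
     * socket and only then connects, so the options are in effect
     * for the whole life of the connection.
     */
    static Socket open(InetSocketAddress addr, int timeoutMs) throws IOException {
        Socket sock = new Socket();        // created unconnected
        try {
            sock.setTcpNoDelay(true);      // disable Nagle before connecting
            sock.setSoTimeout(timeoutMs);  // read timeout (assumed meaning of "timeout")
            sock.connect(addr, timeoutMs); // connect timeout reuses the same value
            return sock;
        } catch (IOException e) {
            sock.close();                  // do not leak a half-configured socket
            throw e;
        }
    }
}
```

Setting options first means no read can ever run on the connected socket without the timeout in place, which is the gap the old set-after-connect ordering left open.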

Do you think 16 is a good size for the socket cache (it doesn't seem easily changeable)?
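
One way the cache size could be made changeable is to pass the capacity in rather than hard-coding it. The sketch below is an assumption-laden illustration, not the cache in the patch: the class `BoundedCache`, its methods, and the one-entry-per-key policy are all hypothetical. It is generic over `Closeable` so it works with `java.net.Socket`.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded cache; the capacity is supplied by the caller. */
final class BoundedCache<K, V extends Closeable> {
    private final int capacity;
    // Insertion-ordered, so the first entry is always the oldest.
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>();

    BoundedCache(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Caches value; closes a displaced value and the oldest entry if over capacity. */
    synchronized void put(K key, V value) {
        closeQuietly(entries.remove(key)); // re-insert so this key becomes newest
        entries.put(key, value);
        if (entries.size() > capacity) {
            Iterator<Map.Entry<K, V>> it = entries.entrySet().iterator();
            V eldest = it.next().getValue();
            it.remove();
            closeQuietly(eldest);          // evicted connections are closed, not leaked
        }
    }

    /** Removes and returns the cached value for key, or null; the caller now owns it. */
    synchronized V take(K key) {
        return entries.remove(key);
    }

    synchronized int size() {
        return entries.size();
    }

    private static void closeQuietly(Closeable c) {
        if (c == null) {
            return;
        }
        try {
            c.close();
        } catch (IOException ignored) {
            // best effort: the entry is being discarded anyway
        }
    }
}
```

With something like this, `new BoundedCache<SocketAddress, Socket>(16)` would reproduce the current default, and the 16 could instead come from a configuration value.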

Nice cleanup of description in DataNode.java

One note is that this patch looks 'safe'; we default to closing the connection if anything
untoward happens, which should be just the behavior the DN had before this patch.

TestParallelRead is sweet.






> Datanode xceiver protocol should allow reuse of a connection
> ------------------------------------------------------------
>
>                 Key: HDFS-941
>                 URL: https://issues.apache.org/jira/browse/HDFS-941
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: bc Wong
>         Attachments: HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch, HDFS-941-3.patch,
HDFS-941-4.patch, HDFS-941-5.patch, HDFS-941-6.patch, HDFS-941-6.patch, HDFS-941-6.patch,
hdfs-941.txt, hdfs-941.txt, hdfs941-1.png
>
>
> Right now each connection into the datanode xceiver only processes one operation.
> In the case that an operation leaves the stream in a well-defined state (e.g. a client
> reads to the end of a block successfully) the same connection could be reused for a second
> operation. This should improve random read performance significantly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
