hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-941) Datanode xceiver protocol should allow reuse of a connection
Date Mon, 06 Jun 2011 04:11:47 GMT

     [ https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-941:
-----------------------------

    Attachment: hdfs-941.txt

Attached patch rebased against trunk and improved in the following ways:

- moved the SocketCache instance into DFSClient instead of keeping it static
- fixed bugs that could have caused ConcurrentModificationExceptions in SocketCache itself
-- the eviction code now removes entries via the iterator's remove method
- made SocketCache.size() synchronized
- renamed BlockSender.blockReadFully to sentEntireByteRange, and set it to true as soon as
the entire requested length has been sent. This is necessary so that the client and server
agree on when a status code is expected.
- DataXceiver: renamed sockReuseTimeout to socketKeepaliveTimeout - I think this is a
slightly clearer name
- fixed some assertions in the new tests to use JUnit assertions instead of Java assertions
(as suggested in a comment above)
- changed TestParallelRead to disable the clienttrace log, since jstack showed it was
causing a lot of contention
- a couple of miscellaneous style cleanups
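The SocketCache concurrency fixes above follow a standard pattern: remove expired entries through the collection's own iterator rather than mutating the list while looping over it (which throws ConcurrentModificationException), and synchronize size() along with the mutators. A minimal sketch of that pattern (class and method names here are illustrative, not the actual SocketCache code):

```java
import java.util.Iterator;
import java.util.LinkedList;

// Illustrative cache of timestamped entries. Eviction goes through
// Iterator.remove(), which is the only safe way to remove elements
// from the list while iterating over it.
class ExpiringCache {
    private final LinkedList<Long> entries = new LinkedList<>(); // creation times, ms
    private final long maxAgeMs;

    ExpiringCache(long maxAgeMs) { this.maxAgeMs = maxAgeMs; }

    synchronized void put(long createdAtMs) { entries.add(createdAtMs); }

    // Drop entries older than maxAgeMs.
    synchronized void evictExpired(long nowMs) {
        Iterator<Long> it = entries.iterator();
        while (it.hasNext()) {
            if (nowMs - it.next() > maxAgeMs) {
                it.remove(); // safe: removal goes through the iterator
            }
        }
    }

    // size() is synchronized too, so readers see a consistent count
    // relative to concurrent put()/evictExpired() calls.
    synchronized int size() { return entries.size(); }
}

public class Main {
    public static void main(String[] args) {
        ExpiringCache cache = new ExpiringCache(100);
        cache.put(0);    // expired at t=1000
        cache.put(950);  // still fresh at t=1000
        cache.evictExpired(1000);
        System.out.println(cache.size()); // prints 1
    }
}
```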

I also ran TestParallelRead before and after the patch, with N_ITERATIONS bumped to
10240 and the proportion of non-positional reads set to 0. The results are:

*without patch*:
11/06/05 20:32:54 INFO hdfs.TestParallelRead: === Report: 4 threads read 2619994 KB (across
1 file(s)) in 25.762s; average 101699.94565639313 KB/s
11/06/05 20:33:34 INFO hdfs.TestParallelRead: === Report: 16 threads read 10470506 KB (across
1 file(s)) in 40.583s; average 258002.26695907154 KB/s
11/06/05 20:34:00 INFO hdfs.TestParallelRead: === Report: 8 threads read 5232371 KB (across
2 file(s)) in 25.484s; average 205319.8477476063 KB/s


*with patch*:
11/06/05 20:35:45 INFO hdfs.TestParallelRead: === Report: 4 threads read 2626843 KB (across
1 file(s)) in 10.208s; average 257331.7985893417 KB/s
11/06/05 20:36:13 INFO hdfs.TestParallelRead: === Report: 16 threads read 10492178 KB (across
1 file(s)) in 27.046s; average 387938.25334615103 KB/s
11/06/05 20:36:25 INFO hdfs.TestParallelRead: === Report: 8 threads read 5236253

> Datanode xceiver protocol should allow reuse of a connection
> ------------------------------------------------------------
>
>                 Key: HDFS-941
>                 URL: https://issues.apache.org/jira/browse/HDFS-941
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: bc Wong
>         Attachments: HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch, HDFS-941-3.patch,
HDFS-941-4.patch, HDFS-941-5.patch, HDFS-941-6.patch, HDFS-941-6.patch, HDFS-941-6.patch,
hdfs-941.txt, hdfs941-1.png
>
>
> Right now each connection into the datanode xceiver only processes one operation.
> In the case that an operation leaves the stream in a well-defined state (e.g. a client
reads to the end of a block successfully) the same connection could be reused for a second
operation. This should improve random read performance significantly.
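The server-side shape of the reuse described above can be sketched as a loop: instead of handling exactly one operation and closing the socket, the xceiver keeps reading opcodes until the peer closes the stream (on a real socket, a keepalive timeout with no new opcode would also end the loop). This is an illustrative sketch, not the actual DataXceiver code:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Illustrative sketch of a reusable xceiver loop (not the real
// DataXceiver): handle one operation, then loop back and wait for
// the client to send another opcode on the same connection.
class ReusableXceiver {
    private int opsHandled = 0;

    // Returns the number of operations served on this connection.
    int serve(InputStream rawIn) throws IOException {
        DataInputStream in = new DataInputStream(rawIn);
        while (true) {
            byte op;
            try {
                op = in.readByte(); // next opcode, if the client reuses the connection
            } catch (EOFException e) {
                break;              // client closed the connection
            }
            processOp(op, in);      // handle one operation, then loop
        }
        return opsHandled;
    }

    private void processOp(byte op, DataInputStream in) {
        opsHandled++; // a real xceiver would dispatch on the opcode here
    }
}
```

With this structure a client that issues three reads back-to-back reuses one connection instead of opening three, which is where the random-read speedup comes from.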

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
