hadoop-hdfs-issues mailing list archives

From "bc Wong (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-941) Datanode xceiver protocol should allow reuse of a connection
Date Thu, 08 Apr 2010 23:28:36 GMT

    [ https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855205#action_12855205 ]

bc Wong commented on HDFS-941:
------------------------------

I replaced the size-of-one cache with a more generic cache, which is also a global shared
cache. There is a new TestParallelRead, which tests the use of a DFSInputStream
by concurrent readers. There's a clear speed difference with vs. without the patch. Each
thread does 1024 reads.
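
To illustrate the idea of a global shared cache (as opposed to a size-of-one cache), here is a minimal sketch. It is not the code in the patch: the class name ConnectionCache, the capacity limit, and the oldest-first eviction policy are all assumptions made for illustration. Idle connections are keyed by peer address, a reader takes one out before issuing an operation, and puts it back once the stream is in a well-defined state.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.AbstractMap;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.Map;

/**
 * Hypothetical sketch of a bounded, thread-safe cache of idle connections,
 * shared by all readers and keyed by peer address. Not the HDFS-941 code.
 */
final class ConnectionCache<K, C extends Closeable> {
    private final int capacity;
    // Idle entries in insertion order; the oldest entry is at the head.
    private final Deque<Map.Entry<K, C>> idle = new ArrayDeque<>();

    ConnectionCache(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.capacity = capacity;
    }

    /** Removes and returns an idle connection for the key, or null if none is cached. */
    synchronized C take(K key) {
        Iterator<Map.Entry<K, C>> it = idle.iterator();
        while (it.hasNext()) {
            Map.Entry<K, C> entry = it.next();
            if (entry.getKey().equals(key)) {
                it.remove();
                return entry.getValue();
            }
        }
        return null;
    }

    /** Caches an idle connection; if the cache is full, the oldest entry is evicted and closed. */
    void put(K key, C connection) {
        C evicted = null;
        synchronized (this) {
            if (idle.size() >= capacity) {
                evicted = idle.pollFirst().getValue();
            }
            idle.addLast(new AbstractMap.SimpleImmutableEntry<>(key, connection));
        }
        if (evicted != null) {
            try {
                evicted.close(); // close outside the lock to keep the critical section short
            } catch (IOException ignored) {
                // best effort: the connection is being discarded anyway
            }
        }
    }

    synchronized int size() {
        return idle.size();
    }
}
```

A caller would try take(address) first and open a new connection only on a null result. A taken connection is owned by exactly one reader until it is put back, which is what makes sharing the cache across threads safe.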

Trunk:
{noformat}
Report: 4 threads read 236953 KB (across 1 file(s)) in 5.879s; average 40304.98384078925 KB/s
Report: 4 threads read 238873 KB (across 1 file(s)) in 5.063s; average 47180.13035749556 KB/s
Report: 4 threads read 236068 KB (across 1 file(s)) in 5.93s; average 39809.10623946037 KB/s
Report: 16 threads read 942666 KB (across 1 file(s)) in 13.524s; average 69703.19432120674 KB/s
Report: 16 threads read 947015 KB (across 1 file(s)) in 13.401s; average 70667.48750093277 KB/s
Report: 16 threads read 948768 KB (across 1 file(s)) in 12.932s; average 73365.91401175379 KB/s
Report: 8 threads read 469529 KB (across 2 file(s)) in 5.436s; average 86373.98822663723 KB/s
Report: 8 threads read 455428 KB (across 2 file(s)) in 5.363s; average 84920.38038411336 KB/s
Report: 8 threads read 469005 KB (across 2 file(s)) in 5.713s; average 82094.34622790127 KB/s
{noformat}

Patched:
{noformat}
Report: 4 threads read 236845 KB (across 1 file(s)) in 3.612s; average 65571.70542635658 KB/s
Report: 4 threads read 238803 KB (across 1 file(s)) in 4.371s; average 54633.49347975291 KB/s
Report: 4 threads read 240241 KB (across 1 file(s)) in 4.395s; average 54662.34357224119 KB/s
Report: 16 threads read 938652 KB (across 1 file(s)) in 9.044s; average 103787.26227333037 KB/s
Report: 16 threads read 943999 KB (across 1 file(s)) in 8.59s; average 109895.11059371362 KB/s
Report: 16 threads read 938546 KB (across 1 file(s)) in 9.081s; average 103352.71445876005 KB/s
Report: 8 threads read 478534 KB (across 2 file(s)) in 3.376s; average 141745.85308056872 KB/s
Report: 8 threads read 467412 KB (across 2 file(s)) in 3.623s; average 129012.42064587357 KB/s
Report: 8 threads read 475349 KB (across 2 file(s)) in 3.49s; average 136203.15186246418 KB/s
Report: 8 threads read 475349 KB (across 2 file(s)) in 3.49s; average 136203.15186246418 KB/s
{noformat}
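
For anyone comparing runs, the "average" in each Report line appears to be just the KB read divided by the elapsed seconds. The sketch below parses a line in the format shown above and recomputes that figure. ReportLine is a hypothetical helper written for this illustration, not part of TestParallelRead.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: parses one "Report:" line and recomputes its throughput. */
final class ReportLine {
    // Matches the report format shown above; the trailing average is not captured.
    private static final Pattern REPORT = Pattern.compile(
        "Report: (\\d+) threads read (\\d+) KB \\(across (\\d+) file\\(s\\)\\) in ([0-9.]+)s; average .*");

    final int threads;
    final long kilobytes;
    final int files;
    final double seconds;

    private ReportLine(int threads, long kilobytes, int files, double seconds) {
        this.threads = threads;
        this.kilobytes = kilobytes;
        this.files = files;
        this.seconds = seconds;
    }

    static ReportLine parse(String line) {
        Matcher m = REPORT.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("not a report line: " + line);
        }
        return new ReportLine(
            Integer.parseInt(m.group(1)),
            Long.parseLong(m.group(2)),
            Integer.parseInt(m.group(3)),
            Double.parseDouble(m.group(4)));
    }

    /** Aggregate throughput across all threads, in KB/s. */
    double kbPerSecond() {
        return kilobytes / seconds;
    }
}
```

Dividing kbPerSecond() of a patched run by that of a trunk run with the same thread and file counts gives the speedup for that configuration.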

bq. The edits to the docs in DataNode.java are good - if possible they should probably move
into HDFS-1001 though, no?
The addition to the docs doesn't apply to HDFS-1001, in which the DataXceiver still actively
closes all sockets after each use.

Todd, the new patch addresses the rest of your comments.


> Datanode xceiver protocol should allow reuse of a connection
> ------------------------------------------------------------
>
>                 Key: HDFS-941
>                 URL: https://issues.apache.org/jira/browse/HDFS-941
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: bc Wong
>         Attachments: HDFS-941-1.patch, HDFS-941-2.patch
>
>
> Right now each connection into the datanode xceiver only processes one operation.
> In the case that an operation leaves the stream in a well-defined state (eg a client
> reads to the end of a block successfully) the same connection could be reused for a second
> operation. This should improve random read performance significantly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

