hadoop-common-issues mailing list archives

From "Brahma Reddy Battula (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-445) Parallel data/socket writing for DFSOutputStream
Date Sat, 03 Jul 2010 18:28:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884959#action_12884959 ]

Brahma Reddy Battula commented on HADOOP-445:

I am using the 20.1 version. Using two threads, I run all the operations (write, read, copyFromLocal, ...)
continuously. After some time I get the following exception:

ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(,
storageID=DS-2095362429-, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready
for write. ch : java.nio.channels.SocketChannel[connected local=/ remote=/]

> Parallel data/socket writing for DFSOutputStream
> ------------------------------------------------
>                 Key: HADOOP-445
>                 URL: https://issues.apache.org/jira/browse/HADOOP-445
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.5.0
>            Reporter: Benjamin Reed
>            Assignee: Sameer Paranjpye
>         Attachments: fastClientWrite.patch
> Currently, as DFS clients output blocks they write the entire block to disk before starting
to transmit to the datanode. By writing to disk the client is able to retry a block write
if the datanode fails in the middle of a block transfer. Writing to disk and then to the datanode
adds latency. Hopefully, the common case is that block transfers to datanodes are successful.
This patch writes to the datanode and the disk in parallel. If the write to the datanode fails,
it falls back to current behavior.
> In my tests of transmits of 237M and 946M datasets using -copyFromLocal I'm seeing a
20-25% improvement in throughput.
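
The approach described in the issue above (send the block to the datanode while also spooling it to local disk, and fall back to resending from disk if the datanode write fails) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in fastClientWrite.patch: the names ParallelBlockWriter, DatanodeSink and writeBlock are hypothetical, the datanode connection is modelled as a plain OutputStream, and for brevity the two writes are interleaved on one thread rather than run on separate threads.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class ParallelBlockWriter {

    /** Hypothetical stand-in for opening a connection to a datanode. */
    public interface DatanodeSink {
        OutputStream open() throws IOException;
    }

    /**
     * Streams one block to the datanode while also spooling it to a local file.
     * If the datanode write fails at any point, the block is still spooled in
     * full and is then resent from disk over a fresh connection.
     *
     * @return true if the direct (fast) path succeeded, false if the
     *         disk-based fallback was used
     */
    public static boolean writeBlock(InputStream block, Path spool, DatanodeSink sink)
            throws IOException {
        boolean streaming = true;
        OutputStream remote = null;
        try (OutputStream local = Files.newOutputStream(spool)) {
            try {
                remote = sink.open();
            } catch (IOException e) {
                streaming = false; // could not connect: spool only
            }
            byte[] buf = new byte[64 * 1024];
            int n;
            while ((n = block.read(buf)) != -1) {
                local.write(buf, 0, n); // always keep the local copy
                if (streaming) {
                    try {
                        remote.write(buf, 0, n);
                    } catch (IOException e) {
                        streaming = false; // datanode failed: keep spooling
                    }
                }
            }
            if (streaming) {
                try {
                    remote.close(); // flush the direct transfer
                    remote = null;
                } catch (IOException e) {
                    streaming = false;
                }
            }
        } finally {
            if (remote != null) {
                try {
                    remote.close();
                } catch (IOException ignored) {
                    // the failed connection is being abandoned anyway
                }
            }
        }
        if (streaming) {
            return true;
        }
        // Fallback: the previous behavior, i.e. send the spooled block from disk.
        try (OutputStream retry = sink.open()) {
            Files.copy(spool, retry);
        }
        return false;
    }
}
```

In this sketch the local spool file is always written in full, so a failed direct transfer costs no more than the old write-to-disk-then-send path, while a successful one avoids waiting for the disk write to finish before transmission starts.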

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
