hadoop-hdfs-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1783) Ability for HDFS client to write replicas in parallel
Date Tue, 05 Jun 2012 05:27:24 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289160#comment-13289160 ]

stack commented on HDFS-1783:

Do you want to null out the socket in the array after you call close, so that the closeSocket
method is idempotent (presuming you check for a null socket before calling close on it)?
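
For illustration, a minimal sketch of the idempotent close suggested above. The class name
SocketArrayCloser, the isCleared helper, and the way exceptions are handled are all assumptions
made for this example; they are not taken from the patch.

```java
import java.io.IOException;
import java.net.Socket;

// Hypothetical holder class for this sketch; not the class from the patch.
public class SocketArrayCloser {
    private final Socket[] sockets;

    public SocketArrayCloser(Socket[] sockets) {
        this.sockets = sockets;
    }

    /** Closes the socket in slot i, if any, and clears the slot so a repeat call is a no-op. */
    public void closeSocket(int i) {
        Socket s = sockets[i];
        if (s == null) {
            return; // already closed, or never opened
        }
        sockets[i] = null; // clear first so a failing close() leaves no stale reference
        try {
            s.close();
        } catch (IOException e) {
            // Sketch only: real code would log the failure.
        }
    }

    /** Returns true if slot i holds no socket. */
    public boolean isCleared(int i) {
        return sockets[i] == null;
    }
}
```

With the slot cleared, calling closeSocket twice on the same index, or on an index that was
never populated, does nothing the second time.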

Do we need that set method in MultiDataOutputStream?  Can it be immutable?  Ditto for MultiDataInputStream.
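
As a rough sketch of what an immutable variant could look like: the class below takes its
streams once in the constructor, copies the array, and exposes no setter. The name
ImmutableMultiOutput and its methods are hypothetical and only stand in for
MultiDataOutputStream; the real class in the patch may be shaped differently.

```java
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for MultiDataOutputStream; names are assumptions for this sketch.
public final class ImmutableMultiOutput {
    private final DataOutputStream[] streams;

    public ImmutableMultiOutput(DataOutputStream... streams) {
        // Defensive copy so later changes to the caller's array cannot affect this object.
        this.streams = streams.clone();
    }

    /** Number of underlying streams. */
    public int size() {
        return streams.length;
    }

    /** Writes the same int to every underlying stream, one after another. */
    public void writeInt(int v) throws IOException {
        for (DataOutputStream out : streams) {
            out.writeInt(v);
        }
    }
}
```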

Would it be worth factoring the common code where we write to the datanodes out into methods?
 E.g. where we get the ack from the pipeline, log it, and then up the seqno... and the later
bit where we process the response from the datanode?

Should this log line just be removed, and the log that follows changed so that it prints
'parallel=' instead of 'pipeline=' if the parallel flag is set?

Put these lines together?

+          Socket s;
+          s = createSocketForPipeline(nodes[curNode], pipelineDepth, dfsClient);
+          sockets[curNode] = s;

Is it OK to close the streams after we close the sockets?

Patch looks great.

> Ability for HDFS client to write replicas in parallel
> -----------------------------------------------------
>                 Key: HDFS-1783
>                 URL: https://issues.apache.org/jira/browse/HDFS-1783
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>            Reporter: dhruba borthakur
>            Assignee: Lars Hofhansl
>         Attachments: HDFS-1783-trunk-v2.patch, HDFS-1783-trunk-v3.patch, HDFS-1783-trunk.patch
> The current implementation of HDFS pipelines the writes to the three replicas. This introduces
some latency for realtime latency sensitive applications. An alternate implementation that
allows the client to write all replicas in parallel gives much better response times to these
