hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4386) RPC support for large data transfers.
Date Fri, 10 Oct 2008 17:47:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638605#action_12638605 ]

Raghu Angadi commented on HADOOP-4386:

> TransferTo and transferFrom are not async operations, but blocking operations.

It is mixed. Disk I/O is blocking, but socket I/O obeys the blocking mode of the socket. So
if you are transferring from a file to a socket, the read from the file blocks (though, unlike
readFully(), it need not read everything requested), and the write to the socket is
non-blocking when the socket is in non-blocking mode.
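
To make the mixed behavior concrete, here is a minimal Java sketch of a transferTo() loop that tolerates partial transfers. The class and method names (TransferToSketch, transferUntilStalled) are hypothetical and are not taken from Hadoop. The demo in main() writes to an in-memory channel rather than a real socket, which is an assumption made only to keep the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TransferToSketch {

    /**
     * Hypothetical helper: keeps calling transferTo() from the given file
     * position until the end of the file is reached or a call moves no bytes.
     * A zero-byte result is what a non-blocking socket with a full send
     * buffer would typically produce; the caller can resume later from the
     * returned position (for example, once a selector reports the socket
     * as writable).
     */
    public static long transferUntilStalled(FileChannel file, long position,
                                            WritableByteChannel target) throws IOException {
        long size = file.size();
        while (position < size) {
            // transferTo() may move fewer bytes than requested.
            long sent = file.transferTo(position, size - position, target);
            if (sent == 0) {
                break; // target cannot accept more right now; do not spin
            }
            position += sent;
        }
        return position;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("transfer-sketch", ".bin");
        try {
            Files.write(tmp, new byte[1000]);
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (FileChannel file = FileChannel.open(tmp, StandardOpenOption.READ)) {
                // Stand-in for a SocketChannel so the sketch runs anywhere.
                long end = transferUntilStalled(file, 0, Channels.newChannel(sink));
                System.out.println("resume position: " + end + ", bytes received: " + sink.size());
            }
        } finally {
            Files.delete(tmp);
        }
    }
}
```

With a real non-blocking SocketChannel as the target, the returned position would be saved and the transfer resumed on the next write-ready event instead of blocking a thread on the socket.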

> It seems that in order to eliminate extra sockets and threads we're forced to do at least
one buffer copy. Am I missing something?

Not necessarily. The main intention in HADOOP-3856 (which could apply here, if not initially)
is that the Datanode will have a fixed number of threads per partition, say 5. These threads
invoke transferTo(). As long as 5 threads can keep the disks busy, this does essentially as
well as thread-per-connection could. We will of course make '5' configurable with a good default.
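
The fixed-threads-per-partition idea could be sketched roughly as follows. This is an illustration only, not the HADOOP-3856 design: the class name PartitionTransferPools, the use of a path string as the partition key, and the use of java.util.concurrent executors are all assumptions.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PartitionTransferPools {
    // One fixed-size pool per partition, created lazily on first use.
    private final Map<String, ExecutorService> pools = new ConcurrentHashMap<>();
    private final int threadsPerPartition;

    public PartitionTransferPools(int threadsPerPartition) {
        this.threadsPerPartition = threadsPerPartition; // e.g. 5, meant to be configurable
    }

    /**
     * Queues a transfer task (for example, one that calls transferTo()) on
     * the pool that belongs to the given partition. At most
     * threadsPerPartition tasks run concurrently per partition, no matter
     * how many connections are open.
     */
    public <T> Future<T> submit(String partition, Callable<T> transfer) {
        ExecutorService pool = pools.computeIfAbsent(
                partition, p -> Executors.newFixedThreadPool(threadsPerPartition));
        return pool.submit(transfer);
    }

    /** Stops accepting new tasks; tasks already queued still run. */
    public void shutdown() {
        pools.values().forEach(ExecutorService::shutdown);
    }

    public static void main(String[] args) throws Exception {
        PartitionTransferPools pools = new PartitionTransferPools(5);
        // "/data1" is a made-up partition name used only for this demo.
        Future<String> result = pools.submit("/data1",
                () -> "ran on " + Thread.currentThread().getName());
        System.out.println(result.get());
        pools.shutdown();
    }
}
```

The point of the design is that the thread count scales with the number of disks rather than with the number of client connections.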

I will check the MR Shuffle protocol to see how it works now.

> RPC support for large data transfers.
> -------------------------------------
>                 Key: HADOOP-4386
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4386
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, ipc
>            Reporter: Raghu Angadi
> Currently HDFS has a socket-level protocol for serving HDFS data to clients. Clients
> do not use RPCs to read or write data. Fundamentally, there is no reason why this data
> transfer cannot use RPCs.
> This jira is a placeholder for porting Datanode transfers to RPC. This topic has been
> discussed in varying detail many times, most recently in the context of HADOOP-3856. There
> are quite a few issues to be resolved at both the API level and the implementation level.
> We should probably copy some of the comments from HADOOP-3856 to here.

