hadoop-hdfs-issues mailing list archives

From "Ahmed Mahran (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-8416) Short circuit remote reads from shared storage
Date Sat, 16 May 2015 19:05:00 GMT

     [ https://issues.apache.org/jira/browse/HDFS-8416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmed Mahran updated HDFS-8416:
-------------------------------
           Description: 
In a Hadoop cluster configuration that employs a shared storage system, HDFS read and write
operations are very expensive in terms of network bandwidth consumption.

For a DFS client to read a block from a remote datanode, the block is transmitted first from
the shared storage to the datanode, then from the datanode to the DFS client. Short-circuiting
the shared-storage-to-datanode hop and letting the client access the shared storage directly
would improve performance substantially.

This blog post describes the issue and provides a hack for the remote read.
http://www.badrit.com/blog/2015/3/20/hdfs-short-circuit-shared-storage-remote-read-hacking-the-hdfs-short-circuit-local-read-for-short-circuiting-remote-reads-from-a-shared-storage
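
To make the bandwidth argument above concrete, here is a minimal sketch that counts the bytes crossing the network for one remote block read, with and without the datanode hop. It assumes the shared storage is network-attached and that each hop carries the full block, and it ignores protocol overhead. The function name `network_bytes` and the 128 MiB block size are illustrative assumptions, not taken from the issue or from HDFS code.

```python
# Illustrative model only: counts bytes crossing the network for one block read.
# The 128 MiB block size is an assumed example value, not a figure from the issue.

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size in bytes


def network_bytes(block_size: int, short_circuit: bool) -> int:
    """Return the bytes sent over the network to deliver one block to a remote client."""
    storage_to_datanode = block_size  # hop 1: shared storage -> datanode
    datanode_to_client = block_size   # hop 2: datanode -> DFS client
    storage_to_client = block_size    # direct read: shared storage -> DFS client
    if short_circuit:
        return storage_to_client
    return storage_to_datanode + datanode_to_client


if __name__ == "__main__":
    normal = network_bytes(BLOCK_SIZE, short_circuit=False)
    direct = network_bytes(BLOCK_SIZE, short_circuit=True)
    print(f"via datanode: {normal} bytes, direct: {direct} bytes")
```

Under these assumptions, the proposed short circuit roughly halves the network traffic of a remote read; real savings would depend on the storage system and the network topology.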

  was:
In a Hadoop cluster configuration that employs a shared storage system, HDFS read and write
operations are very expensive in terms of network bandwidth consumption.

For a DFS client to read a block from a remote datanode, the block is transmitted first from
the shared storage to the datanode, then from the datanode to the DFS client. Short-circuiting
the shared-storage-to-datanode hop and letting the client access the shared storage directly
would improve performance substantially.

This document describes the issue and provides a hack for the remote read.
https://docs.google.com/document/d/16wvaFDN0R10jIX1vLlEJpJh-KhJR8YNO4Pt3v9FAfvQ

    External issue URL: http://www.badrit.com/blog/2015/3/20/hdfs-short-circuit-shared-storage-remote-read-hacking-the-hdfs-short-circuit-local-read-for-short-circuiting-remote-reads-from-a-shared-storage
 (was: https://docs.google.com/document/d/16wvaFDN0R10jIX1vLlEJpJh-KhJR8YNO4Pt3v9FAfvQ)

> Short circuit remote reads from shared storage
> ----------------------------------------------
>
>                 Key: HDFS-8416
>                 URL: https://issues.apache.org/jira/browse/HDFS-8416
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, hdfs-client, nfs, performance
>            Reporter: Ahmed Mahran
>
> In a Hadoop cluster configuration that employs a shared storage system, HDFS read and
> write operations are very expensive in terms of network bandwidth consumption.
>
> For a DFS client to read a block from a remote datanode, the block is transmitted first
> from the shared storage to the datanode, then from the datanode to the DFS client.
> Short-circuiting the shared-storage-to-datanode hop and letting the client access the
> shared storage directly would improve performance substantially.
>
> This blog post describes the issue and provides a hack for the remote read.
> http://www.badrit.com/blog/2015/3/20/hdfs-short-circuit-shared-storage-remote-read-hacking-the-hdfs-short-circuit-local-read-for-short-circuiting-remote-reads-from-a-shared-storage



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
