hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HADOOP-2154) Non-interleaved checksums would optimize block transfers.
Date Thu, 29 Nov 2007 21:37:43 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12546850 ]

shv edited comment on HADOOP-2154 at 11/29/07 1:36 PM:
-----------------------------------------------------------------------

Rajagopal, I do not see how the data:header ratio is decreasing here.

This issue is mainly about removing the interleaved buffer layout. Currently we partition
the original data into chunks, calculate a crc for each chunk, and create the following
buffer, which is subsequently transferred to a data-node:
| data chunk 1 | crc for data chunk 1 | data chunk 2 | crc for data chunk 2 | ... | data chunk n | crc for data chunk n |
I propose to change it [back] to
| the original data (+not+ partitioned into chunks) | crcs for the original data |
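
To make the two layouts concrete, here is a minimal Java sketch. It is not the actual DFS client code: the class name ChecksumLayouts, the helper names, and the choice of a 4-byte CRC32 per fixed-size range are assumptions made only for illustration. It also assumes that "crcs for the original data" still means one crc per fixed-size range, stored contiguously rather than between the data pieces.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical illustration class, not part of Hadoop.
class ChecksumLayouts {
    static final int CRC_SIZE = 4; // assumed: one 4-byte CRC32 per chunk

    // Number of chunks needed to cover dataLen bytes.
    static int chunkCount(int dataLen, int chunkSize) {
        return (dataLen + chunkSize - 1) / chunkSize;
    }

    static int crc32(byte[] data, int off, int len) {
        CRC32 crc = new CRC32();
        crc.update(data, off, len);
        return (int) crc.getValue();
    }

    // Current layout: | chunk 1 | crc 1 | chunk 2 | crc 2 | ...
    // Every data byte is copied into the new interleaved buffer.
    static byte[] interleaved(byte[] data, int chunkSize) {
        int n = chunkCount(data.length, chunkSize);
        ByteBuffer buf = ByteBuffer.allocate(data.length + n * CRC_SIZE);
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            buf.put(data, off, len);
            buf.putInt(crc32(data, off, len));
        }
        return buf.array();
    }

    // Proposed layout: the original data array is left untouched and the
    // crcs are collected in their own small array, to be sent after it.
    static byte[] checksums(byte[] data, int chunkSize) {
        int n = chunkCount(data.length, chunkSize);
        ByteBuffer buf = ByteBuffer.allocate(n * CRC_SIZE);
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            buf.putInt(crc32(data, off, len));
        }
        return buf.array();
    }
}
```

In this sketch interleaved() allocates |data| + |crc| bytes and copies all the data, while checksums() allocates only |crc| bytes.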

If you add a header before each data and crc chunk, then in the current approach you will
have 2*n headers, while in the proposed approach there will be only 2. So the data:header
ratio will increase: (|data| + |crc|) / 2n < (|data| + |crc|) / 2
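
As a worked example of this inequality, the sketch below plugs in numbers that are purely assumed for illustration (64 KB of data, 512-byte chunks, 4-byte crcs); none of them come from the issue itself. The class and method names are hypothetical.

```java
// Hypothetical illustration class, not part of Hadoop.
class HeaderRatio {
    // Payload bytes carried per header: (|data| + |crc|) / headers.
    static long bytesPerHeader(long dataLen, long crcLen, long headers) {
        return (dataLen + crcLen) / headers;
    }

    public static void main(String[] args) {
        long data = 64 * 1024;  // assumed |data|
        long chunk = 512;       // assumed chunk size
        long crcSize = 4;       // assumed bytes per crc
        long n = data / chunk;  // 128 chunks
        long crc = n * crcSize; // |crc| = 512 bytes

        // Current approach: 2n headers.
        System.out.println(bytesPerHeader(data, crc, 2 * n)); // prints 258
        // Proposed approach: 2 headers.
        System.out.println(bytesPerHeader(data, crc, 2));     // prints 33024
    }
}
```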

This should let us get rid of that extra buffer that is used to collect all the interleaved
pieces together.
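
One possible way to send the two pieces without the combined buffer is a gathering write, sketched below. This assumes the transfer goes through a java.nio GatheringByteChannel, which may not match how the client actually writes to the socket; the class GatheringSend and its send() method are hypothetical names.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.GatheringByteChannel;

// Hypothetical illustration class, not part of Hadoop.
class GatheringSend {
    // Writes the data followed by its crcs. ByteBuffer.wrap() only wraps
    // the existing arrays, so no combined buffer is allocated here.
    static long send(GatheringByteChannel out, byte[] data, byte[] crcs)
            throws IOException {
        ByteBuffer[] parts = { ByteBuffer.wrap(data), ByteBuffer.wrap(crcs) };
        long total = 0;
        // A single write() may be partial, so loop until both are drained.
        while (parts[0].hasRemaining() || parts[1].hasRemaining()) {
            total += out.write(parts);
        }
        return total;
    }
}
```

With a plain OutputStream the same effect would be two consecutive write() calls, one for the data and one for the crcs.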

And thus the issue is not about "writing the chunks to the socket directly", but rather about
removing chunks altogether.
IMO, this is related to both reads and writes. Maybe reads and writes should even share this
code.
Removing other redundant buffers is part of a different issue.

Eric, why do you think transferring crc before the data would require less RAM on the client?
If it does then it definitely makes sense to send crcs before the data bytes.

> Non-interleaved checksums would optimize block transfers.
> ---------------------------------------------------------
>
>                 Key: HADOOP-2154
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2154
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.14.0
>            Reporter: Konstantin Shvachko
>            Assignee: Rajagopal Natarajan
>             Fix For: 0.16.0
>
>
> Currently when a block is transferred to a data-node the client interleaves data chunks with the respective checksums.
> This requires creating an extra copy of the original data in a new buffer interleaved with the crcs.
> We can avoid extra copying if the data and the crc are fed to the socket one after another.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

