hadoop-common-dev mailing list archives

From "eric baldeschwieler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2154) Non-interleaved checksums would optimize block transfers.
Date Thu, 29 Nov 2007 07:26:43 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12546587 ]

eric baldeschwieler commented on HADOOP-2154:


We should not be doing these copies and interleaves if we can avoid them.
It would be a lot of change, but if we could move to a protocol where the client requests
a buffer of bytes, instead of bytes simply being pushed, we could start the response with
a list of CRCs, followed by the bytes.  I think this would require less RAM on the client side.
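
To make the "CRC list first, then the bytes" idea concrete, here is a minimal sketch. The
class name, the 512-byte chunk size and the header layout are all assumptions made for
illustration; none of it is Hadoop's actual wire format. The header is small and separate
from the data, so the receiver can check the bytes in the buffer where they landed.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical sketch: names and layout are illustrative, not Hadoop's protocol.
final class CrcFirstResponse {
    // Assumed number of data bytes covered by each CRC.
    static final int BYTES_PER_CHECKSUM = 512;

    /** Builds the header sent ahead of the data: [length][crc 0]...[crc n-1]. */
    static ByteBuffer buildHeader(byte[] data, int off, int len) {
        int chunks = (len + BYTES_PER_CHECKSUM - 1) / BYTES_PER_CHECKSUM;
        ByteBuffer header = ByteBuffer.allocate(4 + 4 * chunks);
        header.putInt(len);
        CRC32 crc = new CRC32();
        for (int pos = 0; pos < len; pos += BYTES_PER_CHECKSUM) {
            crc.reset();
            crc.update(data, off + pos, Math.min(BYTES_PER_CHECKSUM, len - pos));
            header.putInt((int) crc.getValue());
        }
        header.flip();
        return header;
    }

    /** Client side: checks the received bytes in place against the CRC list. */
    static boolean verify(ByteBuffer header, byte[] received) {
        ByteBuffer h = header.duplicate(); // leave the caller's position untouched
        int len = h.getInt();
        if (len != received.length) {
            return false;
        }
        CRC32 crc = new CRC32();
        for (int pos = 0; pos < len; pos += BYTES_PER_CHECKSUM) {
            crc.reset();
            crc.update(received, pos, Math.min(BYTES_PER_CHECKSUM, len - pos));
            if ((int) crc.getValue() != h.getInt()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] block = new byte[1300];
        for (int i = 0; i < block.length; i++) {
            block[i] = (byte) i;
        }
        ByteBuffer header = buildHeader(block, 0, block.length);
        System.out.println("header bytes: " + header.remaining());
        System.out.println("verified: " + verify(header, block));
    }
}
```

With this layout the sender never builds an interleaved copy; whether it really lowers
client RAM use would depend on how the real client buffers its reads.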

Can we just memory-map the block and then copy the requested chunk directly to the socket,
or use other tricks to reduce copies further?  (I'm naive about NIO.)
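
For what it's worth, both tricks can be sketched with standard java.nio calls. The class
and method names below are hypothetical, and the sketch assumes a blocking target channel
and a range that lies inside the file and is under 2 GB (the limit for a single mapping).
FileChannel.map gives a read-only view of the requested range; FileChannel.transferTo asks
the OS to move the bytes itself, which on many systems can skip the user-space copy when
the target is a socket channel.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: not how the data-node is actually implemented.
final class BlockSender {
    /** Option 1: map the requested range read-only and write the mapping out. */
    static long sendMapped(Path blockFile, long offset, long length,
                           WritableByteChannel out) throws IOException {
        try (FileChannel in = FileChannel.open(blockFile, StandardOpenOption.READ)) {
            MappedByteBuffer region = in.map(FileChannel.MapMode.READ_ONLY, offset, length);
            long sent = 0;
            while (region.hasRemaining()) {
                sent += out.write(region);
            }
            return sent;
        }
    }

    /** Option 2: let the OS move the bytes from the file to the target channel. */
    static long sendTransferTo(Path blockFile, long offset, long length,
                               WritableByteChannel out) throws IOException {
        try (FileChannel in = FileChannel.open(blockFile, StandardOpenOption.READ)) {
            long sent = 0;
            while (sent < length) {
                long n = in.transferTo(offset + sent, length - sent, out);
                if (n <= 0) {
                    break; // end of file reached (a blocking target is assumed)
                }
                sent += n;
            }
            return sent;
        }
    }

    public static void main(String[] args) throws IOException {
        Path blockFile = Files.createTempFile("blk", ".dat");
        Files.write(blockFile, new byte[8192]);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Stand-in for a socket channel; a real transfer would use SocketChannel.
        WritableByteChannel out = Channels.newChannel(sink);
        System.out.println("mapped sent: " + sendMapped(blockFile, 1024, 2048, out));
        System.out.println("transferTo sent: " + sendTransferTo(blockFile, 1024, 2048, out));
        Files.deleteIfExists(blockFile);
    }
}
```

Neither option computes or sends checksums, so it only fits a layout where the CRCs
travel separately from the data.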

> Non-interleaved checksums would optimize block transfers.
> ---------------------------------------------------------
>                 Key: HADOOP-2154
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2154
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.14.0
>            Reporter: Konstantin Shvachko
>            Assignee: Rajagopal Natarajan
>             Fix For: 0.16.0
> Currently, when a block is transferred to a data-node, the client interleaves data chunks with their respective checksums.
> This requires creating an extra copy of the original data in a new buffer, interleaved with the CRCs.
> We can avoid the extra copy if the data and the CRCs are fed to the socket one after another.
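
One way to feed the CRCs and the data to the socket one after another without building a
combined buffer is a gathering write. The sketch below is only an illustration: the class
name is hypothetical and it writes to a temporary file through FileChannel, which
implements GatheringByteChannel just as SocketChannel does. The two buffers stay separate
in memory and the channel drains them in order.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.GatheringByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: not the code path used by the DFS client.
final class GatheringSend {
    /** Writes all of crcs, then all of data, without copying them into one buffer. */
    static long send(GatheringByteChannel out, ByteBuffer crcs, ByteBuffer data)
            throws IOException {
        ByteBuffer[] parts = { crcs, data };
        long sent = 0;
        // A single call may write only part of the buffers, so loop until both are empty.
        while (crcs.hasRemaining() || data.hasRemaining()) {
            sent += out.write(parts);
        }
        return sent;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("gather", ".bin");
        ByteBuffer crcs = ByteBuffer.allocate(8);
        crcs.putInt(1).putInt(2); // placeholder checksum values
        crcs.flip();
        ByteBuffer data = ByteBuffer.wrap(new byte[1024]);
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            System.out.println("bytes written: " + send(ch, crcs, data));
        }
        Files.deleteIfExists(tmp);
    }
}
```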

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
