hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HADOOP-2758) Reduce memory copies when data is read from DFS
Date Fri, 15 Feb 2008 19:10:08 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12569375#action_12569375 ]

rangadi edited comment on HADOOP-2758 at 2/15/08 11:09 AM:
----------------------------------------------------------------

Comparison of a single instance of 'dfs -cat 5Gbfile > /dev/null' with 'cat 5Gbfile > /dev/null'.
All the data resides locally on a 4-disk RAID0 partition:

||  min:sec || cat || dfs -cat with 0.16 || dfs -cat with the patch ||
| run 1 | 2:40 | 3:44 | 3:24 |
| run 2 | 2:56 | 3:05 | 3:51 |
| run 3 | 3:01 | 3:18 | 2:51 |

What would you conclude? Both of the obvious conclusions are incorrect :
# dfs -cat is almost as good as simple cat.
# this patch does not help much.

 If we had a single disk partition, the numbers would be even closer.
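
One way to read the table above is to convert each min:sec value to seconds and compare the mean of each column with its run-to-run spread. The sketch below is only an illustration of that arithmetic, not part of the original measurement. The helper names (to_seconds, summarize) are made up for this example, and three runs per column are too few for any real statistics.

```python
from statistics import mean

# Wall-clock timings copied from the table above, as "min:sec" strings.
TIMINGS = {
    "cat": ["2:40", "2:56", "3:01"],
    "dfs -cat with 0.16": ["3:44", "3:05", "3:18"],
    "dfs -cat with the patch": ["3:24", "3:51", "2:51"],
}


def to_seconds(min_sec: str) -> int:
    """Convert a 'min:sec' string to a number of seconds."""
    minutes, seconds = min_sec.split(":")
    return int(minutes) * 60 + int(seconds)


def summarize(timings):
    """Return the mean and the max-minus-min spread (in seconds) per column."""
    summary = {}
    for name, runs in timings.items():
        secs = [to_seconds(run) for run in runs]
        summary[name] = {"mean": mean(secs), "spread": max(secs) - min(secs)}
    return summary


if __name__ == "__main__":
    for name, stats in summarize(TIMINGS).items():
        print(f"{name:<24} mean={stats['mean']:6.1f}s  spread={stats['spread']}s")
```

With these numbers the two dfs -cat columns differ by less than a second on average, while individual runs within each of those columns vary by 39 to 60 seconds. Wall-clock time from three runs therefore cannot separate the two versions.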







  
> Reduce memory copies when data is read from DFS
> -----------------------------------------------
>
>                 Key: HADOOP-2758
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2758
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>             Fix For: 0.17.0
>
>         Attachments: HADOOP-2758.patch
>
>
> Currently the datanode and client parts of DFS perform multiple copies of data on the 'read path' (i.e. the path from storage on the datanode to the user buffer on the client). This jira reduces these copies by enhancing the data read protocol and the implementation of read on both the datanode and the client. I will describe the changes in the next comment.
> The requirement is that this fix should reduce the CPU used and should not cause a regression in any benchmarks. It might not improve the benchmarks, since most benchmarks are not CPU bound.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

