hadoop-common-dev mailing list archives

From "Suresh Antony (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-8065) distcp should have an option to compress data while copying.
Date Mon, 13 Feb 2012 18:43:04 GMT
distcp should have an option to compress data while copying.

                 Key: HADOOP-8065
                 URL: https://issues.apache.org/jira/browse/HADOOP-8065
             Project: Hadoop Common
          Issue Type: Improvement
          Components: fs
    Affects Versions: 0.20.2
            Reporter: Suresh Antony
            Priority: Minor
             Fix For: 0.20.2

We would like to compress data while transferring it from our source system to our target system.
One way to do this is to write a map/reduce job that compresses the data before or after the transfer.
This looks inefficient, since it adds an extra pass over the data.
Since distcp is already reading and writing the data, it would be better if it could compress the data while doing the copy.
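
As a rough illustration of the single-pass idea (not distcp code), the sketch below compresses a stream while copying it, using Python's standard gzip module. The function name copy_compressed and the in-memory streams are assumptions made for the example; distcp itself is a Java map/reduce tool and would use Hadoop's own compression codecs.

```python
import gzip
import io
import shutil


def copy_compressed(src, dst, chunk_size=64 * 1024):
    """Copy bytes from src to dst, gzip-compressing them in the same pass.

    Hypothetical helper for illustration only; not part of distcp.
    """
    # GzipFile wraps the destination stream, so every chunk read from the
    # source is compressed as it is written. Closing the GzipFile flushes
    # the gzip trailer but leaves dst itself open.
    with gzip.GzipFile(fileobj=dst, mode="wb") as gz:
        shutil.copyfileobj(src, gz, chunk_size)


# In-memory streams stand in for the source and target file systems.
payload = b"log line\n" * 10_000
source = io.BytesIO(payload)
target = io.BytesIO()
copy_compressed(source, target)
```

The data is read once and written once, so no separate compression job is needed before or after the transfer.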

The flip side is that, with compression, the distcp -update option cannot compare file sizes before copying data, because the compressed target differs in size from the source.
It can only check whether the file exists.

So I propose that if the -compress option is given, the file size is not checked.
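
A minimal sketch of the proposed decision rule, assuming a simplified model in which -update otherwise compares only file sizes. The function should_copy and its parameters are hypothetical names for this example, not distcp internals.

```python
def should_copy(src_size, dst_size, compress):
    """Decide whether a file needs to be copied under the proposed rule.

    dst_size is None when the target file does not exist.
    Hypothetical model of the proposal; not actual distcp logic.
    """
    if dst_size is None:
        # Missing target: always copy.
        return True
    if compress:
        # Compressed target sizes are not comparable to source sizes,
        # so only existence is checked and the file is skipped.
        return False
    # Without compression, a size mismatch triggers a copy.
    return src_size != dst_size
```

One consequence of this rule is that, with compression on, a source file that changed after the first copy would not be detected by the size check.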

Also, when a file is copied, the appropriate extension needs to be added to the file name, depending on the compression used.
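
The extension rule could look roughly like the sketch below. The codec keys and the target_name helper are assumptions for illustration; the extensions shown match the defaults of Hadoop's gzip, bzip2 and deflate codecs, but which codecs a -compress option would support is not specified in this issue.

```python
# Illustrative mapping from codec name to file extension (assumed keys).
CODEC_EXTENSIONS = {
    "gzip": ".gz",
    "bzip2": ".bz2",
    "deflate": ".deflate",
}


def target_name(path, codec):
    """Return the target path with the codec's extension appended.

    Hypothetical helper; raises KeyError for an unknown codec.
    """
    ext = CODEC_EXTENSIONS[codec]
    # Avoid doubling the extension if the name already carries it.
    return path if path.endswith(ext) else path + ext
```

For example, target_name("/logs/part-00000", "gzip") gives "/logs/part-00000.gz".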
