hadoop-common-issues mailing list archives

From "Rich Haase (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-1540) distcp should support an exclude list
Date Fri, 08 May 2015 19:25:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535330#comment-14535330 ]

Rich Haase commented on HADOOP-1540:

[~3opan] Thanks for the comments!  

#1 and #3: I'll fix those spacing issues in the next rev of the patch.

#2 and #4 were added because checkstyle failed if I didn't make those changes.  I'd have preferred
to leave them alone.  Maybe someone can comment on how to avoid these kinds of checkstyle
failures.

#5 You are absolutely right.  My initial pass at the patch used regex patterns.  I changed
that logic only because at the time I was doing exclusion filtering in the CopyMapper, and
compiling many regexes in every mapper was likely to be expensive with large filter lists.
Since we are now only filtering while building the CopyListing, it's probably not as big
a deal to use regex, although I am open to alternative suggestions.
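
To illustrate the trade-off described in #5, here is a minimal sketch of regex-based exclusion applied once while the list of paths is built, rather than in every mapper. It is not the code from the attached patches: the class name ExcludeFilter, its methods, and the use of plain strings for paths are all assumptions made for illustration, and it uses only java.util.regex rather than any Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not the class from the HADOOP-1540 patch. */
final class ExcludeFilter {
    // Each pattern is compiled exactly once, when the filter is constructed.
    private final List<Pattern> patterns = new ArrayList<>();

    ExcludeFilter(List<String> regexes) {
        for (String regex : regexes) {
            patterns.add(Pattern.compile(regex));
        }
    }

    /** Returns false if the whole path matches any exclusion pattern. */
    boolean shouldCopy(String path) {
        for (Pattern pattern : patterns) {
            if (pattern.matcher(path).matches()) {
                return false;
            }
        }
        return true;
    }

    /** Keeps only the candidate paths that no exclusion pattern matches. */
    List<String> filter(List<String> candidates) {
        List<String> kept = new ArrayList<>();
        for (String path : candidates) {
            if (shouldCopy(path)) {
                kept.add(path);
            }
        }
        return kept;
    }
}
```

Because the patterns are compiled in the constructor, a filter built once during listing pays the compilation cost a single time; constructing the same filter inside each mapper would repeat that cost per task, which is the expense mentioned above.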

> distcp should support an exclude list
> -------------------------------------
>                 Key: HADOOP-1540
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1540
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: util
>    Affects Versions: 2.6.0
>            Reporter: Senthil Subramanian
>            Assignee: Rich Haase
>            Priority: Minor
>              Labels: BB2015-05-TBR, patch
>         Attachments: HADOOP-1540.003.patch, HADOOP-1540.004.patch, HADOOP-1540.005.patch,
> There should be a way to ignore specific paths (e.g., those that have already been copied
> over under the current srcPath).

This message was sent by Atlassian JIRA
