hadoop-hdfs-issues mailing list archives

From "Mingliang Liu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-10397) Distcp should ignore -delete option if -diff option is provided instead of exiting
Date Sat, 14 May 2016 08:08:13 GMT

     [ https://issues.apache.org/jira/browse/HDFS-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mingliang Liu updated HDFS-10397:
    Attachment: HDFS-10397.002.patch

Thanks [~yzhangal] for the insightful suggestion. This motivated my v2 patch.

The pain in handling different option combinations comes from the fact that we validate and
set each option individually. This approach is unclear because it is 1) inefficient,
2) not well defined, and 3) error prone.
# For 1), we validate the options multiple times, which is neither necessary nor scalable.
# For 2), some options are set after validation while others are set without
validation. Distributing the decision of whether to validate across all the setters
smells bad to me.
# For 3), when we validate an option A, chances are that an option B it depends on is not set yet.
This implies that the order of setting options has to be chosen carefully, leading to fragile
code. Take {{syncFolder}} and {{skipCRC}} for example: skip CRC is valid only with
update options, so if we set (and thus validate) {{skipCRC}} before setting the {{syncFolder}}
option, the validation will fail even if both are provided on the command line.
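
The ordering problem in point 3) can be reproduced with a minimal sketch. The classes below are hypothetical, simplified stand-ins and not the real {{DistCpOptions}}; the exception message is illustrative only.

```java
// Hypothetical, simplified stand-in for DistCpOptions; not the real Hadoop class.
class SetterValidatedOptions {
    private boolean syncFolder;
    private boolean skipCRC;

    void setSyncFolder(boolean syncFolder) {
        this.syncFolder = syncFolder;
    }

    void setSkipCRC(boolean skipCRC) {
        // Validation inside the setter silently assumes syncFolder was set first.
        if (skipCRC && !syncFolder) {
            throw new IllegalArgumentException(
                "Skip CRC is valid only with update options");
        }
        this.skipCRC = skipCRC;
    }
}

class SetterOrderDemo {
    // Returns true if both options are accepted when set in the given order.
    static boolean accepts(boolean skipCrcFirst) {
        SetterValidatedOptions options = new SetterValidatedOptions();
        try {
            if (skipCrcFirst) {
                options.setSkipCRC(true);
                options.setSyncFolder(true);
            } else {
                options.setSyncFolder(true);
                options.setSkipCRC(true);
            }
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("syncFolder first accepted: " + accepts(false));
        System.out.println("skipCRC first accepted: " + accepts(true));
    }
}
```

The same pair of options is accepted or rejected depending only on the order of the setter calls.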

I think a better way is to validate all the options only once, after all of them are set,
i.e. in a central validation method. Moreover, the parser's job is to parse the options; it should
not validate option combinations itself if that work can be delegated
to the {{validate()}} method of {{DistCpOptions}}. Of course, if there are any parsing
errors in a single option (e.g. only one snapshot is provided for the {{-diff}} option), the
parser should throw the {{IllegalArgumentException}} directly.
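
A rough sketch of this split, under stated assumptions: {{CentralOptions}} and the toy parser are hypothetical and do not reflect the actual patch or the real {{DistCpOptions}} / {{OptionsParser}} code. Setters only store values, {{validate()}} checks combinations once, and the parser rejects only single-option parsing errors.

```java
// Hypothetical, simplified stand-in for DistCpOptions; not the real Hadoop class.
class CentralOptions {
    private boolean syncFolder;
    private boolean skipCRC;
    private String fromSnapshot;
    private String toSnapshot;

    // Setters only store values; none of them validates anything.
    void setSyncFolder(boolean syncFolder) { this.syncFolder = syncFolder; }
    void setSkipCRC(boolean skipCRC) { this.skipCRC = skipCRC; }
    void setDiff(String from, String to) { fromSnapshot = from; toSnapshot = to; }

    // Single place where option combinations are checked.
    void validate() {
        if (skipCRC && !syncFolder) {
            throw new IllegalArgumentException(
                "Skip CRC is valid only with update options");
        }
    }
}

class CentralValidationDemo {
    // Toy parser: handles per-option syntax errors itself and delegates
    // combination checks to validate() after every option has been set.
    static CentralOptions parse(String... args) {
        CentralOptions options = new CentralOptions();
        for (int i = 0; i < args.length; i++) {
            switch (args[i]) {
                case "-update":
                    options.setSyncFolder(true);
                    break;
                case "-skipcrccheck":
                    options.setSkipCRC(true);
                    break;
                case "-diff":
                    // Parsing error of a single option: thrown directly.
                    if (i + 2 >= args.length
                            || args[i + 1].startsWith("-")
                            || args[i + 2].startsWith("-")) {
                        throw new IllegalArgumentException(
                            "-diff needs two snapshot names");
                    }
                    options.setDiff(args[i + 1], args[i + 2]);
                    i += 2;
                    break;
                default:
                    throw new IllegalArgumentException("Unknown option: " + args[i]);
            }
        }
        options.validate();
        return options;
    }

    public static void main(String[] args) {
        parse("-skipcrccheck", "-update"); // order no longer matters
        System.out.println("options accepted");
    }
}
```

With validation deferred to one call, {{-skipcrccheck -update}} and {{-update -skipcrccheck}} behave the same.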

What do you think?

Ping [~jingzhao] for more input.

> Distcp should ignore -delete option if -diff option is provided instead of exiting
> ----------------------------------------------------------------------------------
>                 Key: HDFS-10397
>                 URL: https://issues.apache.org/jira/browse/HDFS-10397
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Mingliang Liu
>            Assignee: Mingliang Liu
>         Attachments: HDFS-10397.000.patch, HDFS-10397.001.patch, HDFS-10397.002.patch
> In distcp, the {{-delete}} and {{-diff}} options are mutually exclusive. [HDFS-8828] introduced
strict checking, which makes existing applications (or scripts) that previously worked fine with
both {{-delete}} and {{-diff}} fail with the {{java.lang.IllegalArgumentException:
Diff is valid only with update options}} exception.
> To keep it backward compatible, we can ignore the {{-delete}} option when {{-diff}}
is given, instead of exiting the program. Along with that, we can print a warning message saying
that _Diff is valid only with update options, and -delete option is ignored_.
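
The behaviour proposed in the description above might look roughly like the following sketch. {{DeleteDiffOptions}} is a hypothetical class, not the real {{DistCpOptions}}, and warnings are collected in a list only to keep the example free of a logging dependency; real code would presumably go through a logger.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for DistCpOptions; not the real Hadoop class.
class DeleteDiffOptions {
    private boolean deleteMissing;
    private boolean useDiff;
    // Stand-in for a logger, so the sketch has no external dependencies.
    private final List<String> warnings = new ArrayList<>();

    void setDeleteMissing(boolean deleteMissing) { this.deleteMissing = deleteMissing; }
    void setUseDiff(boolean useDiff) { this.useDiff = useDiff; }
    boolean shouldDeleteMissing() { return deleteMissing; }
    List<String> getWarnings() { return warnings; }

    // Instead of throwing when both options are given, drop -delete and warn.
    void validate() {
        if (useDiff && deleteMissing) {
            deleteMissing = false;
            warnings.add("Diff is valid only with update options, "
                + "and -delete option is ignored");
        }
    }

    public static void main(String[] args) {
        DeleteDiffOptions options = new DeleteDiffOptions();
        options.setDeleteMissing(true);
        options.setUseDiff(true);
        options.validate();
        for (String warning : options.getWarnings()) {
            System.err.println("WARN: " + warning);
        }
    }
}
```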

