hadoop-common-issues mailing list archives

From "Mithun Radhakrishnan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12473) distcp's ignoring failures option should be mutually exclusive with the atomic option
Date Tue, 22 Dec 2015 19:39:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068624#comment-15068624 ]

Mithun Radhakrishnan commented on HADOOP-12473:

[~jira.shegalov], that is an interesting take. Hmm.

Between you and me, I think no one should be using {{-i}} at all, in atomic copies or otherwise.
It was included to be backward compatible with DistCpV1, for those with an inexplicable tolerance
for bad data. :]

{{-atomic}} was added so that users have the choice of staging their copies to a temp-location,
before atomically moving them to the target location. I guessed there might be users who'd
want to stage data before moving it, but could also tolerate bad copies. But I do see your
point of view.

{{-i}} could be useful to work around annoying copy errors. For instance, there was a time
when {{-skipCrc}} wouldn't work correctly, and copying files with different block-sizes (or
empty files) would result in CRC failures. {{-i}} would let workflows complete while DistCp
was being fixed. Making the two options mutually exclusive would make this workaround
unavailable when {{-atomic}} is used.

I'm on the fence here, but tending in your direction. I'd be happy to go along, if you could
get another "Aye!" from a committer. Paging [~jlowe] and [~daryn].

> distcp's ignoring failures option should be mutually exclusive with the atomic option
> -------------------------------------------------------------------------------------
>                 Key: HADOOP-12473
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12473
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: tools/distcp
>    Affects Versions: 2.7.1
>            Reporter: Mingliang Liu
>            Assignee: Mingliang Liu
>             Fix For: 2.8.0
> In {{CopyMapper::handleFailure}}, the mapper handles a failure and will ignore it if the
> corresponding config key is on. The ignore-failures option {{-i}} should be mutually exclusive
> with the {{-atomic}} option; otherwise an incomplete dir is eligible for commit, defeating the purpose.
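
For illustration only, the proposed mutual exclusion could be enforced when the command line is parsed, along the lines of the sketch below. This is not the actual DistCp code or the HADOOP-12473 patch: the class {{DistCpFlagCheck}} and its {{validate}} method are hypothetical names, and only the {{-i}} and {{-atomic}} flags come from the discussion above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not part of Hadoop: rejects a command line that
// combines the ignore-failures flag (-i) with the atomic-commit flag (-atomic).
class DistCpFlagCheck {

    /** Throws IllegalArgumentException if both -i and -atomic are present. */
    static void validate(List<String> args) {
        boolean ignoreFailures = args.contains("-i");
        boolean atomicCommit = args.contains("-atomic");
        if (ignoreFailures && atomicCommit) {
            // With both flags set, a partially copied staging directory
            // could still be committed to the target location.
            throw new IllegalArgumentException(
                "-i (ignore failures) cannot be combined with -atomic");
        }
    }

    public static void main(String[] args) {
        // Either flag alone passes validation.
        validate(Arrays.asList("-atomic", "/src", "/dst"));
        validate(Arrays.asList("-i", "/src", "/dst"));
        try {
            validate(Arrays.asList("-i", "-atomic", "/src", "/dst"));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Failing at parse time, before any copy starts, means the job never reaches the point where an incomplete staged directory could be committed. The cost, as noted in the comment above, is that {{-i}} can no longer serve as a workaround for atomic copies.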

This message was sent by Atlassian JIRA
