nifi-issues mailing list archives

From "Evan Reynolds (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (NIFI-6367) FetchS3Processor responds to md5 error on download by doing download again, again, and again
Date Tue, 02 Jul 2019 00:30:00 GMT

    [ https://issues.apache.org/jira/browse/NIFI-6367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876494#comment-16876494 ]

Evan Reynolds edited comment on NIFI-6367 at 7/2/19 12:29 AM:
--------------------------------------------------------------

[~kefevs] - that did help! Thank you!

In your case it didn't throw one of the handled exceptions; it threw an exception type that
tells NiFi to reprocess the flowfile.

I added two extra error checks: a null check (I saw that happen when testing) and a check
of that exception to decide whether we should really retry:

https://github.com/apache/nifi/pull/3563

I think that will fix it up.
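
As a rough illustration of the two checks described above (not the code in the pull request), here is a minimal, self-contained sketch. Every name in it is hypothetical: ClientError stands in for an SDK client error such as a checksum mismatch, ReprocessSignal stands in for the exception type that makes the framework reprocess the flowfile, and Route is a simplified stand-in for processor relationships. The assumption is that a reprocess-type exception caused by a client error should go to failure instead of being retried.

```java
public class FetchRouting {
    /** Hypothetical stand-in for an SDK client error (e.g. checksum mismatch). Not an AWS class. */
    static class ClientError extends RuntimeException {
        ClientError(String message) { super(message); }
    }

    /** Hypothetical stand-in for an exception that normally triggers reprocessing. Not a NiFi class. */
    static class ReprocessSignal extends RuntimeException {
        ReprocessSignal(String message, Throwable cause) { super(message, cause); }
    }

    /** Simplified stand-in for processor relationships. */
    enum Route { SUCCESS, FAILURE, RETRY }

    /** Walks the cause chain and reports whether any link is a ClientError. */
    static boolean causedByClientError(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof ClientError) {
                return true;
            }
        }
        return false;
    }

    /**
     * Decides where a flowfile should go after a fetch attempt.
     * fetched: the downloaded object, possibly null; error: the exception thrown, or null.
     */
    static Route decide(Object fetched, RuntimeException error) {
        if (error != null) {
            // Check 2: only retry when the failure was not caused by a client error.
            return causedByClientError(error) ? Route.FAILURE : Route.RETRY;
        }
        // Check 1: a null result is treated as a failure, not as something to reprocess.
        if (fetched == null) {
            return Route.FAILURE;
        }
        return Route.SUCCESS;
    }

    public static void main(String[] args) {
        RuntimeException wrapped = new ReprocessSignal("write failed", new ClientError("md5 mismatch"));
        System.out.println(decide(null, wrapped));
        System.out.println(decide(null, new ReprocessSignal("disk busy", null)));
        System.out.println(decide(null, null));
        System.out.println(decide("payload", null));
    }
}
```

The point of the sketch is only the ordering of the checks: inspect the exception's cause before deciding to retry, and never treat a missing object as retryable by default.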


was (Author: evanthx):
[~kefevs] - that did help! Thank you!

In your case it didn't throw one of the handled exceptions; it threw an exception type that
tells NiFi to reprocess the flowfile.

I added two extra error checks: a null check (I saw that happen when testing) and a check
of that exception to decide whether we should really retry:
[https://github.com/apache/nifi/pull/3562]

I think that will fix it up.

> FetchS3Processor responds to md5 error on download by doing download again, again, and again
> --------------------------------------------------------------------------------------------
>
>                 Key: NIFI-6367
>                 URL: https://issues.apache.org/jira/browse/NIFI-6367
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.7.1
>         Environment: NiFi (CentOS 7.2) with FetchS3Object running against a non-public S3
>                      environment. The environment / S3 had errors that introduced md5 errors
>                      on fewer than 0.5% of downloads. Downloads with md5 errors accumulated
>                      in the input queue of the processor.
>            Reporter: Kefevs Pirkibo
>            Assignee: Evan Reynolds
>            Priority: Critical
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> (Six months old, but I don't see changes in the relevant parts of the code, though I might
> be mistaken. This might be hard to replicate, so I suggest a code wizard check whether this
> is still a problem.)
> Case: NiFi running with FetchS3Object processor(s) against a non-public S3 environment.
> The environment and S3 in combination had hardware errors that resulted in sporadic md5
> errors on the same files over and over again. Md5 errors resulted in an unhandled
> AmazonClientException, and the file was downloaded yet again. (Reverted to the input queue,
> first in line.) In our case this was identified after a number of days, with substantial
> bandwidth usage. It did not help that the FetchS3Objects were running with multiple
> instances and, over those days, accumulated the bad md5 checksum files for continuous
> download.
> Suggestion: Someone code-savvy should check what happens to files that are downloaded with
> a bad md5. If they are reverted to the queue due to an uncaught exception or other means,
> then this is still a potential problem.
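
To make the failure mode in the report concrete, here is a small sketch of a checksum-verified download with a bounded number of attempts, so that a persistently corrupt object is given up on rather than downloaded forever. It is an illustration only and does not reflect how FetchS3Object or the AWS SDK is implemented: the download supplier, the expectedMd5 parameter, and maxAttempts are all assumptions introduced for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.function.Supplier;

public class Md5BoundedFetch {
    /** Returns the lowercase hex MD5 digest of the given bytes. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Calls the (hypothetical) download supplier at most maxAttempts times.
     * Returns the body once its MD5 matches expectedMd5, or null if every attempt mismatched,
     * leaving it to the caller to route the item to failure instead of re-queueing it.
     */
    static byte[] fetchVerified(Supplier<byte[]> download, String expectedMd5, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            byte[] body = download.get();
            if (body != null && md5Hex(body).equalsIgnoreCase(expectedMd5)) {
                return body;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String expected = md5Hex("hello".getBytes(StandardCharsets.UTF_8));
        // A source that always returns corrupt data: the loop stops after three attempts.
        byte[] result = fetchVerified(() -> "corrupt".getBytes(StandardCharsets.UTF_8), expected, 3);
        System.out.println(result == null ? "gave up after bounded attempts" : "verified");
    }
}
```

The design choice here is that the retry budget lives with the item being fetched, so a file that fails its checksum every time cannot return to the front of the queue indefinitely.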



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
