From: ChengzhiZhao
To: issues@flink.apache.org
Reply-To: issues@flink.apache.org
Subject: [GitHub] flink pull request #5521: [FLINK-8599] Improve the failure behavior of the F...
Content-Type: text/plain
Date: Sun, 18 Feb 2018 21:10:05 +0000 (UTC)

GitHub user ChengzhiZhao opened a pull request:

    https://github.com/apache/flink/pull/5521

    [FLINK-8599] Improve the failure behavior of the FileInputFormat for …

## What is the purpose of the change

This pull request is intended to improve the failure behavior of the ContinuousFileReader. Currently, if a bad file (for example, a file with a different schema that was dropped into the folder) arrives in the monitored path, Flink retries several times. However, because the file path is persisted in the checkpoint, when people try to resume from an external checkpoint, the job throws the following error because the file can no longer be found, and the process cannot move forward.
`java.io.IOException: Error opening the Input Split s3a://myfile [0,904]: No such file or directory: s3a://myfile`

The change checks whether the path exists before opening the file. If the error occurs and the bad file is then removed, Flink should resume processing and continue.

## Brief change log

- *Add a file existence check before opening the file*

## Verifying this change

- *Manually verified the change by introducing a bad file while continuously monitoring the folder; after the bad file was removed, the process continued.*

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
- The S3 file system connector: (no)

## Documentation

- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not applicable)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ChengzhiZhao/flink Improve_failure_behavior_FileInputFormat

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5521.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5521

----
commit 6fa8ef212c536acee56b2e9831ec92d1059449ff
Author: Chengzhi Zhao
Date:   2018-02-18T18:23:32Z

    [FLINK-8599] Improve the failure behavior of the FileInputFormat for bad files

----

---
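
The existence check described under "What is the purpose of the change" could look roughly like the sketch below. This is not the code from the pull request: it uses plain `java.nio.file` on the local file system instead of Flink's file system classes, and the names `SplitOpenSketch`, `openIfExists`, and `readAll` are hypothetical. Skipping a missing file is one possible reading of "resume the process and continue", assumed here for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; not the FileInputFormat change from the pull request.
class SplitOpenSketch {

    /** Returns a reader for the file, or empty if the file no longer exists. */
    static Optional<BufferedReader> openIfExists(Path file) throws IOException {
        // Check before opening so a removed file does not fail the whole read.
        if (!Files.exists(file)) {
            return Optional.empty();
        }
        return Optional.of(Files.newBufferedReader(file, StandardCharsets.UTF_8));
    }

    /** Reads all lines from the given files, skipping files that are missing. */
    static List<String> readAll(List<Path> files) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Path file : files) {
            Optional<BufferedReader> maybeReader = openIfExists(file);
            if (!maybeReader.isPresent()) {
                System.err.println("Skipping missing file: " + file);
                continue;
            }
            try (BufferedReader reader = maybeReader.get()) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path present = Files.createTempFile("split", ".txt");
        Files.write(present, "a\nb\n".getBytes(StandardCharsets.UTF_8));

        // Simulate a bad file that was removed after its path was recorded.
        Path removed = Files.createTempFile("bad", ".txt");
        Files.delete(removed);

        System.out.println(readAll(Arrays.asList(removed, present)));
    }
}
```

A check like this is not atomic: the file can still disappear between the existence check and the open, so the open call itself may still throw an `IOException` that the caller has to handle.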