From: manishgupta88
To: issues@carbondata.apache.org
Reply-To: issues@carbondata.apache.org
Subject: [GitHub] carbondata pull request #996: [WIP] Executor lost failure in case of data lo...
Content-Type: text/plain
Date: Tue, 6 Jun 2017 08:41:58 +0000 (UTC)
archived-at: Tue, 06 Jun 2017 08:42:00 -0000

GitHub user manishgupta88 opened a pull request:

    https://github.com/apache/carbondata/pull/996

    [WIP] Executor lost failure in case of data load failure due to bad records

    Problem: Executors are lost when a data load fails because of bad records.

    Analysis: When data loads containing bad records are run repeatedly, the executor is eventually lost to an OOM error, and YARN later restarts the application. This happens because, when a load fails due to bad records, the executor throws an exception and the task keeps retrying until the maximum number of retry attempts is reached. As this cycle repeats, YARN eventually restarts the application.

    Fix: When the load failure is known to be caused by bad records, and is therefore an intentional failure raised by Carbon, the executor should not retry the load. It should complete the job gracefully, and the failure information should be handled by the driver.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/incubator-carbondata bad_record_failure_suppress

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/996.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #996

----
commit 771ac3d22d0585f4ef26a8e38c35fc7a353a0ccf
Author: manishgupta88
Date:   2017-06-06T06:48:35Z

    Problem: Executors are lost when a data load fails because of bad records.

    Analysis: When data loads containing bad records are run repeatedly, the executor is eventually lost to an OOM error, and YARN later restarts the application. This happens because, when a load fails due to bad records, the executor throws an exception and the task keeps retrying until the maximum number of retry attempts is reached. As this cycle repeats, YARN eventually restarts the application.

    Fix: When the load failure is known to be caused by bad records, and is therefore an intentional failure raised by Carbon, the executor should not retry the load. It should complete the job gracefully, and the failure information should be handled by the driver.

----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
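
The idea behind the fix in this pull request can be illustrated with a small, self-contained sketch. It is not the code from the patch. Every name in it (BadRecordLoadSketch, BadRecordException, TaskResult, LoadStatus, runTask, parseRow, loadSucceeded) is hypothetical and none of them is a CarbonData or Spark API. The sketch assumes a task that catches the intentional bad-record failure and returns it as a status, so the task finishes normally instead of throwing, and a driver-side check that fails the load once from the collected statuses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: none of these names are CarbonData or Spark APIs.
class BadRecordLoadSketch {

    /** Hypothetical marker for an intentional failure caused by bad records. */
    static class BadRecordException extends RuntimeException {
        BadRecordException(String message) {
            super(message);
        }
    }

    /** Outcome that a task reports to the driver instead of throwing. */
    enum LoadStatus { SUCCESS, FAILED_BAD_RECORDS }

    static final class TaskResult {
        final LoadStatus status;
        final String detail;

        TaskResult(LoadStatus status, String detail) {
            this.status = status;
            this.detail = detail;
        }
    }

    /** Stand-in for record conversion: a non-numeric row counts as a bad record. */
    static int parseRow(String row) {
        try {
            return Integer.parseInt(row.trim());
        } catch (NumberFormatException e) {
            throw new BadRecordException("bad record: " + row);
        }
    }

    /** Executor side: a bad record becomes a reported status, not a thrown exception. */
    static TaskResult runTask(List<String> rows) {
        try {
            for (String row : rows) {
                parseRow(row);
            }
            return new TaskResult(LoadStatus.SUCCESS, "");
        } catch (BadRecordException e) {
            // Intentional failure: return normally so the task completes
            // and there is nothing for a scheduler to retry.
            return new TaskResult(LoadStatus.FAILED_BAD_RECORDS, e.getMessage());
        }
        // Any other exception still propagates to the caller.
    }

    /** Driver side: the load succeeds only if every task reported SUCCESS. */
    static boolean loadSucceeded(List<TaskResult> results) {
        for (TaskResult result : results) {
            if (result.status != LoadStatus.SUCCESS) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<TaskResult> results = new ArrayList<>();
        results.add(runTask(Arrays.asList("1", "2")));
        results.add(runTask(Arrays.asList("3", "oops")));
        System.out.println(loadSucceeded(results)
            ? "load succeeded"
            : "load failed: " + results.get(1).detail);
        // prints "load failed: bad record: oops"
    }
}
```

In this sketch only the bad-record case is converted into a status. Other exceptions are left to propagate, so unexpected failures keep whatever retry behaviour the surrounding framework applies.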