From: chenliang613
To: issues@carbondata.incubator.apache.org
Reply-To: issues@carbondata.incubator.apache.org
Subject: [GitHub] incubator-carbondata pull request #614: [CARBONDATA-714]Documented how to ha...
Message-Id: <20170306230400.0DF93DFDE4@git1-us-west.apache.org>
Date: Mon, 6 Mar 2017 23:04:00 +0000 (UTC)

Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/614#discussion_r104547671

    --- Diff: docs/faq.md ---
    @@ -18,30 +18,57 @@
     -->

     # FAQs
    -* **Auto Compaction not Working**

    -    The Property carbon.enable.auto.load.merge in carbon.properties need to be set to true.
    +* [What are Bad Records?](#what-are-bad-records)
    +* [Where are Bad Records Stored in CarbonData?](#where-are-bad-records-stored-in-carbondata)
    +* [How to handle Bad Records?](#how-to-handle-bad-records)
    +* [How to resolve store location can’t be found?](#how-to-resolve-store-location-can-not-be-found)
    +* [What is Carbon Lock Type?](#what-is-carbon-lock-type)
    +* [How to resolve Abstract Method Error?](#how-to-resolve-abstract-method-error)

    -* **Getting Abstract method error**
    +## What are Bad Records?
    +Records that fail to get loaded into the CarbonData due to data type incompatibility or are empty or have incompatible format are classified as Bad Records.

    -    You need to specify the spark version while using Maven to build project.
    +## Where are Bad Records Stored in CarbonData?
    +The bad records are stored at the location set in carbon.badRecords.location in carbon.properties file.
    +By default **carbon.badRecords.location** specifies the following location ``/opt/Carbon/Spark/badrecords``.

    -* **Getting NotImplementedException for subquery using IN and EXISTS**
    +## How to handle Bad Records?
    +While loading data we can specify the approach to handle Bad Records. In order to analyse the cause of the Bad Records the parameter ``BAD_RECORDS_LOGGER_ENABLE`` must be set to value ``TRUE``. There are three approaches to handle Bad Records which can be specified by the parameter ``BAD_RECORDS_ACTION``.

    -    Subquery with in and exists not supported in CarbonData.
    -
    -* **Getting Exceptions on creating a view**
    -
    -    View not supported in CarbonData.
    -
    -* **How to verify if ColumnGroups have been created as desired.**
    +- To pad the incorrect values of the csv rows with NULL value and load the data in CarbonData, set the following in the query :
    +```
    +'BAD_RECORDS_ACTION'='FORCE'
    +```
    +
    +- To write the Bad Records without padding incorrect values with NULL in the raw csv (set in the parameter **carbon.badRecords.location**), set the following in the query :
    +```
    +'BAD_RECORDS_ACTION'='REDIRECT'
    +```
    +
    +- To ignore the Bad Records from getting stored in the raw csv, we need to set the following in the query :
    +```
    +'BAD_RECORDS_ACTION'='INDIRECT'
    +```
    +
    +## How to resolve store location can not be found?
    +The store location specified while creating carbon session is used by the CarbonData to store the meta data like the schema, dictionary files, dictionary meta data and sort indexes.
    +
    +Try creating ``carbonsession`` with ``storepath`` specified in the following manner :
    +```
    +val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession()
    +```
    +Example:
    +```
    +val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("hdfs://localhost:9000/carbon/store ")
    +```
    +
    +## What is Carbon Lock Type?
    --- End diff --

    For users, in which scenarios does this lock parameter need to be set?

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---