carbondata-issues mailing list archives

From chenliang613 <...@git.apache.org>
Subject [GitHub] incubator-carbondata pull request #614: [CARBONDATA-714]Documented how to ha...
Date Mon, 06 Mar 2017 23:04:00 GMT
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/614#discussion_r104547671
  
    --- Diff: docs/faq.md ---
    @@ -18,30 +18,57 @@
     -->
     
     # FAQs
    -* **Auto Compaction not Working**
     
    -    The Property carbon.enable.auto.load.merge in carbon.properties need to be set to true.
    +* [What are Bad Records?](#what-are-bad-records)
    +* [Where are Bad Records Stored in CarbonData?](#where-are-bad-records-stored-in-carbondata)
    +* [How to handle Bad Records?](#how-to-handle-bad-records)
    +* [How to resolve store location can’t be found?](#how-to-resolve-store-location-can-not-be-found)
    +* [What is Carbon Lock Type?](#what-is-carbon-lock-type)
    +* [How to resolve Abstract Method Error?](#how-to-resolve-abstract-method-error)
     
    -* **Getting Abstract method error**
    +## What are Bad Records?
    +Records that fail to get loaded into the CarbonData due to data type incompatibility or are empty or have incompatible format are classified as Bad Records.
     
    -    You need to specify the spark version while using Maven to build project.
    +## Where are Bad Records Stored in CarbonData?
    +The bad records are stored at the location set in carbon.badRecords.location in carbon.properties file.
    +By default **carbon.badRecords.location** specifies the following location ``/opt/Carbon/Spark/badrecords``.
     
    -* **Getting NotImplementedException for subquery using IN and EXISTS**
    +## How to handle Bad Records?
    +While loading data we can specify the approach to handle Bad Records. In order to analyse the cause of the Bad Records the parameter ``BAD_RECORDS_LOGGER_ENABLE`` must be set to value ``TRUE``. There are three approaches to handle Bad Records which can be specified  by the parameter ``BAD_RECORDS_ACTION``.
     
    -    Subquery with in and exists not supported in CarbonData.
    -    
    -* **Getting Exceptions on creating  a view**
    -    
    -    View not supported in CarbonData.
    -    
    -* **How to verify if ColumnGroups have been created as desired.**
    +- To pad the incorrect values of the csv rows with NULL value and load the data in CarbonData, set the following in the query :
    +```
    +'BAD_RECORDS_ACTION'='FORCE'
    +```
    +
    +- To write the Bad Records without padding incorrect values with NULL in the raw csv (set in the parameter **carbon.badRecords.location**), set the following in the query :
    +```
    +'BAD_RECORDS_ACTION'='REDIRECT'
    +```
    +
    +- To ignore the Bad Records from getting stored in the raw csv, we need to set the following in the query :
    +```
    +'BAD_RECORDS_ACTION'='INDIRECT'
    +```
    +
    +## How to resolve store location can not be found?
    +The store location specified while creating carbon session is used by the CarbonData to store the meta data like the schema, dictionary files, dictionary meta data and sort indexes.
    +
    +Try creating ``carbonsession`` with ``storepath`` specified in the following manner :
    +```
    +val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession(<store_path>)
    +```
    +Example:
    +```
    +val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("hdfs://localhost:9000/carbon/store")
    +```
    +
    +## What is Carbon Lock Type?
    --- End diff --
    
    For users, in which scenarios does this lock parameter need to be set?


