carbondata-issues mailing list archives

From "xubo245 (JIRA)" <>
Subject [jira] [Created] (CARBONDATA-1541) There are some errors when bad_records_action is IGNORE
Date Mon, 09 Oct 2017 08:25:00 GMT
xubo245 created CARBONDATA-1541:

             Summary: There are some errors when bad_records_action is IGNORE
                 Key: CARBONDATA-1541
             Project: CarbonData
          Issue Type: Bug
          Components: data-load
    Affects Versions: 1.1.1
            Reporter: xubo245
            Priority: Minor

There are some errors when bad_records_action is IGNORE

17/10/09 01:20:31 ERROR CarbonRowDataWriterProcessorStepImpl: [Executor task launch worker-0][partitionID:default_int_table_2ade496b-a9e8-4e7c-82bd-fb21c2e590eb]
Failed for table: int_table in DataWriterProcessorStepImpl
org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException: unable to generate the mdkey
	at org.apache.carbondata.processing.newflow.steps.CarbonRowDataWriterProcessorStepImpl.processBatch(
	at org.apache.carbondata.processing.newflow.steps.CarbonRowDataWriterProcessorStepImpl.doExecute(
	at org.apache.carbondata.processing.newflow.steps.CarbonRowDataWriterProcessorStepImpl.execute(
	at org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(
	at org.apache.carbondata.spark.rdd.NewCarbonDataLoadRDD$$anon$1.<init>(NewCarbonDataLoadRDD.scala:254)
	at org.apache.carbondata.spark.rdd.NewCarbonDataLoadRDD.internalCompute(NewCarbonDataLoadRDD.scala:229)
	at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:62)

1. When the table has only one column and that column's type is INT, the load fails with an error:

  test("Loading table: int, bad_records_action is IGNORE") {
    val fileLocation = s"$rootPath/integration/spark-common-test/src/test/resources/badrecords/intTest.csv"
    sql("drop table if exists int_table")
    sql("CREATE TABLE if not exists int_table(intField INT) STORED BY 'carbondata'")
    // Load the CSV; rows that are not valid INT values should be ignored.
    sql(
      s"""
         | LOAD DATA LOCAL INPATH '$fileLocation'
         | INTO TABLE int_table
         | OPTIONS('FILEHEADER' = 'intField','bad_records_logger_enable'='true','bad_records_action'='IGNORE')
       """.stripMargin)

    sql("select * from int_table").show()
    checkAnswer(sql("select * from int_table where intField = 1"),
      Seq(Row(1), Row(1)))
    sql("drop table if exists int_table")
  }
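
For reference, the documented meaning of IGNORE is that bad records are neither loaded nor written to a separate file, so the load above would be expected to succeed and keep only the valid rows. A minimal plain-Scala sketch of that expectation for a single INT column follows. It does not use any CarbonData API; the helper name loadIgnoringBadRecords and the input values are hypothetical and are not the contents of intTest.csv.

```scala
import scala.util.Try

object IgnoreBadRecordsSketch {
  // Keep only the values that parse as Int. Rows that fail to parse are
  // dropped silently, which is the behaviour IGNORE is documented to have.
  def loadIgnoringBadRecords(lines: Seq[String]): Seq[Int] =
    lines.flatMap(line => Try(line.trim.toInt).toOption)

  def main(args: Array[String]): Unit = {
    // Hypothetical input: three valid INT rows and three bad records.
    val lines = Seq("1", "abc", "1", "", "2.5", "7")
    println(loadIgnoringBadRecords(lines)) // prints List(1, 1, 7)
  }
}
```

Under this reading, a query such as "where intField = 1" returns the two matching rows, which is what the checkAnswer call in the test expects.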

2. When sort_columns is null, there is an error:

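  test("sort_columns is null, error") {
    sql("drop table if exists sales")
    // The TBLPROPERTIES clause is assumed from the test name; that line is missing from this message.
    sql(
      """CREATE TABLE IF NOT EXISTS sales(ID BigInt, date Timestamp, country String,
          actual_price Double, Quantity int, sold_price Decimal(19,2))
          STORED BY 'carbondata'
          TBLPROPERTIES('sort_columns'='')""")

    // Assumed: this sets the bad-records location, which the redirect action needs.
    CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC,
        new File("./target/test/badRecords").getCanonicalPath)

    CarbonProperties.getInstance()
      .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd")
    val csvFilePath = s"$resourcesPath/badrecords/datasample.csv"
    sql("LOAD DATA local inpath '" + csvFilePath + "' INTO TABLE sales OPTIONS" +
      "('bad_records_logger_enable'='true','bad_records_action'='redirect', 'DELIMITER'=" +
      " ',', 'QUOTECHAR'= '\"')")

    // The expected row count was not preserved in this message, so the query is only run here.
    sql("select count(*) from sales").show()
  }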

