From: "Ravindra Pesala (JIRA)"
To: issues@carbondata.apache.org
Date: Tue, 9 May 2017 06:27:04 +0000 (UTC)
Subject: [jira] [Resolved] (CARBONDATA-664) Select queries fail when BAD_RECORDS_ACTION as FORCED is used in load query.

     [ https://issues.apache.org/jira/browse/CARBONDATA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravindra Pesala resolved CARBONDATA-664.
----------------------------------------
    Resolution: Duplicate

Duplicate of CARBONDATA-726

> Select queries fail when BAD_RECORDS_ACTION as FORCED is used in load query.
> -----------------------------------------------------------------------------
>
>                 Key: CARBONDATA-664
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-664
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-query
>    Affects Versions: 1.0.0-incubating
>         Environment: Spark 1.6
>            Reporter: Harsh Sharma
>              Labels: bug
>         Attachments: 100_olap_C20.csv, Driver Logs, Executor Logs
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> The scenario below works on Spark 2.1 but fails on Spark 1.6:
>
> create table VMALL_DICTIONARY_INCLUDE (imei string,deviceInformationId int,MAC string,deviceColor string,device_backColor string,modelId string,marketName string,AMSize string,ROMSize string,CUPAudit string,CPIClocked string,series string,productionDate timestamp,bomCode string,internalModels string, deliveryTime string, channelsId string, channelsName string , deliveryAreaId string, deliveryCountry string, deliveryProvince string, deliveryCity string,deliveryDistrict string, deliveryStreet string, oxSingleNumber string, ActiveCheckTime string, ActiveAreaId string, ActiveCountry string, ActiveProvince string, Activecity string, ActiveDistrict string, ActiveStreet string, ActiveOperatorId string, Active_releaseId string, Active_EMUIVersion string, Active_operaSysVersion string, Active_BacVerNumber string, Active_BacFlashVer string, Active_webUIVersion string, Active_webUITypeCarrVer string,Active_webTypeDataVerNumber string, Active_operatorsVersion string, Active_phonePADPartitionedVersions string, Latest_YEAR int, Latest_MONTH int, Latest_DAY Decimal(30,10), Latest_HOUR string, Latest_areaId string, Latest_country string, Latest_province string, Latest_city string, Latest_district string, Latest_street string, Latest_releaseId string, Latest_EMUIVersion string, Latest_operaSysVersion string, Latest_BacVerNumber string, Latest_BacFlashVer string, Latest_webUIVersion string, Latest_webUITypeCarrVer string, Latest_webTypeDataVerNumber string, Latest_operatorsVersion string, Latest_phonePADPartitionedVersions string, Latest_operatorId string, gamePointDescription string,gamePointId double,contractNumber BigInt) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('DICTIONARY_INCLUDE'='imei,deviceInformationId,productionDate,gamePointId,Latest_DAY,contractNumber');
>
> LOAD DATA INPATH 'hdfs://hadoop-master:54311/data/100_olap_C20.csv' INTO table VMALL_DICTIONARY_INCLUDE options('DELIMITER'=',','QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='imei,deviceInformationId,MAC,deviceColor,device_backColor,modelId,marketName,AMSize,ROMSize,CUPAudit,CPIClocked,series,productionDate,bomCode,internalModels,deliveryTime,channelsId,channelsName,deliveryAreaId,deliveryCountry,deliveryProvince,deliveryCity,deliveryDistrict,deliveryStreet,oxSingleNumber,contractNumber,ActiveCheckTime,ActiveAreaId,ActiveCountry,ActiveProvince,Activecity,ActiveDistrict,ActiveStreet,ActiveOperatorId,Active_releaseId,Active_EMUIVersion,Active_operaSysVersion,Active_BacVerNumber,Active_BacFlashVer,Active_webUIVersion,Active_webUITypeCarrVer,Active_webTypeDataVerNumber,Active_operatorsVersion,Active_phonePADPartitionedVersions,Latest_YEAR,Latest_MONTH,Latest_DAY,Latest_HOUR,Latest_areaId,Latest_country,Latest_province,Latest_city,Latest_district,Latest_street,Latest_releaseId,Latest_EMUIVersion,Latest_operaSysVersion,Latest_BacVerNumber,Latest_BacFlashVer,Latest_webUIVersion,Latest_webUITypeCarrVer,Latest_webTypeDataVerNumber,Latest_operatorsVersion,Latest_phonePADPartitionedVersions,Latest_operatorId,gamePointId,gamePointDescription');
>
> select sum(deviceinformationId) from VMALL_DICTIONARY_INCLUDE where deviceColor ='5Device Color' and modelId != '109' or Latest_DAY > '1234567890123540.0000000000' and contractNumber == '92233720368547800' or Active_operaSysVersion like 'Operating System Version' and gamePointId <=> '8.1366141918611E39' and deviceInformationId < '1000000' and productionDate not like '2016-07-01' and imei is null and Latest_HOUR is not null and channelsId <= '7' and Latest_releaseId >= '1' and Latest_MONTH between 6 and 8 and Latest_YEAR not between 2016 and 2017 and Latest_HOUR RLIKE '12' and gamePointDescription REGEXP 'Site' and imei in ('1AA1','1AA100','1AA10','1AA1000','1AA10000','1AA100000','1AA1000000','1AA100001','1AA100002','1AA100004','','NULL') and Active_BacVerNumber not in ('Background version number1','','null');
>
> This scenario fails with the following exception:
>
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 48.0 failed 4 times, most recent failure: Lost task 0.3 in stage 48.0 (TID 152, hadoop-master): java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
> 	at org.apache.carbondata.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:65)
> 	at org.apache.carbondata.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:35)
> 	at org.apache.carbondata.scan.result.iterator.ChunkRowIterator.<init>(ChunkRowIterator.java:43)
> 	at org.apache.carbondata.hadoop.CarbonRecordReader.initialize(CarbonRecordReader.java:81)
> 	at org.apache.carbondata.spark.rdd.CarbonScanRDD.compute(CarbonScanRDD.scala:194)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:89)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
> 	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> 	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> 	at org.apache.carbondata.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:52)
> 	... 34 more
> Caused by: java.lang.NullPointerException
> 	at org.apache.carbondata.scan.result.AbstractScannedResult.getDictionaryKeyIntegerArray(AbstractScannedResult.java:187)
> 	at org.apache.carbondata.scan.result.impl.FilterQueryScannedResult.getDictionaryKeyIntegerArray(FilterQueryScannedResult.java:53)
> 	at org.apache.carbondata.scan.collector.impl.DictionaryBasedResultCollector.collectData(DictionaryBasedResultCollector.java:111)
> 	at org.apache.carbondata.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:52)
> 	at org.apache.carbondata.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:33)
> 	at org.apache.carbondata.scan.result.iterator.DetailQueryResultIterator$1.call(DetailQueryResultIterator.java:78)
> 	at org.apache.carbondata.scan.result.iterator.DetailQueryResultIterator$1.call(DetailQueryResultIterator.java:72)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	... 3 more
> Driver stacktrace: (state=,code=0)

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
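[Editor's note] The load statement above uses BAD_RECORDS_ACTION='FORCE', which tells the loader to keep rows containing unparseable values and store the bad cells as null (the other documented actions are FAIL, IGNORE, and REDIRECT). The sketch below is an illustration of those semantics only, not CarbonData's actual implementation; the `load_rows` and `parse` names are made up for this example.

```python
# Illustrative sketch of BAD_RECORDS_ACTION semantics (hypothetical helper,
# not CarbonData source code).
def load_rows(rows, parse, action="FORCE"):
    """parse(value) returns the typed value or raises ValueError."""
    loaded, redirected = [], []
    for row in rows:
        out, bad = [], False
        for value in row:
            try:
                out.append(parse(value))
            except ValueError:
                bad = True
                out.append(None)  # FORCE: keep the row, store the bad cell as null
        if bad and action == "FAIL":
            raise ValueError(f"bad record: {row}")  # abort the whole load
        if bad and action == "IGNORE":
            continue  # silently drop the row
        if bad and action == "REDIRECT":
            redirected.append(row)  # divert the raw row to a bad-records file
            continue
        loaded.append(out)
    return loaded, redirected
```

Under FORCE the bad cell is loaded as null, so later queries scan null-substituted dictionary values; the NullPointerException in getDictionaryKeyIntegerArray above occurs while reading such values on Spark 1.6, which is presumably why the issue was folded into CARBONDATA-726.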