Subject: Re: Error selecting from a Hive ORC table in Spark-sql
From: Eugene Koifman
To: "user@hive.apache.org" <user@hive.apache.org>, "user @spark" <user@spark.apache.org>
Reply-To: user@hive.apache.org
Date: Mon, 21 Mar 2016 17:53:17 +0000

The system thinks t2 is an ACID table, but the files on disk don't follow the convention the ACID system would expect.
Perhaps Xuefu Zhang would know more about Spark/ACID integration.
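For anyone hitting the same thing: an ACID table's directory is expected to hold base_<txn> and delta_<minTxn>_<maxTxn> subdirectories, and DESCRIBE FORMATTED t2 shows transactional=true under Table Parameters when Hive treats a table as ACID. Below is a minimal sketch, not Hive's own check, that lists a table directory and flags entries that don't match the ACID naming. The warehouse path is an assumption; substitute the Location reported by DESCRIBE FORMATTED. Treat the output as a diagnostic rather than a verdict, since some layouts also legally contain pre-ACID "original" bucket files.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class AcidLayoutCheck {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            // Assumed location; use the Location shown by DESCRIBE FORMATTED t2.
            Path table = new Path("/user/hive/warehouse/t2");
            for (FileStatus st : fs.listStatus(table)) {
                String name = st.getPath().getName();
                // ACID layout: base_<txn>/ and delta_<minTxn>_<maxTxn>/ directories.
                boolean acidLike = name.startsWith("base_") || name.startsWith("delta_");
                System.out.printf("%-40s %s%n", name, acidLike ? "ok" : "NOT ACID-STYLE");
            }
        }
    }

An entry such as 0000039_0000 (the string in the exception below, and the shape of an old-style MapReduce output file rather than an ACID delta) would be flagged here; it is what AcidUtils.getAcidState chokes on while OrcInputFormat generates splits.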

From: Mich Talebzadeh <mich.talebzadeh@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Monday, March 21, 2016 at 9:3= 9 AM
To: "user @spark" <user@spark.apache.org>, user &= lt;user@hive.apache.org>
Subject: Error selecting from a Hiv= e ORC table in Spark-sql

Hi,

Do we know the cause of this error when selecting from an Hive ORC tab= le

spark-sql> select * from t2;
16/03/21 16:38:33 ERROR SparkSQLDriver: Failed in [select * from t= 2]
java.lang.RuntimeException: serious problem
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921)
        at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:909)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
        at org.apache.spark.rdd.RDD.collect(RDD.scala:908)
        at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:177)
        at org.apache.spark.sql.hive.HiveContext$QueryExecution.stringResult(HiveContext.scala:587)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:63)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:308)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:226)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.util.concurrent.ExecutionException: java.lang.NumberFormatException: For input string: "0000039_0000"
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
        at java.util.concurrent.FutureTask.get(FutureTask.java:111)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:998)
        ... 43 more
Caused by: java.lang.NumberFormatException: For input string: "0000039_0000"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Long.parseLong(Long.java:441)
        at java.lang.Long.parseLong(Long.java:483)
        at org.apache.hadoop.hive.ql.io.AcidUtils.parseDelta(AcidUtils.java:310)
        at org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:379)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:634)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:620)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
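The innermost "Caused by" carries the whole story: AcidUtils.parseDelta expects delta directory names it can split into numeric transaction ids (delta_<minTxn>_<maxTxn>), and "0000039_0000" still contains an underscore when it reaches Long.parseLong. A one-line illustration that reproduces the exact message from the trace; this is a reconstruction of the failure mode, not the actual Hive code path:

    public class ParseDeltaDemo {
        public static void main(String[] args) {
            // A file named like an old-style bucket/output file rather than an
            // ACID delta directory fails numeric parsing with the same message
            // seen in the stack trace above.
            Long.parseLong("0000039_0000");
            // -> java.lang.NumberFormatException: For input string: "0000039_0000"
        }
    }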


