Date: Sat, 6 Jan 2018 05:17:00 +0000 (UTC)
From: "wuyi (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Commented] (SPARK-22967) VersionSuite failed on Windows caused by unescapeSQLString()

    [ https://issues.apache.org/jira/browse/SPARK-22967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314380#comment-16314380 ]

wuyi commented on SPARK-22967:
------------------------------

I'd like to open a PR, but I'm not 100% sure how to fix this bug yet. As you say:
{code:java}
fix is about replacing the path to URI form
{code}
But this Windows path already goes wrong before stringToURI() is called (as I mentioned above). So, should we fix it before the URI transform happens?
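For discussion, here is a minimal sketch of what "fixing it before the URI transform" could look like at the test's call site, assuming we derive the LOCATION string from the resource URI (forward slashes only) rather than from the raw File path. This is illustrative only, not a finished patch:
{code:java}
// Illustrative sketch only: build a file: URI string from the test resource,
// so the value embedded in the SQL text contains no backslashes for
// unescapeSQLString() to mangle on Windows.
val url = Thread.currentThread().getContextClassLoader.getResource("avroDecimal")
val location = new java.io.File(url.toURI).toURI.toString
// On Windows this yields file:/D:/workspace/.../avroDecimal (forward slashes only),
// instead of D:\workspace\...\avroDecimal from new File(url.getFile).
println(location)
{code}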
> VersionSuite failed on Windows caused by unescapeSQLString()
> -------------------------------------------------------------
>
>                 Key: SPARK-22967
>                 URL: https://issues.apache.org/jira/browse/SPARK-22967
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.1
>         Environment: Windows 7
>            Reporter: wuyi
>            Priority: Minor
>              Labels: build, test, windows
>
> On Windows, two unit tests fail while running VersionsSuite ("A simple set of tests that call the methods of a `HiveClient`, loading different versions of hive from maven central."):
> Failure A: test(s"$version: read avro file containing decimal")
> {code:java}
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
> {code}
> Failure B: test(s"$version: SPARK-17920: Insert into/overwrite avro table")
> {code:java}
> Unable to infer the schema. The schema specification is required to create the table `default`.`tab2`.;
> org.apache.spark.sql.AnalysisException: Unable to infer the schema. The schema specification is required to create the table `default`.`tab2`.;
> {code}
> Digging into this problem, I found it is related to ParserUtils#unescapeSQLString().
> These are the first two lines of failure A's test body:
> {code:java}
> val url = Thread.currentThread().getContextClassLoader.getResource("avroDecimal")
> val location = new File(url.getFile)
> {code}
> In my environment, `location` (the path value) is
> {code:java}
> D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal
> {code}
> Then, in SparkSqlParser#visitCreateHiveTable()#L1128:
> {code:java}
> val location = Option(ctx.locationSpec).map(visitLocationSpec)
> {code}
> This line first gets the LocationSpecContext's content, which equals the `location` above. That content is then passed to visitLocationSpec(), and finally to unescapeSQLString().
> Let's have a look at unescapeSQLString():
> {code:java}
> /** Unescape backslash-escaped string enclosed by quotes. */
> def unescapeSQLString(b: String): String = {
>   var enclosure: Character = null
>   val sb = new StringBuilder(b.length())
>
>   def appendEscapedChar(n: Char) {
>     n match {
>       case '0' => sb.append('\u0000')
>       case '\'' => sb.append('\'')
>       case '"' => sb.append('\"')
>       case 'b' => sb.append('\b')
>       case 'n' => sb.append('\n')
>       case 'r' => sb.append('\r')
>       case 't' => sb.append('\t')
>       case 'Z' => sb.append('\u001A')
>       case '\\' => sb.append('\\')
>       // The following 2 lines are exactly what MySQL does TODO: why do we do this?
>       case '%' => sb.append("\\%")
>       case '_' => sb.append("\\_")
>       case _ => sb.append(n)
>     }
>   }
>
>   var i = 0
>   val strLength = b.length
>   while (i < strLength) {
>     val currentChar = b.charAt(i)
>     if (enclosure == null) {
>       if (currentChar == '\'' || currentChar == '\"') {
>         enclosure = currentChar
>       }
>     } else if (enclosure == currentChar) {
>       enclosure = null
>     } else if (currentChar == '\\') {
>       if ((i + 6 < strLength) && b.charAt(i + 1) == 'u') {
>         // \u0000 style character literals.
>         val base = i + 2
>         val code = (0 until 4).foldLeft(0) { (mid, j) =>
>           val digit = Character.digit(b.charAt(j + base), 16)
>           (mid << 4) + digit
>         }
>         sb.append(code.asInstanceOf[Char])
>         i += 5
>       } else if (i + 4 < strLength) {
>         // \000 style character literals.
>         val i1 = b.charAt(i + 1)
>         val i2 = b.charAt(i + 2)
>         val i3 = b.charAt(i + 3)
>         if ((i1 >= '0' && i1 <= '1') && (i2 >= '0' && i2 <= '7') && (i3 >= '0' && i3 <= '7')) {
>           val tmp = ((i3 - '0') + ((i2 - '0') << 3) + ((i1 - '0') << 6)).asInstanceOf[Char]
>           sb.append(tmp)
>           i += 3
>         } else {
>           appendEscapedChar(i1)
>           i += 1
>         }
>       } else if (i + 2 < strLength) {
>         // escaped character literals.
>         val n = b.charAt(i + 1)
>         appendEscapedChar(n)
>         i += 1
>       }
>     } else {
>       // non-escaped character literals.
>       sb.append(currentChar)
>     }
>     i += 1
>   }
>   sb.toString()
> }
> {code}
> Here again, the variable `b` equals the content of `location`:
> {code:java}
> D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal
> {code}
> From unescapeSQLString()'s handling we can see that it transforms the string "\t" into the escape character '\t' and strips all the other backslashes. So, after unescapeSQLString() completes, our originally correct location has become:
> {code:java}
> D:workspaceIdeaProjectssparksqlhive\targetscala-2.11\test-classesavroDecimal
> {code}
> Note that here [ \t ] is no longer a two-character string, but the escape character (a tab).
> Then, back in SparkSqlParser#visitCreateHiveTable(), at L1134:
> {code:java}
> val locUri = location.map(CatalogUtils.stringToURI(_))
> {code}
> `location` is passed to stringToURI(), which finally yields
> {code:java}
> file:/D:workspaceIdeaProjectssparksqlhive%09argetscala-2.11%09est-classesavroDecimal
> {code}
> since the escape character '\t' is percent-encoded as '%09' in the URI.
> I'm not clear on how exactly this wrong path causes that exception, as I know almost nothing about Hive, but I can verify that the wrong path is the real cause. When I append these lines (to repair the wrong path) after HiveExternalCatalog#doCreateTable() lines 236-240:
> {code:java}
> if (tableLocation.get.getPath.startsWith("/D")) {
>   tableLocation = Some(CatalogUtils.stringToURI(
>     "file:/D:/workspace/IdeaProjects/spark/sql/hive/target/scala-2.11/test-classes/avroDecimal"))
> }
> {code}
> the failing test A passes (test B still fails).
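> As a sanity check, the mangling can be reproduced outside of Spark with a minimal standalone sketch (assuming spark-catalyst 2.2.x on the classpath; `UnescapeRepro` is just an illustrative name, and the values in the comments are the ones I observed above, so the exact scheme prefix may differ by platform):
> {code:java}
> import org.apache.spark.sql.catalyst.catalog.CatalogUtils
> import org.apache.spark.sql.catalyst.parser.ParserUtils
>
> object UnescapeRepro extends App {
>   // The parser hands unescapeSQLString() the location string together with its quotes.
>   val quoted = """'D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal'"""
>   val unescaped = ParserUtils.unescapeSQLString(quoted)
>   // Every backslash is consumed, and each "\t" becomes a real tab:
>   // D:workspaceIdeaProjectssparksqlhive<TAB>argetscala-2.11<TAB>est-classesavroDecimal
>   println(unescaped)
>   // The tab is then percent-encoded as %09 when the string is turned into a URI:
>   // file:/D:workspaceIdeaProjectssparksqlhive%09argetscala-2.11%09est-classesavroDecimal
>   println(CatalogUtils.stringToURI(unescaped))
> }
> {code}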
> And below is the stack trace of the exception:
> {code:java}
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string)
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:602)
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply$mcV$sp(HiveClientImpl.scala:469)
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:467)
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:467)
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:273)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:210)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:209)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:256)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:467)
>   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply$mcV$sp(HiveExternalCatalog.scala:263)
>   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
>   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
>   at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
>   at org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:216)
>   at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:119)
>   at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:304)
>   at org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:128)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:186)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:186)
>   at org.apache.spark.sql.Dataset$$anonfun$51.apply(Dataset.scala:3196)
>   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3195)
>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:71)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:638)
>   at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
>   at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24$$anonfun$apply$mcV$sp$3.apply$mcV$sp(VersionsSuite.scala:829)
>   at org.apache.spark.sql.hive.client.VersionsSuite.withTable(VersionsSuite.scala:70)
>   at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply$mcV$sp(VersionsSuite.scala:828)
>   at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply(VersionsSuite.scala:805)
>   at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply(VersionsSuite.scala:805)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68)
>   at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:183)
>   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196)
>   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
>   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:196)
>   at org.scalatest.FunSuite.runTest(FunSuite.scala:1560)
>   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
>   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
>   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396)
>   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
>   at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:379)
>   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
>   at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:229)
>   at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
>   at org.scalatest.Suite$class.run(Suite.scala:1147)
>   at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
>   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
>   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
>   at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
>   at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:233)
>   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:31)
>   at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:213)
>   at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
>   at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:31)
>   at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
>   at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1340)
>   at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1334)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1334)
>   at org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1011)
>   at org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1010)
>   at org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1500)
>   at org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:1010)
>   at org.scalatest.tools.Runner$.run(Runner.scala:850)
>   at org.scalatest.tools.Runner.run(Runner.scala)
>   at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2(ScalaTestRunner.java:138)
>   at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:28)
> Caused by: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string)
>   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1121)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
>   at com.sun.proxy.$Proxy31.create_table_with_environment_context(Unknown Source)
>   at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:482)
>   at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:471)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
>   at com.sun.proxy.$Proxy32.createTable(Unknown Source)
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:596)
>   ... 78 more
> Caused by: java.lang.IllegalArgumentException: Can not create a Path from an empty string
>   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:184)
>   at org.apache.hadoop.fs.Path.getParent(Path.java:357)
>   at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:427)
>   at org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:690)
>   at org.apache.hadoop.hive.metastore.Warehouse.mkdirs(Warehouse.java:194)
>   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1059)
>   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1107)
>   ... 93 more
> {code}
> As for test B, I didn't do a careful inspection, but I found the same wrong path as in test A, so I guess both exceptions were caused by the same factor.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)