spark-issues mailing list archives

From "Apache Spark (JIRA)" <>
Subject [jira] [Assigned] (SPARK-18752) "isSrcLocal" parameter to Hive loadTable / loadPartition should come from user
Date Wed, 07 Dec 2016 00:32:58 GMT


Apache Spark reassigned SPARK-18752:

    Assignee: Apache Spark

> "isSrcLocal" parameter to Hive loadTable / loadPartition should come from user
> ------------------------------------------------------------------------------
>                 Key: SPARK-18752
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Marcelo Vanzin
>            Assignee: Apache Spark
>            Priority: Minor
> We ran into an issue with the HiveShim code that calls "loadTable" and "loadPartition"
> while testing with some recent changes in upstream Hive.
> The semantics in Hive changed slightly, and if you provide the wrong value for "isSrcLocal"
> you can now end up with an invalid table: the Hive code will move the temp directory to the
> final destination instead of moving its children.
> The problem in Spark is that HiveShim.scala tries to infer the value of "isSrcLocal"
> from where the source and target directories are, which is not correct. "isSrcLocal" should
> be set based on the user query (e.g. "LOAD DATA LOCAL" would set it to "true"). So we need
> to propagate that information from the user query down to HiveShim.
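
The proposed direction can be sketched roughly as follows. This is a minimal, self-contained illustration, not Spark's or Hive's actual code: `LoadData`, `LoadDataParser`, `Shim`, `RecordingShim` and `LoadDataRunner` are hypothetical names, the regex handles only a simplified form of the statement, and the `loadTable` signature here is deliberately reduced to the parameters discussed in this issue. The point it shows is that "isSrcLocal" is taken from the LOCAL keyword in the user query and passed down unchanged, rather than being guessed from the paths.

```scala
// Hypothetical, simplified model of a parsed LOAD DATA command.
final case class LoadData(path: String, table: String, isLocal: Boolean, isOverwrite: Boolean)

object LoadDataParser {
  // Simplified grammar (assumption): LOAD DATA [LOCAL] INPATH '<path>' [OVERWRITE] INTO TABLE <name>
  private val Pattern =
    """(?i)LOAD\s+DATA\s+(LOCAL\s+)?INPATH\s+'([^']+)'\s+(OVERWRITE\s+)?INTO\s+TABLE\s+(\S+)""".r

  def parse(sql: String): Option[LoadData] = sql.trim match {
    case Pattern(local, path, overwrite, table) =>
      // Optional groups that did not match are null in Scala regex extractors.
      Some(LoadData(path, table, isLocal = local != null, isOverwrite = overwrite != null))
    case _ => None
  }
}

// Hypothetical stand-in for the shim layer; the real Hive method takes more parameters.
trait Shim {
  def loadTable(loadPath: String, tableName: String, replace: Boolean, isSrcLocal: Boolean): Unit
}

// Records calls so the propagated flag can be inspected without a Hive installation.
final class RecordingShim extends Shim {
  var calls: Vector[(String, String, Boolean, Boolean)] = Vector.empty

  override def loadTable(
      loadPath: String,
      tableName: String,
      replace: Boolean,
      isSrcLocal: Boolean): Unit = {
    calls = calls :+ ((loadPath, tableName, replace, isSrcLocal))
  }
}

object LoadDataRunner {
  // isSrcLocal comes from the user query, not from inspecting source and target directories.
  def run(cmd: LoadData, shim: Shim): Unit =
    shim.loadTable(cmd.path, cmd.table, replace = cmd.isOverwrite, isSrcLocal = cmd.isLocal)
}
```

With this shape, a statement written with LOAD DATA LOCAL reaches the shim with isSrcLocal set to true, and one without LOCAL reaches it with false, regardless of which filesystem the paths happen to be on.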

This message was sent by Atlassian JIRA
