spark-issues mailing list archives

From "Andrei Badea (JIRA)" <>
Subject [jira] [Updated] (SPARK-23641) Wrong username when making relative path to Hive LOAD DATA absolute
Date Fri, 09 Mar 2018 12:33:00 GMT


Andrei Badea updated SPARK-23641:
    Priority: Minor  (was: Major)

> Wrong username when making relative path to Hive LOAD DATA absolute
> -------------------------------------------------------------------
>                 Key: SPARK-23641
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Andrei Badea
>            Priority: Minor
> We have an application deployed in yarn-cluster mode.
> At some point, the application invokes
> {noformat}
> spark.sql("LOAD DATA INPATH 'some/relative/path' ...")
> {noformat}
> in an attempt to add that directory to a Hive table. The relative path should be interpreted
relative to the home directory of the user who ran the Spark application (this is what the
Hive shell does).
> The command runs without failing, but the directory is not added to the table. Investigation
showed that {{org.apache.spark.sql.execution.command.LoadDataCommand}} attempts to make the
path absolute by prepending {{s"/user/${System.getProperty("")}"}}. Since the application
was deployed in yarn-cluster mode, the value of the {{}} property is "yarn". This
is illustrated by the following message in the driver logs:
> {noformat}
> INFO metadata.Hive: No sources specified to move: hdfs://namenode:8020/user/yarn/some/relative/path
> {noformat}
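The prefixing behaviour described above can be sketched in plain Scala. This is an illustration only, not the code of LoadDataCommand: the object and method names are hypothetical, and the user name is passed in as a parameter because the report does not show which system property is read. The check for already-absolute paths is also an assumption.

```scala
object PrefixSketch {
  // Hypothetical helper: builds an absolute path the way the report describes,
  // by prepending "/user/<name>" to a relative path.
  def absolutize(userName: String, path: String): String =
    if (path.startsWith("/")) path // assumption: absolute paths are left untouched
    else s"/user/$userName/$path"

  def main(args: Array[String]): Unit = {
    // In yarn-cluster mode the driver reportedly sees "yarn" as the user name.
    println(absolutize("yarn", "some/relative/path"))
    // prints: /user/yarn/some/relative/path
  }
}
```

With "yarn" as the user name, the result matches the path tail in the log line above, which is consistent with the directory not being found under the submitting user's home.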
> Interestingly, the same Spark application writes the data to the relative path (prior
to calling LOAD DATA), and that write makes the path absolute as expected. It uses {{Path.makeQualified()}},
which resolves the path against {{FileSystem.getWorkingDirectory}}, which by default
is {{FileSystem.getHomeDirectory}} (and that apparently initializes early enough – on the
machine on which the application is submitted).
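
For comparison, resolving a relative path through the Hadoop FileSystem API might look roughly like the sketch below. It is not taken from Spark or from the reporter's application: the object name QualifySketch and the helper qualifyAgainstHome are hypothetical, and the file system is whatever the default Configuration on the classpath points to. The resulting path depends on the environment, so no output is shown.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object QualifySketch {
  // Hypothetical helper: resolve a possibly relative path against the
  // file system's home directory, then add scheme and authority.
  def qualifyAgainstHome(fs: FileSystem, raw: String): Path = {
    val path = new Path(raw)
    val absolute =
      if (path.isAbsolute) path
      else new Path(fs.getHomeDirectory, path)
    fs.makeQualified(absolute)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: the default Configuration selects the intended file system.
    val fs = FileSystem.get(new Configuration())
    println(qualifyAgainstHome(fs, "some/relative/path"))
  }
}
```

The point of the sketch is that the home directory is obtained from the FileSystem object rather than assembled from a JVM system property, which is the difference the report draws between the write path and LOAD DATA.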

This message was sent by Atlassian JIRA
