spark-issues mailing list archives

From "Ratandeep Ratti (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-20516) Spark SQL documentation out of date?
Date Fri, 28 Apr 2017 15:28:04 GMT

    [ https://issues.apache.org/jira/browse/SPARK-20516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988690#comment-15988690 ]

Ratandeep Ratti edited comment on SPARK-20516 at 4/28/17 3:27 PM:
------------------------------------------------------------------

I'm unable to reproduce either of these issues from the spark shell. When running from within my IDE I do need to specify the master, as you mentioned. Also, for {{warehouseLoc}}, this is the exception I get:

{noformat}
Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path
in absolute URI: file:./spark-warehouse
	at org.apache.hadoop.fs.Path.initialize(Path.java:206)
	at org.apache.hadoop.fs.Path.<init>(Path.java:197)
	at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:141)
	at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:146)
	at org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:159)
	at org.apache.hadoop.hive.metastore.Warehouse.getDatabasePath(Warehouse.java:170)
	at org.apache.hadoop.hive.metastore.Warehouse.getTablePath(Warehouse.java:184)
	at org.apache.hadoop.hive.metastore.Warehouse.getFileStatusesForUnpartitionedTable(Warehouse.java:520)
	at org.apache.hadoop.hive.metastore.MetaStoreUtils.updateUnpartitionedTableStatsFast(MetaStoreUtils.java:179)
	at org.apache.hadoop.hive.metastore.MetaStoreUtils.updateUnpartitionedTableStatsFast(MetaStoreUtils.java:174)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1403)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1449)
	... 49 more
Caused by: java.net.URISyntaxException: Relative path in absolute URI: file:./spark-warehouse
	at java.net.URI.checkPath(URI.java:1823)
	at java.net.URI.<init>(URI.java:745)
	at org.apache.hadoop.fs.Path.initialize(Path.java:203)
	... 60 more
{noformat}
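The root cause can be reproduced without Spark or Hadoop at all: the multi-argument {{java.net.URI}} constructor (which Hadoop's {{Path.initialize}} delegates to, per the trace above) rejects a relative path whenever a scheme is present. A minimal sketch:

```scala
import java.net.{URI, URISyntaxException}

object RelativeUriCheck {
  // Same shape as Hadoop's Path.initialize: the scheme "file" combined
  // with the relative path "./spark-warehouse" is rejected by java.net.URI.
  def relativePathRejected: Boolean =
    try {
      new URI("file", null, "./spark-warehouse", null)
      false
    } catch {
      // message: "Relative path in absolute URI: file:./spark-warehouse"
      case _: URISyntaxException => true
    }

  def main(args: Array[String]): Unit =
    println(s"relative path rejected: $relativePathRejected")
}
```

This matches the second {{Caused by}} in the trace: {{URI.checkPath}} throws because a scheme is present but the path is not absolute.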

Below is the code I ran from my IDE, which produces the exception:
{code}
import org.apache.spark.sql.SparkSession

object SPARK_20516 {
  def warehouseLoc: Unit = {
    // A relative warehouse directory triggers the URISyntaxException above
    val warehouseLocation = "spark-warehouse"
    val spark = SparkSession
      .builder()
      .master("local")
      .appName("Spark Hive Example")
      .config("spark.sql.warehouse.dir", warehouseLocation)
      .enableHiveSupport()
      .getOrCreate()

    import spark.sql
    sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
  }

  def main(args: Array[String]): Unit = {
    warehouseLoc
  }
}
{code}
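Until the behavior is unified, one workaround (just a sketch, not official guidance) is to expand the relative directory name into an absolute {{file:}} URI before handing it to {{spark.sql.warehouse.dir}}:

```scala
import java.nio.file.Paths

object WarehouseUri {
  // Expand a relative directory name into an absolute file: URI string,
  // which the Hive metastore path code accepts without complaint.
  def absoluteWarehouseUri(dir: String): String =
    Paths.get(dir).toAbsolutePath.toUri.toString

  def main(args: Array[String]): Unit =
    // e.g. .config("spark.sql.warehouse.dir", absoluteWarehouseUri("spark-warehouse"))
    println(absoluteWarehouseUri("spark-warehouse"))
}
```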

I think the {{warehouseDir}} should behave the same whether the code runs from the shell or from an IDE/spark-submit.
I also think the behavior when {{master}} is not specified should be consistent between the spark shell and an IDE.
Would love to hear your thoughts on this.




> Spark SQL documentation out of date?
> ------------------------------------
>
>                 Key: SPARK-20516
>                 URL: https://issues.apache.org/jira/browse/SPARK-20516
>             Project: Spark
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Ratandeep Ratti
>            Priority: Minor
>         Attachments: spark-20516.zip
>
>
> I was trying out the examples on the [Spark Sql page|https://spark.apache.org/docs/2.1.0/sql-programming-guide.html]. It seems that we now have to invoke {{master()}} on the SparkSession builder, and warehouseLocation is now a URI.
> I can fix the documentation (sql-programming-guide.html) and send a PR.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

