spark-user mailing list archives

From Cheng Lian <lian.cs....@gmail.com>
Subject Re: SparkSQL: CREATE EXTERNAL TABLE with a SchemaRDD
Date Wed, 24 Dec 2014 08:30:38 GMT
Hao and Lam - I think the issue here is that registerRDDAsTable only 
creates a temporary table, which is not visible to the Hive metastore.

Michael once described a workaround for creating an external Parquet 
table: 
http://apache-spark-user-list.1001560.n3.nabble.com/persist-table-schema-in-spark-sql-td16297.html
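The idea behind that workaround is to spell the schema out in an explicit DDL statement issued through the HiveContext, instead of relying on LIKE against a temporary table. A minimal sketch in Scala follows; the helper object, its name, and the type-mapping table are my own illustration (not part of Spark), and it assumes you already know the SchemaRDD's field names and types:

```scala
// Sketch: build an explicit CREATE EXTERNAL TABLE statement, since a table
// registered via registerRDDAsTable is invisible to the Hive metastore.
// The object name, helper, and type map are illustrative, not Spark API.
object ExternalTableDDL {
  // Minimal Spark-SQL-to-Hive type mapping; extend as needed.
  val hiveTypes = Map(
    "StringType"  -> "STRING",
    "IntegerType" -> "INT",
    "LongType"    -> "BIGINT",
    "DoubleType"  -> "DOUBLE"
  )

  def buildExternalTableDDL(table: String,
                            fields: Seq[(String, String)],
                            location: String): String = {
    // Render each (name, Spark type) pair as "name HIVE_TYPE".
    val cols = fields
      .map { case (name, sparkType) => s"$name ${hiveTypes(sparkType)}" }
      .mkString(", ")
    // STORED AS PARQUET needs a recent-enough Hive; older versions
    // require the explicit Parquet SerDe clauses instead.
    s"CREATE EXTERNAL TABLE $table ($cols) " +
      s"STORED AS PARQUET LOCATION '$location'"
  }
}
```

One could then pass the generated string to hiveContext.sql(...) after saveAsParquetFile, so the metastore sees a real external table rather than a temporary registration.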

Cheng

On 12/24/14 9:38 AM, Cheng, Hao wrote:

> Hi Lam, I can confirm this is a bug in the latest master, and I have 
> filed a JIRA issue for it:
>
> https://issues.apache.org/jira/browse/SPARK-4944
>
> Hope to come up with a solution soon.
>
> Cheng Hao
>
> *From:*Jerry Lam [mailto:chilinglam@gmail.com]
> *Sent:* Wednesday, December 24, 2014 4:26 AM
> *To:* user@spark.apache.org
> *Subject:* SparkSQL: CREATE EXTERNAL TABLE with a SchemaRDD
>
> Hi spark users,
>
> I'm trying to create an external table using HiveContext after creating 
> a SchemaRDD and saving the RDD as a Parquet file on HDFS.
>
> I would like to use the schema of the SchemaRDD (rdd_table) when I 
> create the external table.
>
> For example:
>
> rdd_table.saveAsParquetFile("/user/spark/my_data.parquet")
>
> hiveContext.registerRDDAsTable(rdd_table, "rdd_table")
>
> hiveContext.sql("CREATE EXTERNAL TABLE my_data LIKE rdd_table LOCATION 
> '/user/spark/my_data.parquet'")
>
> the last line fails with:
>
> org.apache.spark.sql.execution.QueryExecutionException: FAILED: 
> Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Table not found rdd_table
>
>     at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:322)
>     at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:284)
>     at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult$lzycompute(NativeCommand.scala:35)
>     at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult(NativeCommand.scala:35)
>     at org.apache.spark.sql.hive.execution.NativeCommand.execute(NativeCommand.scala:38)
>     at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd$lzycompute(HiveContext.scala:382)
>     at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd(HiveContext.scala:382)
>
> Is this supported?
>
> Best Regards,
>
> Jerry
>
