spark-issues mailing list archives

From "dciborow (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-15005) Usage of Temp Table twice in Hive query fails with bad error
Date Fri, 29 Apr 2016 16:54:12 GMT
dciborow created SPARK-15005:
--------------------------------

             Summary: Usage of Temp Table twice in Hive query fails with bad error
                 Key: SPARK-15005
                 URL: https://issues.apache.org/jira/browse/SPARK-15005
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.6.0
            Reporter: dciborow
            Priority: Minor


When converting an ETL process from Hive to Spark, adjustments might be made to the query.
One such adjustment is that the query might read from the same table more than once and
join the results together. When Spark tries to process this query, it produces a very poor
error message that does not help the user determine what has gone wrong. It should be simple
to detect this and properly report it to the user.
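
As a rough illustration of the kind of detection meant here, the sketch below flags table names that appear after FROM or JOIN more than once in a query string. It is only a text-level approximation in plain Python: `repeated_tables` is a hypothetical helper, not a Spark API, and a real check would have to work on the analyzed query plan rather than on the SQL text.

```python
import re
from collections import Counter

# Hypothetical helper (not part of Spark): matches an identifier that
# directly follows FROM or JOIN. Subqueries such as "FROM (SELECT ..." are
# skipped because "(" is not a valid first character of an identifier.
_TABLE_REF = re.compile(
    r"\b(?:FROM|JOIN)\s+([A-Za-z_][A-Za-z0-9_.]*)", re.IGNORECASE
)


def repeated_tables(sql):
    """Return the sorted table names referenced more than once in `sql`."""
    counts = Counter(name.lower() for name in _TABLE_REF.findall(sql))
    return sorted(name for name, n in counts.items() if n > 1)


# Simplified version of the join shape described in this report.
query = """
SELECT enc.id, enc1.sum_impressions
FROM (SELECT * FROM table1) enc
JOIN (SELECT id, SUM(impressions) AS sum_impressions
      FROM table1 GROUP BY id) enc1
ON (enc.id = enc1.id)
"""

print(repeated_tables(query))
```

A check like this could back a clearer message ("table1 is referenced twice in this query") instead of the raw analyzer output.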

Sample query that contains the error (edited for this post, so it might not run):

SELECT
|            enc.id,
|            enc.name,
|            enc.sum
|            FROM
|            (
|                SELECT
|                    *
|                FROM
|                    table1) enc
|        JOIN
|            (
|                SELECT
|                    id,
|                    SUM(impressions) AS
|                    sum_impressions
|                FROM
|                    table1 enc
|                GROUP BY
|                    enc.id) enc1
|        ON
|            (
|                enc.id = enc1.id)
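
For comparison, the same join shape (a table joined to an aggregate of itself) is valid standard SQL. The sketch below runs a simplified version of it against an in-memory SQLite database, which is a stand-in engine and not Spark or Hive. The column layout of `table1` and the three rows are made up purely for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (not Spark or Hive).
conn = sqlite3.connect(":memory:")

# Assumed schema and illustrative rows; the real table1 is not shown in the report.
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT, impressions INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, "a", 10), (1, "a", 5), (2, "b", 7)],
)

# table1 is read twice: once as-is and once aggregated per id.
rows = conn.execute(
    """
    SELECT enc.id, enc.name, enc1.sum_impressions
    FROM (SELECT * FROM table1) enc
    JOIN (SELECT id, SUM(impressions) AS sum_impressions
          FROM table1 GROUP BY id) enc1
    ON (enc.id = enc1.id)
    ORDER BY enc.id
    """
).fetchall()

for row in rows:
    print(row)
```

Each input row comes back with the per-id impression total attached, which is the result the converted query is expected to produce.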


Error message (I had to edit it to remove a bunch of field names, but tried to leave
everything I could):

16/04/28 15:47:09 INFO ParseDriver: Parse Completed
org.apache.spark.sql.AnalysisException: resolved attribute(s) [_id#3372,], [HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFSum(unique_audience#3380)
windowspecdefinition(id#3372,ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS
_we0#530,HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFSum(total_impressions#3382)
windowspecdefinition(id#3372,,ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS
_we1#531], [id#3372,];





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


