hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthew Weaver (JIRA)" <>
Subject [jira] [Commented] (HIVE-784) Support uncorrelated subqueries in the WHERE clause
Date Mon, 10 Jun 2013 21:08:21 GMT


Matthew Weaver commented on HIVE-784:

I'm looking into uncorrelated subqueries.  Will pick this up unless someone else is actively
working it.

h5. Functional Requirements

* Support {{WHERE x IN (<column subquery>);}}
** {{<column subquery>}} returns one column, any number of rows.
* Support {{WHERE x NOT IN (<column subquery>)}};
* Support same types of subqueries in {{HAVING}}.
** E.g.
HAVING COUNT(value) IN (SELECT p FROM t2);  {code}
* Correlated subqueries not supported, for now at least
** But still need to check for correlation, and bail if it occurs.
** Correlated subquery:
*** A subquery that references a table that appears in a containing query ([MySQL|]),
thus requiring subquery evaluation to look outside its scope.
*** The subquery depends on the outer query for its values, so the subquery must be executed
once for each row of the outer query.  Also known as _repeating Subqueries_.

h5. Tasks
* Rewrite {{IN (<column-subquery>)}} as a {{LEFT SEMI JOIN}}.
** Not ready for public consumption.  In particular, no check for correlated terms.
** With test queries.
* Add check for correlated terms, return informative error message.
* Rewrite {{WHERE NOT IN (<column-subquery>)}} as a {{LEFT OUTER JOIN}}.
** Return rows that don't match the right side
* Rewrite subqueries in {{HAVING}}, using {{LEFT SEMI JOIN}} and {{LEFT OUTER JOIN}} as above.
> Support uncorrelated subqueries in the WHERE clause
> ---------------------------------------------------
>                 Key: HIVE-784
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Ning Zhang
>            Assignee: Matthew Weaver
> Hive currently only support views in the FROM-clause, some Facebook use cases suggest
that Hive should support subqueries such as those connected by IN/EXISTS in the WHERE-clause.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:

View raw message