hive-dev mailing list archives

From "Matthew Weaver (JIRA)" <>
Subject [jira] [Created] (HIVE-4697) Subqueries with IN and NOT IN
Date Mon, 10 Jun 2013 20:12:20 GMT
Matthew Weaver created HIVE-4697:

             Summary: Subqueries with IN and NOT IN
                 Key: HIVE-4697
             Project: Hive
          Issue Type: New Feature
          Components: Query Processor
            Reporter: Matthew Weaver
            Assignee: Matthew Weaver

h5. Functional Requirements

* Support {{WHERE x IN (<column subquery>)}}.
** {{<column subquery>}} returns one column, any number of rows.
* Support {{WHERE x NOT IN (<column subquery>)}}.
* Support same types of subqueries in {{HAVING}}.
** E.g.
{code}
HAVING COUNT(value) IN (SELECT p FROM t2);
{code}
* Correlated subqueries are not supported, at least for now.
** We still need to check for correlation, and bail out if it occurs.
** Correlated subquery:
*** A subquery that references a table that appears in a containing query (definition per MySQL), thus requiring subquery evaluation to look outside its scope.
*** The subquery depends on the outer query for its values, so it must be executed once for each row of the outer query.  Also known as a _repeating subquery_.
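
To make the distinction concrete, here is a rough sketch of one query that should be accepted and one that the correlation check should reject. Only {{t2}} and {{p}} come from the example above; {{t1}}, {{x}}, {{y}} and {{q}} are hypothetical names used for illustration.

{code:sql}
-- Uncorrelated column subquery: the inner query can be evaluated
-- on its own, so this form is in scope.
SELECT x
FROM t1
WHERE x IN (SELECT p FROM t2 WHERE p > 10);

-- Correlated subquery: the inner query references t1.y from the
-- containing query, so the check should detect it and bail out
-- with an informative error.
SELECT x
FROM t1
WHERE x IN (SELECT p FROM t2 WHERE t2.q = t1.y);
{code}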

h5. Tasks
* Rewrite {{IN (<column-subquery>)}} as a {{LEFT SEMI JOIN}}.
** This first cut is not ready for public consumption; in particular, it has no check for correlated terms.
** Include test queries.
* Add a check for correlated terms and return an informative error message.
* Rewrite {{WHERE NOT IN (<column-subquery>)}} as a {{LEFT OUTER JOIN}}.
** Return only the rows that have no match on the right side.
* Rewrite subqueries in {{HAVING}}, using {{LEFT SEMI JOIN}} and {{LEFT OUTER JOIN}} as above.
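
The rewrites listed above might look roughly like the following. This is a sketch rather than the planned implementation: {{t1}}, {{x}}, {{k}}, {{value}}, {{g}} and {{c}} are hypothetical names, and the column types are assumed to be comparable.

{code:sql}
-- 1. IN: requested syntax.
SELECT t1.x, t1.value
FROM t1
WHERE t1.x IN (SELECT p FROM t2);

-- Possible rewrite as a LEFT SEMI JOIN. Columns of the right-hand
-- table may only be referenced in the ON clause.
SELECT t1.x, t1.value
FROM t1 LEFT SEMI JOIN t2 ON (t1.x = t2.p);

-- 2. NOT IN: requested syntax.
SELECT t1.x, t1.value
FROM t1
WHERE t1.x NOT IN (SELECT p FROM t2);

-- Possible rewrite as a LEFT OUTER JOIN that keeps only the rows
-- with no match on the right side.
SELECT t1.x, t1.value
FROM t1 LEFT OUTER JOIN t2 ON (t1.x = t2.p)
WHERE t2.p IS NULL;

-- 3. HAVING: requested syntax.
SELECT k, COUNT(value) AS c
FROM t1
GROUP BY k
HAVING COUNT(value) IN (SELECT p FROM t2);

-- Possible rewrite: aggregate in a derived table, then semi-join
-- the aggregate against the subquery's column.
SELECT g.k, g.c
FROM (SELECT k, COUNT(value) AS c FROM t1 GROUP BY k) g
LEFT SEMI JOIN t2 ON (g.c = t2.p);
{code}

One caveat for the {{NOT IN}} form: the outer-join rewrite is only equivalent to standard SQL {{NOT IN}} when neither {{t1.x}} nor {{t2.p}} contains NULLs. Standard {{NOT IN}} returns no rows if the subquery yields a NULL and drops outer rows whose {{x}} is NULL, whereas the join keeps them. This issue does not say which behaviour is intended.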
