hive-dev mailing list archives

From "Zheng Shao (JIRA)" <>
Subject [jira] Commented: (HIVE-655) Add support for user defined table generating functions
Date Mon, 17 Aug 2009 00:06:14 GMT


Zheng Shao commented on HIVE-655:

Another way is like this:

1. Simplest example: just use the same syntax as UDF.
  SELECT pageid, EXPLODE(adid_list) as adid
  FROM mytable;
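
To make the row-level semantics of example 1 concrete, here is a minimal Python sketch (a model, not Hive code) of what an EXPLODE-style UDTF would do to each input row. The function name explode_rows and the sample data are hypothetical, chosen only for illustration. The handling of empty lists is an assumption, since the comment does not cover that case.

```python
from typing import Any, Dict, Iterable, Iterator

Row = Dict[str, Any]


def explode_rows(rows: Iterable[Row], list_col: str, out_col: str) -> Iterator[Row]:
    """Model of: SELECT <other columns>, EXPLODE(list_col) AS out_col FROM rows.

    Emits one output row per element of row[list_col], copying the other
    columns unchanged. Assumption: a row with an empty list emits nothing.
    """
    for row in rows:
        rest = {k: v for k, v in row.items() if k != list_col}
        for item in row[list_col]:
            yield {**rest, out_col: item}


# Hypothetical sample data standing in for "mytable".
mytable = [
    {"pageid": "p1", "adid_list": [10, 11]},
    {"pageid": "p2", "adid_list": [20]},
]

result = list(explode_rows(mytable, "adid_list", "adid"))
for row in result:
    print(row)
```

In this model, two input rows become three output rows, which is what separates a table generating function from an ordinary UDF that returns one value per row.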

2. If the UDTF produces more than one column, then we have 3 options:
A. Simplest way (needs common subexpression elimination to achieve good performance):
  SELECT pageid, EXPLODE(ad_list).adid AS adid, EXPLODE(ad_list).adtext AS adtext
  FROM mytable;

B. Simplify the query using a subquery:
  SELECT pageid, ad.adid AS adid, ad.adtext AS adtext
  FROM (SELECT pageid, EXPLODE(ad_list) AS ad
       FROM mytable) a;

C. Expand the structure inline:
  SELECT pageid, EXPLODE(ad_list) as (adid, adtext)
  FROM mytable;

Hive already has support for B. For A, we need to do common subexpression elimination, but I guess
we want to do that anyway.
C seems a nice extension, but it is not limited to UDTFs: UDFs and UDAFs should support the same
thing, if we want to support this.
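
As a rough illustration of why option A depends on common subexpression elimination, the Python sketch below models options A and B and counts how often the UDTF is evaluated. This is one possible reading of the proposal, not Hive's implementation. CountingExplode, option_a_with_cse, option_b, and the sample data are all hypothetical names introduced here.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


class CountingExplode:
    """Stand-in for an EXPLODE UDTF that records how often it is evaluated."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, values: List[Any]) -> List[Any]:
        self.calls += 1
        return list(values)


def option_b(rows: List[Row], explode: CountingExplode) -> List[Row]:
    """Subquery form: explode once per row, then project the struct fields."""
    out = []
    for row in rows:
        for ad in explode(row["ad_list"]):  # inner query: EXPLODE(ad_list) AS ad
            out.append({"pageid": row["pageid"],
                        "adid": ad["adid"], "adtext": ad["adtext"]})
    return out


def option_a_with_cse(rows: List[Row], explode: CountingExplode) -> List[Row]:
    """Option A, assuming both EXPLODE(ad_list) expressions share one evaluation."""
    out = []
    for row in rows:
        cache: Dict[str, List[Any]] = {}

        def shared(col: str) -> List[Any]:
            # Evaluate each distinct expression at most once per input row.
            if col not in cache:
                cache[col] = explode(row[col])
            return cache[col]

        adids = [ad["adid"] for ad in shared("ad_list")]
        adtexts = [ad["adtext"] for ad in shared("ad_list")]
        for adid, adtext in zip(adids, adtexts):
            out.append({"pageid": row["pageid"], "adid": adid, "adtext": adtext})
    return out


# Hypothetical sample data standing in for "mytable".
mytable = [
    {"pageid": "p1", "ad_list": [{"adid": 1, "adtext": "x"}, {"adid": 2, "adtext": "y"}]},
    {"pageid": "p2", "ad_list": [{"adid": 3, "adtext": "z"}]},
]

explode_b = CountingExplode()
rows_b = option_b(mytable, explode_b)

explode_a = CountingExplode()
rows_a = option_a_with_cse(mytable, explode_a)

print(rows_a == rows_b, explode_a.calls, explode_b.calls)
```

With the shared evaluation, both forms evaluate the UDTF once per input row and return the same rows. Without it, option A would evaluate EXPLODE(ad_list) twice per row.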

3. Parallel UDTF calls mean a cross product:
  SELECT pageid, EXPLODE(adid_list) AS adid, EXPLODE(link_list) AS link
  FROM mytable;
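
The cross-product behavior in example 3 can be sketched in Python as follows. This is a model of the proposed semantics, not Hive code. The function explode_cross and the sample data are hypothetical.

```python
from itertools import product
from typing import Any, Dict, List

Row = Dict[str, Any]


def explode_cross(rows: List[Row], list_cols: Dict[str, str]) -> List[Row]:
    """Model of several parallel EXPLODE calls in one SELECT.

    list_cols maps each input list column to its output column name.
    For every input row, emit one output row per combination of elements
    drawn from the listed columns (their Cartesian product).
    """
    out = []
    for row in rows:
        rest = {k: v for k, v in row.items() if k not in list_cols}
        lists = [row[col] for col in list_cols]
        for combo in product(*lists):
            out.append({**rest, **dict(zip(list_cols.values(), combo))})
    return out


# Hypothetical sample data standing in for "mytable".
mytable = [
    {"pageid": "p1", "adid_list": [1, 2], "link_list": ["a", "b", "c"]},
]

result = explode_cross(mytable, {"adid_list": "adid", "link_list": "link"})
for row in result:
    print(row)
```

In this model, a row with 2 ad ids and 3 links yields 2 x 3 = 6 output rows, and a row where either list is empty yields none, so output size can grow quickly when several UDTFs appear in one SELECT.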

> Add support for user defined table generating functions
> -------------------------------------------------------
>                 Key: HIVE-655
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Raghotham Murthy
>            Assignee: Raghotham Murthy
> Provide a way for users to add a table generating function, i.e., functions that generate
> multiple rows from a single input row. Currently, the only way to do it is via the TRANSFORM
> clause which requires streaming the data.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
