hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "David Maughan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-15434) Add UDF to allow interrogation of uniontype values
Date Thu, 15 Dec 2016 12:43:59 GMT

     [ https://issues.apache.org/jira/browse/HIVE-15434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

David Maughan updated HIVE-15434:
---------------------------------
        Assignee: David Maughan
    Release Note: Added UDF to allow interrogation of uniontype values.
          Status: Patch Available  (was: Open)

> Add UDF to allow interrogation of uniontype values
> --------------------------------------------------
>
>                 Key: HIVE-15434
>                 URL: https://issues.apache.org/jira/browse/HIVE-15434
>             Project: Hive
>          Issue Type: New Feature
>          Components: UDF
>    Affects Versions: 2.1.1
>            Reporter: David Maughan
>            Assignee: David Maughan
>
> h2. Overview
> As stated in the documention:
> {quote}
> UNIONTYPE support is incomplete The UNIONTYPE datatype was introduced in Hive 0.7.0 (HIVE-537),
but full support for this type in Hive remains incomplete. Queries that reference UNIONTYPE
fields in JOIN (HIVE-2508), WHERE, and GROUP BY clauses will fail, and Hive does not define
syntax to extract the tag or value fields of a UNIONTYPE. This means that UNIONTYPEs are effectively
look-at-only.
> {quote}
> It is essential to have a usable uniontype. Until full support is added to Hive users
should at least have the ability to inspect and extract values for further comparison or transformation.
> h2. Proposal
> I propose to add a GenericUDF that has 2 modes of operation. Consider the following schema
and data that contains a union:
> Schema:
> {code}
> struct<field1:uniontype<int,string>>
> {code}
> Query:
> {code}
> hive> select field1 from thing;
> {0:0}
> {1:"one"}
> {code}
> h4. Explode to Struct
> This method will recursively convert all unions within the type to structs with fields
named {{tag_n}}, {{n}} being the tag number. Only the {{tag_*}} field that matches the tag
of the union will be populated with the value. In the case above the schema of field1 will
be converted to:
> {code}
> struct<tag_0:int,tag_1:string>
> {code}
> {code}
> hive> select extract_union(field1) from thing;
> {"tag_0":0,"tag_1":null}
> {"tag_0":null,"tag_1":one}
> {code}
> {code}
> hive> select extract_union(field1).tag_0 from thing;
> 0
> null
> {code}
> h4. Extract the specified tag
> This method will simply extract the value of the specified tag. If the tag number matches
then the value is returned, if it does not, then null is returned.
> {code}
> hive> select extract_union(field1, 0) from thing;
> 0
> null
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message