hive-dev mailing list archives

From "Phabricator (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-2941) Hive should expand nested structs when setting the table schema from thrift structs
Date Tue, 10 Apr 2012 19:53:13 GMT

     [ https://issues.apache.org/jira/browse/HIVE-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-2941:
------------------------------

    Attachment: HIVE-2941.D2721.1.patch

travis requested code review of "HIVE-2941 [jira] Hive should expand nested structs when setting
the table schema from thrift structs".
Reviewers: JIRA

  Update ReflectionStructObjectInspector so that, when asked for its type, it returns an expanded
struct listing each field and its type.

  When setting a table serde, the deserializer is queried for its schema, which is used to
set the metastore table schema. The current implementation uses the class name stored in the
field as the field type.

  Because the class name is stored as the field type, users cannot see the contents of a struct with
"describe tblname". Applications that query HiveMetaStore for the table schema (specifically
HCatalog in this case) see an unknown field type, rather than a struct containing known field
types.

  Hive should store the expanded schema in the metastore so users browsing the schema see
expanded fields, and applications querying metastore see familiar types.

  DETAILS

  Set the table serde to something like this. This serde uses the built-in ThriftStructObjectInspector.

  alter table foo_test
    set serde "com.twitter.elephantbird.hive.serde.ThriftSerDe"
    with serdeproperties ("serialization.class"="com.foo.Foo");

  This causes a call to MetaStoreUtils.getFieldsFromDeserializer, which returns a list of fields
and their schemas. However, it currently does not handle nested structs: if com.foo.Foo
above contains a field of type com.foo.Bar, the class name com.foo.Bar appears as the field type.
Instead, nested structs should be expanded.
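
  As a rough illustration of the expansion described above (this is not the code in the
attached patch), the sketch below uses plain Java reflection to turn a class into a type string
of the form struct<name:type,...>, recursing into nested classes instead of emitting their class
names. StructTypeExpander, Foo, Bar and typeName are hypothetical names made up for this example;
the actual change is in ReflectionStructObjectInspector and goes through Hive's ObjectInspector
API, which this sketch does not use.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class StructTypeExpander {

    // Hypothetical stand-ins for thrift-generated classes such as com.foo.Bar and com.foo.Foo.
    public static class Bar {
        public int id;
        public String label;
    }

    public static class Foo {
        public String name;
        public Bar bar; // nested struct: should be expanded, not reported as a class name
    }

    // Returns a struct-style type string for the given class.
    // Assumption: only a handful of primitive mappings are covered here; any other
    // class is treated as a nested struct and expanded recursively.
    // Note: a self-referential class would recurse forever; a real implementation needs a guard.
    public static String typeName(Class<?> c) {
        if (c == int.class || c == Integer.class) return "int";
        if (c == long.class || c == Long.class) return "bigint";
        if (c == double.class || c == Double.class) return "double";
        if (c == boolean.class || c == Boolean.class) return "boolean";
        if (c == String.class) return "string";

        List<String> parts = new ArrayList<>();
        // Field order follows whatever getDeclaredFields() returns; the JDK does not
        // guarantee declaration order, although common JVMs return it that way.
        for (Field f : c.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue; // skip constants and compiler-generated fields
            }
            parts.add(f.getName() + ":" + typeName(f.getType()));
        }
        return "struct<" + String.join(",", parts) + ">";
    }

    public static void main(String[] args) {
        // Prints the expanded type of Foo, with Bar's fields inlined as a nested struct.
        System.out.println(typeName(Foo.class));
    }
}
```

  With the unexpanded behaviour the bar field would be reported with the class name as its type;
with the expansion it is reported as a nested struct whose members have familiar types.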

TEST PLAN
  Manually verified table schema is set correctly. Can improve testing after getting feedback
on this approach.

REVISION DETAIL
  https://reviews.facebook.net/D2721

AFFECTED FILES
  serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ReflectionStructObjectInspector.java


                
> Hive should expand nested structs when setting the table schema from thrift structs
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-2941
>                 URL: https://issues.apache.org/jira/browse/HIVE-2941
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Travis Crawford
>            Assignee: Travis Crawford
>         Attachments: HIVE-2941.D2721.1.patch
>
>
> When setting a table serde, the deserializer is queried for its schema, which is used
> to set the metastore table schema. The current implementation uses the class name stored in
> the field as the field type.
> Because the class name is stored as the field type, users cannot see the contents of a struct
> with "describe tblname". Applications that query HiveMetaStore for the table schema (specifically
> HCatalog in this case) see an unknown field type, rather than a struct containing known field
> types.
> Hive should store the expanded schema in the metastore so users browsing the schema see
> expanded fields, and applications querying metastore see familiar types.
> DETAILS
> Set the table serde to something like this. This serde uses the built-in {{ThriftStructObjectInspector}}.
> {code}
> alter table foo_test
>   set serde "com.twitter.elephantbird.hive.serde.ThriftSerDe"
>   with serdeproperties ("serialization.class"="com.foo.Foo");
> {code}
> This causes a call to {{MetaStoreUtils.getFieldsFromDeserializer}}, which returns a list
> of fields and their schemas. However, it currently does not handle nested structs: if
> {{com.foo.Foo}} above contains a field of type {{com.foo.Bar}}, the class name {{com.foo.Bar}}
> appears as the field type. Instead, nested structs should be expanded.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
