hive-issues mailing list archives

From "Alan Gates (JIRA)" <>
Subject [jira] [Commented] (HIVE-17359) Deal with TypeInfo dependencies in the metastore
Date Sat, 19 Aug 2017 00:07:00 GMT


Alan Gates commented on HIVE-17359:

Unfortunately there's a three-way circular dependency among TypeInfo, SerDe, and ObjectInspector,
so the solution to this will need to account for what happens to all three of those classes.

The metastore uses TypeInfo in 3 ways and SerDe in one:
# the type names in
# the type groupings (e.g. string types, numeric types, ...), also from serdeConstants
# the allowed column type transitions: HiveAlterHandler uses TypeInfoUtils.implicitConvertible
to determine whether an "alter table change column type" is legal
# HiveMetaStore.get_fields_with_environment_context uses the serde to determine the schema
for tables where the schema is defined in the storage rather than in the metadata (e.g. Avro).

For the purpose of this JIRA I'm only resolving the TypeInfo issues.  We'll solve the SerDe
issue later, though obviously the choice we make here will affect that case.

I see three possible solutions:
# Move the serde package into storage-api.  This would allow the standalone metastore (as
well as ORC, Parquet, others) to depend on it.  But the smaller we keep the storage-api the
better, and this would bring a lot of code and dependencies into it.  Thus I see this as an
option of last resort.
# Untangle the TypeInfo, SerDe, ObjectInspector dependency triangle and then put TypeInfo
into the metastore.  Clean, non-circular dependencies are nice.  And having type definitions
in the metadata makes sense.  But since this would change the SerDe and ObjectInspector interfaces
it would break every existing serde and OI.  I take that to be a non-starter.
#  Duplicate just the needed pieces of TypeInfo in the metastore.  This turns out to be a
couple hundred lines of code.  Given the stringent backward compatibility needs, the odds
of type names, type groupings, or alter table semantics changing (with the exception of adding
new types) seem very low.  The downside to this will come in adding new type names, which
will require changes in both hive-serde and the standalone metastore.  On the upside it allows
the metastore to develop types that Hive might not care about.  I propose to take this option.
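
To give a feel for the size of option 3, here is a rough sketch of what the duplicated pieces might look like: type names as plain strings, the type groupings, and a column type transition check.  The class name, the method names, the subset of types, and the conversion rule itself are all assumptions made for illustration.  They are not the actual Hive code and they do not reproduce the real TypeInfoUtils.implicitConvertible semantics.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only: the class, method, and rule definitions are
// assumptions, not the actual Hive or standalone metastore code.
final class MetastoreColumnTypes {
    // Type names duplicated as plain strings (an assumed subset).
    static final String TINYINT = "tinyint";
    static final String SMALLINT = "smallint";
    static final String INT = "int";
    static final String BIGINT = "bigint";
    static final String FLOAT = "float";
    static final String DOUBLE = "double";
    static final String STRING = "string";
    static final String VARCHAR = "varchar";
    static final String CHAR = "char";

    // Type groupings.  The numeric list is ordered narrowest to widest
    // (an ordering assumed here for illustration).
    private static final List<String> NUMERIC_TYPES =
        Arrays.asList(TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE);
    private static final Set<String> STRING_TYPES =
        new HashSet<>(Arrays.asList(STRING, VARCHAR, CHAR));

    private MetastoreColumnTypes() {}

    static boolean isNumericType(String typeName) {
        return NUMERIC_TYPES.contains(typeName);
    }

    static boolean isStringType(String typeName) {
        return STRING_TYPES.contains(typeName);
    }

    // Simplified stand-in for the alter table check.  Assumed rule: a change
    // is legal if the type is unchanged, if it widens within the numeric
    // group, or if both types are in the string group.
    static boolean isLegalColumnTypeChange(String from, String to) {
        if (from.equals(to)) {
            return true;
        }
        if (isNumericType(from) && isNumericType(to)) {
            return NUMERIC_TYPES.indexOf(from) <= NUMERIC_TYPES.indexOf(to);
        }
        return isStringType(from) && isStringType(to);
    }
}
```

Under these assumed rules, isLegalColumnTypeChange("int", "bigint") is accepted and isLegalColumnTypeChange("bigint", "int") is rejected.  Adding a new type would mean touching a table like this in the metastore as well as the corresponding code in hive-serde, which is the downside noted above.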

> Deal with TypeInfo dependencies in the metastore
> ------------------------------------------------
>                 Key: HIVE-17359
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Metastore
>    Affects Versions: 3.0.0
>            Reporter: Alan Gates
>            Assignee: Alan Gates
> The metastore uses TypeInfo, which resides in the serdes package.  In order to move the
metastore to be separately releasable we need to deal with this.

This message was sent by Atlassian JIRA
