hive-dev mailing list archives

From "Sergey Shelukhin (JIRA)" <>
Subject [jira] [Updated] (HIVE-6265) dedup Metastore data structures or at least protocol
Date Wed, 22 Jan 2014 22:50:19 GMT


Sergey Shelukhin updated HIVE-6265:

    Component/s: Metastore

> dedup Metastore data structures or at least protocol
> ----------------------------------------------------
>                 Key: HIVE-6265
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>            Reporter: Sergey Shelukhin
> Metastore currently stores a storage descriptor (SD) per partition, and column schema/serde/... per SD.
> Most of the time, all the partitions in a table have the same setup, and the only thing that
> differs across SD/CD/... is the location. In such cases, we don't need to store these separately
> or send them to the client when many partitions are retrieved for a large table. Storage
> changes may be too complex with respect to backward compatibility, and because DataNucleus is
> in the picture and controls the db schema/persistence. Still, we can at least avoid sending lots
> of duplicate data to the client over the network: the Thrift protocol can be modified to omit
> duplicate data in a backward-compatible manner.
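
To make the idea concrete, here is a minimal sketch of factoring shared descriptor data out of a partition list before it goes over the wire. It is only an illustration: `StorageDescriptor`, `Partition`, `dedup`, and `expand` are hypothetical, simplified stand-ins and not Hive's actual Thrift structs or metastore API, and the sketch does not model how older clients would be kept working.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class StorageDescriptor:
    # Hypothetical, simplified stand-in for a per-partition SD.
    columns: Tuple[Tuple[str, str], ...]  # (column name, column type)
    serde: str
    location: str


@dataclass(frozen=True)
class Partition:
    # Hypothetical stand-in: partition key values plus its SD.
    values: Tuple[str, ...]
    sd: StorageDescriptor


# Compact form of one partition: (values, location, index into shared SDs).
CompactPartition = Tuple[Tuple[str, ...], str, int]


def dedup(partitions: List[Partition]) -> Tuple[List[StorageDescriptor], List[CompactPartition]]:
    """Split partitions into distinct SD templates and small per-partition records."""
    shared: List[StorageDescriptor] = []
    index: Dict[StorageDescriptor, int] = {}
    compact: List[CompactPartition] = []
    for p in partitions:
        # Blank out the location so SDs that differ only by location compare equal.
        template = replace(p.sd, location="")
        i = index.get(template)
        if i is None:
            i = len(shared)
            index[template] = i
            shared.append(template)
        compact.append((p.values, p.sd.location, i))
    return shared, compact


def expand(shared: List[StorageDescriptor], compact: List[CompactPartition]) -> List[Partition]:
    """Rebuild full partitions on the receiving side."""
    return [Partition(values, replace(shared[i], location=loc)) for values, loc, i in compact]
```

With this layout, a table whose partitions all share one setup would send a single template plus one (values, location, index) record per partition, instead of repeating the column schema and serde for every partition.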

This message was sent by Atlassian JIRA
