hadoop-common-user mailing list archives

From Prasad Chakka <pra...@facebook.com>
Subject Re: Questions regarding Hive metadata schema
Date Tue, 07 Oct 2008 22:49:50 GMT
Hi Alan,

The objects correspond closely to the Thrift API objects defined
in src/contrib/hive/metastore/if/hive_metastore.thrift . That file
describes each field, so it should answer most of your questions.
The ORM mapping for this is at s/c/h/metastore/src/java/model/package.jdo.

2) SD is storage descriptor (look at SDS table)
3) SERDES contains information for Hive serializers and deserializers
5) Tables and Partitions have Storage Descriptors. Storage Descriptors
contain physical storage info and how to read the data (serde info). The
Storage Descriptor object actually contains the columns. This means that
different partitions can have different column sets.
6) 1-1
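
To make 5) and 6) concrete, here is a rough sketch of the relationships in
Python. The class and field names are made up for illustration and are not
the actual definitions in hive_metastore.thrift or package.jdo; the serde
class name and the locations are placeholders too. It only shows the shape:
a table and each of its partitions own exactly one storage descriptor, and
the column list lives on the descriptor.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SerDeInfo:
    # Hypothetical stand-in for a row in SERDES: which serializer/deserializer to use.
    name: str
    serialization_lib: str


@dataclass
class StorageDescriptor:
    # Hypothetical stand-in for a row in SDS: where the data lives,
    # how to read it, and which columns it has.
    location: str
    cols: List[Tuple[str, str]]  # (column name, column type)
    serde: SerDeInfo


@dataclass
class Partition:
    values: List[str]
    sd: StorageDescriptor  # 1-1: each partition has its own descriptor


@dataclass
class Table:
    name: str
    sd: StorageDescriptor  # the table has its own descriptor as well
    partitions: List[Partition] = field(default_factory=list)


def column_names(sd: StorageDescriptor) -> List[str]:
    """Return just the column names held by a storage descriptor."""
    return [name for name, _ in sd.cols]


serde = SerDeInfo("example_serde", "example.PlaceholderSerDe")

table = Table(
    name="page_views",
    sd=StorageDescriptor("/warehouse/page_views",
                         [("user_id", "int"), ("url", "string")], serde),
)

# An older partition written with the original two columns.
old_part = Partition(
    ["2008-10-06"],
    StorageDescriptor("/warehouse/page_views/ds=2008-10-06",
                      [("user_id", "int"), ("url", "string")], serde),
)

# A newer partition whose descriptor carries an extra column.
new_part = Partition(
    ["2008-10-07"],
    StorageDescriptor("/warehouse/page_views/ds=2008-10-07",
                      [("user_id", "int"), ("url", "string"), ("referrer", "string")],
                      serde),
)

table.partitions.extend([old_part, new_part])
```

Because the columns hang off the descriptor rather than the table, the two
partitions above can disagree on their column sets without touching the
table-level descriptor.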


From: Alan Gates <gates@yahoo-inc.com>
Reply-To: <core-user@hadoop.apache.org>
Date: Tue, 7 Oct 2008 15:28:50 -0700
To: <core-user@hadoop.apache.org>
Subject: Questions regarding Hive metadata schema


I've been looking over the db schema that hive uses to store its
metadata (package.jdo) and I had some questions:

   1.  What do the field names in the TYPES table mean? TYPE1, TYPE2,
and TYPE_FIELDS are all unclear to me.
   2. In the TBLS (tables) table, what is sd?
   3. What does the SERDES table store?
   4. What does the SORT_ORDER table store? It appears to describe the
ordering within a storage descriptor, which in turn appears to be
related to a partition. Do you envision having a table where different
partitions have different orders?
   5. SDS (storage descriptor) table has a list of columns. Does this
imply that columnar storage is supported?
   6. What is the relationship between a storage descriptor and a
partition? 1-1, 1-n?


