hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "Hive/LanguageManual/DDL" by ZhengShao
Date Wed, 17 Feb 2010 21:53:12 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/LanguageManual/DDL" page has been changed by ZhengShao.
http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL?action=diff&rev1=49&rev2=50

--------------------------------------------------

  
  You must specify a list of columns for tables with a native SerDe. Refer to the Types part
of the User Guide for the allowable column types. A list of columns for tables with a custom
SerDe may be specified, but Hive will query the SerDe to determine the actual list of columns
for this table.
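
  As a rough illustration of a column list for a table that uses the native SerDe, a
minimal sketch might look like the following. The table name, column names, and the tab
delimiter are hypothetical and chosen only for the example.

```sql
-- Hypothetical table: names and delimiter are illustrative assumptions.
-- The column list is mandatory here because the native SerDe is used.
CREATE TABLE page_view (
  viewTime INT,
  userid   BIGINT,
  page_url STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
```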
  
- Use STORED AS TEXTFILE if the data needs to be stored as plain text files. Use STORED AS
SEQUENCEFILE if the data needs to be compressed. Please read more about CompressedStorage
if you are planning to keep data compressed in your Hive tables.  Use INPUTFORMAT and OUTPUTFORMAT
to specify the name of a corresponding InputFormat and OutputFormat class as a string literal,
e.g. 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'.
+ Use STORED AS TEXTFILE if the data needs to be stored as plain text files. Use STORED AS
SEQUENCEFILE if the data needs to be compressed. Please read more about [[Hive/CompressedStorage]]
if you are planning to keep data compressed in your Hive tables.  Use INPUTFORMAT and OUTPUTFORMAT
to specify the name of a corresponding InputFormat and OutputFormat class as a string literal,
e.g. 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'.
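
  The storage options described above could be sketched as follows. The table and column
names are hypothetical, and the third statement assumes the Hive contrib jar that provides
the Base64 format classes is already on the classpath (for example via ADD JAR).

```sql
-- Plain text storage (hypothetical table and column names).
CREATE TABLE logs_text (line STRING)
STORED AS TEXTFILE;

-- SequenceFile storage, typically chosen when the data will be compressed.
CREATE TABLE logs_seq (line STRING)
STORED AS SEQUENCEFILE;

-- Explicit InputFormat and OutputFormat classes, given as string literals.
-- Assumes the contrib jar containing these classes has been added.
CREATE TABLE logs_base64 (line STRING)
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextOutputFormat';
```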
  
  Partitioned tables can be created using the PARTITIONED BY clause. A table can have one
or more partition columns, and a separate data directory is created for each distinct value
combination in the partition columns. Further, tables or partitions can be bucketed using
CLUSTERED BY columns, and data can be sorted within each bucket via SORTED BY columns. This
can improve performance on certain kinds of queries.
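
  A partitioned and bucketed table might be declared along these lines. This is a sketch
only: the table name, columns, partition columns, and the bucket count of 32 are hypothetical.
Note that the DDL keyword for ordering rows within a bucket is SORTED BY.

```sql
-- Hypothetical table: names and the bucket count are illustrative assumptions.
CREATE TABLE page_view_part (
  viewTime INT,
  userid   BIGINT,
  page_url STRING
)
-- One data directory per distinct (dt, country) value combination.
PARTITIONED BY (dt STRING, country STRING)
-- Rows are hashed on userid into 32 buckets and sorted by viewTime within each bucket.
CLUSTERED BY (userid) SORTED BY (viewTime) INTO 32 BUCKETS
STORED AS SEQUENCEFILE;
```

  Partition columns are declared only in the PARTITIONED BY clause, not in the main column
list.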
  
