hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/LanguageManual/DDL" by JohnSichi
Date Tue, 29 Dec 2009 02:03:21 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/LanguageManual/DDL" page has been changed by JohnSichi.
http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL?action=diff&rev1=28&rev2=29

--------------------------------------------------

  file_format:
    : SEQUENCEFILE
    | TEXTFILE
+   | INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname
  }}}
  CREATE TABLE creates a table with the given name. An error is thrown if a table with the same
name already exists. You can use IF NOT EXISTS to skip the error.
  
@@ -58, +59 @@

  
  You must specify a list of columns for tables with a native SerDe. Refer to the Types part of the
User Guide for the allowable column types. A list of columns for tables with a custom SerDe may
be specified, but Hive will query the SerDe to determine the list of columns for this table.
  
- Use STORED AS TEXTFILE if the data needs to be stored as plain text files. Use STORED AS
SEQUENCEFILE if the data needs to be compressed. Please read more about CompressedStorage
if you are planning to keep data compressed in your Hive tables.
+ Use STORED AS TEXTFILE if the data needs to be stored as plain text files. Use STORED AS
SEQUENCEFILE if the data needs to be compressed. Please read more about CompressedStorage
if you are planning to keep data compressed in your Hive tables.  Use INPUTFORMAT and OUTPUTFORMAT
to specify the name of a corresponding InputFormat/OutputFormat class as a string literal,
e.g. 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'.
  
  Partitioned tables can be created using the PARTITIONED BY clause. A table can have one or more
partition columns, and a separate data directory is created for each distinct combination of
partition column values. Further, tables or partitions can be bucketed using CLUSTERED BY columns,
and data can be sorted within each bucket using SORTED BY columns. This can improve performance on
certain kinds of queries.
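
The clauses described above can be sketched as follows. This is only an illustration, not text from the wiki page: the table names, column names, and bucket count are hypothetical, and the OutputFormat class name in the second statement is an assumption made by analogy with the InputFormat class named in the change above. Classes from the contrib package would also need to be on Hive's classpath.

```sql
-- Hypothetical table and column names, for illustration only.
-- One data directory is created per (dt, country) combination;
-- rows are bucketed by userid and sorted by viewtime within each bucket.
CREATE TABLE IF NOT EXISTS page_view (
  viewtime INT,
  userid   BIGINT,
  page_url STRING
)
PARTITIONED BY (dt STRING, country STRING)
CLUSTERED BY (userid) SORTED BY (viewtime) INTO 32 BUCKETS
STORED AS SEQUENCEFILE;

-- Custom file format: class names are given as string literals.
-- The OutputFormat class name below is assumed, not taken from the wiki page.
CREATE TABLE IF NOT EXISTS base64_test (
  col1 STRING,
  col2 STRING
)
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextOutputFormat';
```

Note that the partition columns (dt, country) are declared only in PARTITIONED BY and not repeated in the column list, while the CLUSTERED BY and SORTED BY columns must come from the column list.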
  
