hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/LanguageManual/DDL" by JohnSichi
Date Sat, 10 Jul 2010 02:29:39 GMT

The "Hive/LanguageManual/DDL" page has been changed by JohnSichi.


    [COMMENT table_comment]
    [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
    [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
-   [ROW FORMAT row_format]
+   [
+    [ROW FORMAT row_format] [STORED AS file_format]
+    | STORED BY 'storage.handler.class.name' [ WITH SERDEPROPERTIES (...) ]  (Note:  only available starting with 0.6.0)
+   ]
    [STORED AS file_format]
    [LOCATION hdfs_path]
-   [TBLPROPERTIES (property_name=property_value, ...)]  (Note:  only available on latest trunk or versions higher than 0.5.0)
+   [TBLPROPERTIES (property_name=property_value, ...)]  (Note:  only available starting with
-   [AS select_statement]  (Note: this feature is only available on the latest trunk or versions higher than 0.4.0.)
+   [AS select_statement]  (Note: this feature is only available starting with 0.5.0.)
    LIKE existing_table_name
@@ -58, +61 @@

    | INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname
  CREATE TABLE creates a table with the given name. An error is thrown if a table or view
with the same name already exists. You can use IF NOT EXISTS to skip the error.
  The EXTERNAL keyword lets you create a table and provide a LOCATION so that Hive does not
use a default location for this table. This comes in handy if you already have data generated.
When dropping an EXTERNAL table, data in the table is NOT deleted from the file system.
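
As a minimal sketch of the EXTERNAL and IF NOT EXISTS behavior described above, the statement below registers a table over data that is assumed to already exist in HDFS. The table name, columns, delimiter, and path are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table, columns, and HDFS path; adjust to your own data.
-- IF NOT EXISTS skips the error if page_view_ext is already defined.
CREATE EXTERNAL TABLE IF NOT EXISTS page_view_ext (
  viewTime INT,
  userid BIGINT,
  page_url STRING
)
COMMENT 'Points at comma-delimited files that already exist in HDFS'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/data/staging/page_view';

-- Dropping an EXTERNAL table removes only the metadata;
-- the files under LOCATION stay on the file system.
DROP TABLE page_view_ext;
```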
@@ -69, +73 @@

  You must specify a list of columns for tables that use a native SerDe. Refer to the Types
part of the User Guide for the allowable column types. A list of columns for tables that use
a custom SerDe may be specified but Hive will query the SerDe to determine the actual list
of columns for this table.
  Use STORED AS TEXTFILE if the data needs to be stored as plain text files. Use STORED AS
SEQUENCEFILE if the data needs to be compressed. Please read more about [[Hive/CompressedStorage]]
if you are planning to keep data compressed in your Hive tables.  Use INPUTFORMAT and OUTPUTFORMAT
to specify the name of a corresponding InputFormat and OutputFormat class as a string literal,
e.g. 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'.
+ Use STORED BY to create a non-native table, for example in HBase.  See [[Hive/StorageHandlers]]
for more information on this option.
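
A sketch of what a STORED BY table might look like, assuming the HBase storage handler that ships with Hive's HBase integration. The Hive table name, the HBase table name, and the column family mapping are hypothetical; the exact form of the mapping property can differ between versions, so check [[Hive/StorageHandlers]] before relying on it.

```sql
-- Hypothetical names: hbase_table_1, column family cf1, HBase table xyz.
-- The handler class decides how rows are read and written, so no
-- ROW FORMAT or STORED AS clause is given here.
CREATE TABLE hbase_table_1 (key INT, value STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");
```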
  Partitioned tables can be created using the PARTITIONED BY clause. A table can have one
or more partition columns and a separate data directory is created for each distinct value
combination in the partition columns. Further, tables or partitions can be bucketed using
CLUSTERED BY columns, and data can be sorted within each bucket via SORTED BY columns. This
can improve performance on certain kinds of queries.
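
The partitioning and bucketing clauses can be combined in one statement. The following is an illustrative sketch only; the table name, columns, bucket count, and delimiter are assumptions rather than anything prescribed by this change.

```sql
-- Hypothetical page_view table, partitioned by date and country,
-- bucketed by userid into 32 buckets, sorted by viewTime within each bucket.
CREATE TABLE page_view (
  viewTime INT,
  userid BIGINT,
  page_url STRING,
  ip STRING COMMENT 'IP address of the user'
)
COMMENT 'This is the page view table'
PARTITIONED BY (dt STRING, country STRING)
CLUSTERED BY (userid) SORTED BY (viewTime) INTO 32 BUCKETS
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\001'
STORED AS SEQUENCEFILE;
```

Note that the partition columns (dt, country) are declared only in PARTITIONED BY, not in the main column list. Each distinct (dt, country) combination gets its own data directory under the table's location.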
