hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/LanguageManual/DDL" by Ning Zhang
Date Fri, 09 Oct 2009 23:48:55 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/LanguageManual/DDL" page has been changed by Ning Zhang:
http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL?action=diff&rev1=18&rev2=19

  
  Table names and column names are case insensitive but SerDe and property names are case
sensitive.
  
- Tables can also be created and populated by the results of a query in one CTAS (create-table-as-select)
statement. There are two parts in CTAS, the select part can be any select statement supported
by HiveQL. The create part of the CTAS takes the result schema (column names are aliases in
the select clause and data types are derived from the select expressions) from the select
part and create the target table with other properties (such as SerDe and storage format).
The only restrictions in the CTAS is that the target table cannot be a partitioned table nor
an external table. In addition, the table created by CTAS is atomic, meaning that the table
is not seen by other users until all the result of the SELECT part is finished and populated.
So other users will either see the table with the total results or will not see the table
at all.
+ Tables can also be created and populated by the results of a query in one CTAS (create-table-as-select)
statement. The table created by CTAS is atomic, meaning that the table is not seen by other
users until all the query results are populated. So other users will either see the table
with the complete results of the query or will not see the table at all.
+ 
+ CTAS has two parts. The SELECT part can be any [[Hive/LanguageManual/Select|SELECT
statement]] supported by HiveQL. The CREATE part takes the resulting schema from
the SELECT part and creates the target table with other table properties such as the SerDe
and storage format. The only restriction in CTAS is that the target table cannot be a partitioned
table or an external table. 
  
  Examples:
  
@@ -132, +134 @@

  SORT BY new_key, key_value_pair;
  }}}
  
- The above CTAS statement create the target table new_key_value_store with the schema derived
from the results of the SELECT statement. So the schema of the table new_key_value_store will
be (new_key DOUBLE, key_value_pair STRING). In addition, the new target table is using a specific
SerDe and a storage format independent of the source tables in the SELECT statement. 
+ The above CTAS statement creates the target table new_key_value_store with the schema (new_key
DOUBLE, key_value_pair STRING), derived from the results of the SELECT statement. If the SELECT
statement does not specify column aliases, the column names are automatically assigned
as _col0, _col1, _col2, etc. In addition, the new target table is created using a specific
SerDe and a storage format independent of the source tables in the SELECT statement. 
  
  ==== Inserting Data Into Bucketed Tables ====
  The CLUSTERED BY and SORTED BY creation commands do not affect how data is inserted into a
table -- only how it is read.  This means that users must take care to insert data correctly by
setting the number of reducers equal to the number of buckets, and using CLUSTER
BY and SORT BY commands in their query.
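
A minimal sketch of the insert pattern described in the paragraph above. The table names (user_info, user_info_bucketed), the columns, and the bucket count of 32 are hypothetical and chosen only for illustration. It also assumes that the reducer count is controlled through the Hadoop property mapred.reduce.tasks, which may differ on other Hadoop versions.

```sql
-- Hypothetical target table: 32 buckets on user_id, sorted within each bucket.
CREATE TABLE user_info_bucketed (user_id BIGINT, firstname STRING, lastname STRING)
CLUSTERED BY (user_id) SORTED BY (user_id) INTO 32 BUCKETS;

-- Assumption: one reducer per bucket, so the reducer count is set to match INTO 32 BUCKETS.
SET mapred.reduce.tasks = 32;

-- CLUSTER BY distributes rows to reducers by user_id and sorts them by user_id
-- within each reducer, mirroring the table's CLUSTERED BY / SORTED BY columns.
INSERT OVERWRITE TABLE user_info_bucketed
SELECT user_id, firstname, lastname
FROM user_info
CLUSTER BY user_id;
```

If the bucket count in the table definition changes, the reducer setting has to be changed with it. Nothing in the table definition enforces the match at insert time.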
