hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/LanguageManual/DML" by Ning Zhang
Date Thu, 15 Apr 2010 16:50:57 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/LanguageManual/DML" page has been changed by Ning Zhang.
http://wiki.apache.org/hadoop/Hive/LanguageManual/DML?action=diff&rev1=14&rev2=15

--------------------------------------------------

  FROM from_statement
  INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1
  [INSERT OVERWRITE TABLE tablename2 [PARTITION ...] select_statement2] ...
+ 
+ Hive extension (dynamic partition inserts):
+ INSERT OVERWRITE TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...) select_statement FROM from_statement
  }}}
  
  ===== Synopsis =====
@@ -60, +63 @@

   * Multiple insert clauses (also known as ''Multi Table Insert'') can be specified in the
same query
   * The output of each of the select statements is written to the chosen table (or partition).
Currently the OVERWRITE keyword is mandatory and implies that the contents of the chosen table
or partition are replaced with the output of corresponding select statement.
   * The output format and serialization class are determined by the table's metadata (as specified
via DDL commands on the table)
+  * In dynamic partition inserts, users can give a partial partition specification: the
PARTITION clause lists the partition column names, and the column values are optional. A
partition column that is given a value is called a static partition column; one without a
value is a dynamic partition column. Each dynamic partition column has a corresponding input
column from the select statement, so the partitions that get created are determined by the
values of that input column.
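
As an illustration of the dynamic partition bullet above, here is a minimal sketch. The table and column names (page_view_staging, page_view, viewtime, userid, page_url, dt, country) are hypothetical and not part of the page. The sketch assumes that page_view is partitioned by (dt STRING, country STRING) and that dynamic partitioning has to be switched on through the hive.exec.dynamic.partition property, which may differ between Hive versions.

{{{
-- Assumption: dynamic partitioning is disabled by default and must be enabled.
SET hive.exec.dynamic.partition=true;

-- dt is a static partition column (a value is given);
-- country is a dynamic partition column (no value is given).
-- The last select column (pvs.country) supplies the value of country,
-- so one partition is written per distinct country in the input.
FROM page_view_staging pvs
INSERT OVERWRITE TABLE page_view PARTITION (dt='2010-04-15', country)
SELECT pvs.viewtime, pvs.userid, pvs.page_url, pvs.country
WHERE pvs.dt = '2010-04-15';
}}}

In this sketch the static column comes before the dynamic one in the PARTITION clause, and the dynamic column's input comes last in the select list, because Hive matches dynamic partition columns to the trailing select columns by position rather than by name.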
  
  ===== Notes =====
   * Multi Table Inserts minimize the number of data scans required. Hive can insert data
into multiple tables by scanning the input data just once and applying different query operators
to it.
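
A minimal sketch of such a multi table insert follows. The tables (page_view_staging, page_view_us, page_url_counts) and their columns are hypothetical and chosen only for illustration; page_view_us is assumed to be partitioned by dt.

{{{
-- One scan of page_view_staging feeds both insert clauses.
FROM page_view_staging pvs
-- First clause: filter rows into a single static partition.
INSERT OVERWRITE TABLE page_view_us PARTITION (dt='2010-04-15')
  SELECT pvs.viewtime, pvs.userid, pvs.page_url
  WHERE pvs.country = 'US'
-- Second clause: aggregate the same input into another table.
INSERT OVERWRITE TABLE page_url_counts
  SELECT pvs.page_url, count(1)
  GROUP BY pvs.page_url;
}}}

Each insert clause applies its own operators (a filter in the first, an aggregation in the second) to the shared input, which is why the source only needs to be read once.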
