hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/Tutorial" by PrasadChakka
Date Tue, 10 Mar 2009 22:31:29 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by PrasadChakka:
http://wiki.apache.org/hadoop/Hive/Tutorial

------------------------------------------------------------------------------
      COMMENT 'This is the page view table' 
      PARTITIONED BY(dt STRING, country STRING) 
      ROW FORMAT DELIMITED
-             FIELDS TERMINATED BY '\001' 
+             FIELDS TERMINATED BY '1' 
-             LINES TERMINATED BY '\012' 
+             LINES TERMINATED BY '12' 
      STORED AS SEQUENCEFILE; 
  }}}
  
@@ -217, +217 @@

      PARTITIONED BY(dt STRING, country STRING) 
      CLUSTERED BY(userid) SORTED BY(viewTime) INTO 32 BUCKETS 
      ROW FORMAT DELIMITED
-             FIELDS TERMINATED BY '\001' 
+             FIELDS TERMINATED BY '1' 
-             COLLECTION ITEMS TERMINATED BY '\002' 
+             COLLECTION ITEMS TERMINATED BY '2' 
-             MAP KEYS TERMINATED BY '\003' 
+             MAP KEYS TERMINATED BY '3' 
-             LINES TERMINATED BY '\012' 
+             LINES TERMINATED BY '12' 
      STORED AS SEQUENCEFILE; 
  }}}
  In the example above, the table is bucketed (clustered) by userid, and within each bucket
the data is sorted in increasing order of viewTime. This organization allows the user
to sample efficiently on the clustered column, in this case userid. The sorting property
lets internal operators exploit the known ordering of the data while evaluating
queries, which also improves efficiency.
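  As a rough illustration of the sampling that bucketing makes possible, a query along the
following lines asks for one of the 32 buckets rather than the whole partition. This is a
sketch only: the table name page_view, the alias s, and the partition values are assumptions
made for illustration and do not appear in the excerpt above.

```sql
-- Hypothetical table name (page_view) and partition values; adjust to the real schema.
-- Sampling ON the CLUSTERED BY column (userid) with the table's own bucket count (32)
-- should let Hive read a single bucket instead of scanning every row.
SELECT s.*
FROM page_view TABLESAMPLE(BUCKET 3 OUT OF 32 ON userid) s
WHERE s.dt = '2008-06-08'      -- assumed partition value
  AND s.country = 'US';        -- assumed partition value
```

  Sampling on a column other than the clustering column, or with a bucket count unrelated
to the table's, would generally fall back to scanning the data and filtering it.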
@@ -233, +233 @@

      PARTITIONED BY(dt STRING, country STRING) 
      CLUSTERED BY(userid) SORTED BY(viewTime) INTO 32 BUCKETS 
      ROW FORMAT DELIMITED
-             FIELDS TERMINATED BY '\001' 
+             FIELDS TERMINATED BY '1' 
-             COLLECTION ITEMS TERMINATED BY '\002' 
+             COLLECTION ITEMS TERMINATED BY '2' 
-             MAP KEYS TERMINATED BY '\003' 
+             MAP KEYS TERMINATED BY '3' 
-             LINES TERMINATED BY '\012' 
+             LINES TERMINATED BY '12' 
      STORED AS SEQUENCEFILE; 
  }}}
  
@@ -290, +290 @@

                      ip STRING COMMENT 'IP Address of the User', 
                      country STRING COMMENT 'country of origination') 
      COMMENT 'This is the staging page view table' 
-     ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054' LINES TERMINATED BY '\012' 
+     ROW FORMAT DELIMITED FIELDS TERMINATED BY '54' LINES TERMINATED BY '12' 
      STORED AS TEXTFILE 
      LOCATION '/user/data/stagging/page_view'; 
  
