pig-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Pig Wiki] Update of "PigUserCookbook" by OlgaN
Date Fri, 13 Feb 2009 19:11:39 GMT

The following page has been changed by OlgaN:
http://wiki.apache.org/pig/PigUserCookbook

------------------------------------------------------------------------------
  
  One case where pushing filters up might not be a good idea is when the cost of applying the
filter is very high and only a small amount of data is filtered out.
  
+ '''Reduce Your Operator Pipeline'''
+ 
+ For clarity, you might choose to split your projections into several steps, for
example:
+ 
+ {{{
+ A = load 'data' as (in: map[]);
+ -- look up the values stored under keys k1 and k2 in the map
+ B = foreach A generate in#k1 as k1, in#k2 as k2;
+ -- concatenate the two values
+ C = foreach B generate CONCAT(k1, k2);
+ .......
+ }}}
+ 
+ While the example above is easier to read, you might want to consider combining the two
foreach statements to improve your query performance:
+ 
+ {{{
+ A = load 'data' as (in: map[]);
+ -- look up both values in the map and concatenate them in one step
+ B = foreach A generate CONCAT(in#k1, in#k2);
+ ....
+ }}}
+ 
+ The same goes for filters.
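A minimal sketch of what this could look like for filters, assuming a hypothetical input with two integer fields x and y (the schema, field names, and thresholds are illustrative and do not come from the original page). Two consecutive filter statements such as:

{{{
-- hypothetical schema, for illustration only
A = load 'data' as (x: int, y: int);
-- first condition
B = filter A by x > 0;
-- second condition, applied in a separate step
C = filter B by y < 10;
}}}

could be combined into a single statement with a compound condition:

{{{
-- hypothetical schema, for illustration only
A = load 'data' as (x: int, y: int);
-- both conditions applied in one filter
B = filter A by x > 0 and y < 10;
}}}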
+ 
  '''Drop Nulls Before a Join'''
  
  This comment only applies to Pig on the types branch, as Pig 0.1.0 does not have nulls.
