pig-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Pig Wiki] Update of "PigUserCookbook" by OlgaN
Date Fri, 13 Feb 2009 19:22:47 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The following page has been changed by OlgaN:

  The same goes for filters.
+ '''Make your UDFs Algebraic'''
+ Queries that can take advantage of the combiner generally run much faster (sometimes several
times faster) than versions that don't. The latest code significantly improves combiner
usage; however, you need to make sure you do your part. If you have a UDF that works on grouped
data and is, by nature, algebraic (meaning its computation can be decomposed into multiple
steps), make sure you implement it as such. For details on how to write algebraic UDFs, see
+ {{{
+ A = load 'data' as (x, y, z);
+ B = group A by x;
+ C = foreach B generate group, MyUDF(A);
+ ....
+ }}}
+ If `MyUDF` is algebraic, the query will use the combiner and run much faster. You can run the
`explain` command on your query to make sure that the combiner is used.
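To make "decomposed into multiple steps" concrete, here is a minimal sketch of an average computed in three stages: an initial step on the map side, an intermediate step in the combiner, and a final step in the reducer. It is plain Java and deliberately does not use Pig's UDF API; the class `AlgebraicAvgSketch` and all method names are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AlgebraicAvgSketch {
    /** Partial result that can be merged in any grouping: a running sum and count. */
    static final class Partial {
        final double sum;
        final long count;
        Partial(double sum, long count) { this.sum = sum; this.count = count; }
    }

    /** Initial step (map side): turn one raw value into a partial result. */
    static Partial initial(double value) {
        return new Partial(value, 1);
    }

    /** Intermediate step (combiner): merge partials into one partial.
     *  It may run zero or more times, so input and output share a type. */
    static Partial intermed(List<Partial> partials) {
        double sum = 0;
        long count = 0;
        for (Partial p : partials) {
            sum += p.sum;
            count += p.count;
        }
        return new Partial(sum, count);
    }

    /** Final step (reducer): merge the remaining partials and produce the average. */
    static double finalStep(List<Partial> partials) {
        Partial total = intermed(partials);
        return total.count == 0 ? Double.NaN : total.sum / total.count;
    }

    /** Helper: apply the initial step to every value seen by one map task. */
    static List<Partial> initialAll(double... values) {
        List<Partial> out = new ArrayList<>();
        for (double v : values) {
            out.add(initial(v));
        }
        return out;
    }

    public static void main(String[] args) {
        // Two map tasks see different slices of the same group.
        Partial a = intermed(initialAll(1, 2, 3)); // combined to (6, 3)
        Partial b = intermed(initialAll(4, 5));    // combined to (9, 2)
        System.out.println(finalStep(Arrays.asList(a, b))); // prints 3.0
    }
}
```

The point of the decomposition is that each map task ships one small `(sum, count)` pair per group instead of every raw value, which is where the combiner's speedup comes from. A function such as median has no comparably small partial result, so it is not algebraic in this sense.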
  '''Drop Nulls Before a Join'''
  This comment only applies to Pig on the types branch, as Pig 0.1.0 does not have nulls.
