pig-dev mailing list archives

From Dmitriy Ryaboy <dvrya...@gmail.com>
Subject algebraic optimization not invoked for filter following group?
Date Wed, 02 Jun 2010 23:18:09 GMT
It looks like the combiner optimization currently does not kick in for a
script like this:

data = load 'foo' using PigStorage() as (a, b, c);
grouped = group data by a;
filtered = filter grouped by COUNT(data) < 1000;

Looking at the code in CombinerOptimizer, it seems the Filter case is only
pseudo-coded in comments. Are there complications there beyond what is
already noted, or is it just a matter of coding up the pseudo-code?

On that note -- assuming the optimization were implemented for a Filter
following a group, would it automagically start working for Splits as well?
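
For comparison, here is a sketch of a rewrite that, as far as I understand,
should let the combiner be used today, since the algebraic COUNT sits in a
non-nested foreach directly after the group. The aliases "counted" and
"small" and the field name "cnt" are made up for illustration. Note that it
is not equivalent to the script above: it keeps only the key and the count,
not the grouped bag.

```
data = load 'foo' using PigStorage() as (a, b, c);
grouped = group data by a;
-- COUNT is algebraic, so this foreach is a candidate for combiner evaluation
counted = foreach grouped generate group as a, COUNT(data) as cnt;
-- filter on the precomputed count rather than on the bag itself
small = filter counted by cnt < 1000;
```

This only helps when the bag contents are not needed downstream, which is
why having the optimization on the Filter itself would still be useful.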

