hadoop-pig-dev mailing list archives

From "Ajay Garg (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-296) UDF for cumulative statistics
Date Tue, 08 Jul 2008 10:20:32 GMT
UDF for cumulative statistics
-----------------------------

                 Key: PIG-296
                 URL: https://issues.apache.org/jira/browse/PIG-296
             Project: Pig
          Issue Type: Improvement
            Reporter: Ajay Garg
            Priority: Minor


UDF for computing cumulative sum, row number, rank, and dense rank over a column. See
http://twiki.corp.yahoo.com/view/YResearch/PigStatisticsCumulative for a detailed description.

Usage:
A = load 'data' using PigStorage as ( query, freq );
B = group A all;
C = foreach B {
    Ordered = order A by freq using numeric.OrderDescending;
    generate
        -- Pig columns are 0-indexed, so offset 1 refers to the column freq
        statistics.CUMULATIVE_COLUMN(Ordered, 1) as
                ( query, freq, freq_cumulative_sum, freq_row, freq_rank, freq_dense_rank );
};
D = foreach C generate FLATTEN(A);
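
To illustrate what the four appended columns would presumably contain, here is a minimal Python sketch. It is not the UDF's implementation. It assumes SQL-style semantics (ties share a rank, rank leaves gaps after ties, dense rank does not), which the issue itself does not spell out. The function name cumulative_columns and the sample rows are hypothetical.

```python
def cumulative_columns(rows, col):
    """Append (cumulative_sum, row, rank, dense_rank) to each row.

    rows: list of tuples, already sorted in the desired order.
    col:  0-based offset of the numeric column to accumulate.
    Assumes SQL-style tie handling; the real UDF may differ.
    """
    out = []
    running = 0          # cumulative sum so far
    rank = 0             # rank with gaps after ties
    dense = 0            # rank without gaps
    prev = object()      # sentinel that never equals a real value
    for i, row in enumerate(rows, start=1):
        value = row[col]
        running += value
        if value != prev:
            rank = i     # a new value takes the current row number as rank
            dense += 1   # dense rank only advances on a new value
            prev = value
        out.append(row + (running, i, rank, dense))
    return out


# Hypothetical sample data, ordered by freq descending as in the script above.
ordered = [("a", 10), ("b", 7), ("c", 7), ("d", 2)]
for r in cumulative_columns(ordered, 1):
    print(r)
# ('a', 10, 10, 1, 1, 1)
# ('b', 7, 17, 2, 2, 2)
# ('c', 7, 24, 3, 2, 2)
# ('d', 2, 26, 4, 4, 3)
```

The two rows with freq 7 share rank 2. The next row gets rank 4 but dense rank 3, which is the only difference between the two ranking columns.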


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

