datafu-dev mailing list archives

From "Sam Steingold (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DATAFU-39) RFE: BagSum
Date Tue, 29 Apr 2014 18:46:15 GMT

    [ https://issues.apache.org/jira/browse/DATAFU-39?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984647#comment-13984647 ]

Sam Steingold commented on DATAFU-39:
-------------------------------------

Thanks!
What I find unfortunate is that I seem to be unable to do this in a single step
(without the intermediate data1..3).
Should I be worried about this?

> RFE: BagSum
> -----------
>
>                 Key: DATAFU-39
>                 URL: https://issues.apache.org/jira/browse/DATAFU-39
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: Sam Steingold
>
> I need a new function {{BagSum}} which would help me solve the problem described in [http://stackoverflow.com/questions/22945236/how-do-i-accumulate-vectors-into-a-map].
> Test case:
> {code}
>   /**
>   
>   define BagSum datafu.pig.bags.BagSum();
>   
>   data = LOAD 'input' AS (id:int, key:chararray, val:int);
>   describe data;
>   
>   data2 = FOREACH (GROUP data BY id) GENERATE group as id, BagSum(data.(key,val),data.key) as keys;
>   describe data2;
>   
>   STORE data2 INTO 'output';
>    */
>   @Multiline
>   private String bagSumTest;
>   
>   @Test
>   public void bagSumTest() throws Exception
>   {
>     PigTest test = createPigTestFromString(bagSumTest);
>     writeLinesToFile("input", "(1,A,1)","(1,B,2)","(2,A,3)","(3,A,4)","(1,C,5)","(1,C,6)",
>                      "(3,A,7)","(2,B,8)","(1,A,9)","(2,A,10)");
>     test.runScript();
>     assertOutput(test, "data2", "(1,{(A,10),(B,2),(C,11)})",
>                  "(2,{(A,13),(B,8)})","(3,{(A,11)})");
>   }
> {code}
> Thanks.
> (alternatively, please tell me how to implement this using existing features)
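
For reference, the aggregation that the quoted test case expects can be sketched in plain Java. This is only an illustration of the intended semantics (sum val per key within each id), not DataFu or Pig code; the names BagSumSketch, Row, and bagSum are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of DataFu or Pig.
class BagSumSketch {

    // One input tuple, mirroring the (id:int, key:chararray, val:int) schema.
    static final class Row {
        final int id;
        final String key;
        final int val;

        Row(int id, String key, int val) {
            this.id = id;
            this.key = key;
            this.val = val;
        }
    }

    // For each id, sum val per key. TreeMap keeps ids and keys sorted,
    // which makes the printed result easy to compare with the test case.
    static Map<Integer, Map<String, Integer>> bagSum(List<Row> rows) {
        Map<Integer, Map<String, Integer>> out = new TreeMap<>();
        for (Row r : rows) {
            out.computeIfAbsent(r.id, k -> new TreeMap<>())
               .merge(r.key, r.val, Integer::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        // Same input tuples as the writeLinesToFile call in the test case.
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(1, "A", 1));
        rows.add(new Row(1, "B", 2));
        rows.add(new Row(2, "A", 3));
        rows.add(new Row(3, "A", 4));
        rows.add(new Row(1, "C", 5));
        rows.add(new Row(1, "C", 6));
        rows.add(new Row(3, "A", 7));
        rows.add(new Row(2, "B", 8));
        rows.add(new Row(1, "A", 9));
        rows.add(new Row(2, "A", 10));

        // prints {1={A=10, B=2, C=11}, 2={A=13, B=8}, 3={A=11}}
        System.out.println(bagSum(rows));
    }
}
```

The per-id sums match the assertOutput lines in the test case: (1,{(A,10),(B,2),(C,11)}), (2,{(A,13),(B,8)}), (3,{(A,11)}).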



--
This message was sent by Atlassian JIRA
(v6.2#6252)
