datafu-dev mailing list archives

From "Sam (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DATAFU-39) RFE: BagSum
Date Fri, 25 Apr 2014 20:45:15 GMT

     [ https://issues.apache.org/jira/browse/DATAFU-39?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam updated DATAFU-39:
----------------------

    Description: 
I need a new function {{BagSum}} which would help me solve the problem described in [http://stackoverflow.com/questions/22945236/how-do-i-accumulate-vectors-into-a-map].
Test case:
{code}
  /**
  define BagSum datafu.pig.bags.BagSum();
  
  data = LOAD 'input' AS (id:int, key:chararray, val:int);
  describe data;
  
  data2 = FOREACH (GROUP data BY id) GENERATE group as id, BagSum(data.(key,val),data.key) as keys;
  describe data2;
  
  STORE data2 INTO 'output';

   */
  @Multiline
  private String bagSumTest;
  
  @Test
  public void bagSumTest() throws Exception
  {
    PigTest test = createPigTestFromString(bagSumTest);
    writeLinesToFile("input", "(1,A,1)","(1,B,2)","(2,A,3)","(3,A,4)","(1,C,5)","(1,C,6)",
                     "(3,A,7)","(2,B,8)","(1,A,9)","(2,A,10)");
    test.runScript();
    assertOutput(test, "data2", "(1,{(A,10),(B,2),(C,11)})",
                 "(2,{(A,13),(B,8)})","(3,{(A,11)})");
  }
{code}
Thanks.
(alternatively, please tell me how to implement this using existing features)
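
The expected output in the test case above amounts to a sum of val per key within each id. A minimal sketch of that semantics in plain Java follows. It assumes nothing about how {{BagSum}} would actually be implemented in DataFu or Pig; the class name BagSumSketch, the Row type, and the sumPerKey method are hypothetical names introduced only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

public class BagSumSketch {
    /** One input row, mirroring the (id:int, key:chararray, val:int) schema of the test. */
    static final class Row {
        final int id;
        final String key;
        final int val;

        Row(int id, String key, int val) {
            this.id = id;
            this.key = key;
            this.val = val;
        }
    }

    /**
     * Hypothetical helper: groups rows by id, then sums val per key
     * inside each group. TreeMap keeps ids and keys sorted for printing.
     */
    static Map<Integer, Map<String, Long>> sumPerKey(Row... rows) {
        Map<Integer, Map<String, Long>> result = new TreeMap<>();
        for (Row r : rows) {
            result.computeIfAbsent(r.id, id -> new TreeMap<>())
                  .merge(r.key, (long) r.val, Long::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        // Same ten rows that the test writes to 'input'.
        Map<Integer, Map<String, Long>> out = sumPerKey(
            new Row(1, "A", 1), new Row(1, "B", 2), new Row(2, "A", 3),
            new Row(3, "A", 4), new Row(1, "C", 5), new Row(1, "C", 6),
            new Row(3, "A", 7), new Row(2, "B", 8), new Row(1, "A", 9),
            new Row(2, "A", 10));
        System.out.println(out);
    }
}
```

On the ten input rows from the test, the totals match what assertOutput expects: id 1 gives A=10, B=2, C=11; id 2 gives A=13, B=8; id 3 gives A=11.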

  was:
I need a new function {{BagSum}} which would help me solve the problem described in [http://stackoverflow.com/questions/22945236/how-do-i-accumulate-vectors-into-a-map].
Test case:
{code}
  /**
  define BagSum datafu.pig.bags.BagSum();
  
  data = LOAD 'input' AS (id:int, key:chararray, val:int);
  describe data;
  
  data2 = FOREACH (GROUP data BY id) GENERATE group as id, BagSum(data.(key,val),data.key) as keys;
  describe data2;
  
  STORE data2 INTO 'output';

   */
  @Multiline
  private String bagSumTest;
  
  @Test
  public void bagSumTest() throws Exception
  {
    PigTest test = createPigTestFromString(bagSumTest);
    writeLinesToFile("input", "(1,A,1)","(1,B,2)","(2,A,3)","(3,A,4)","(1,C,5)","(1,C,6)",
                     "(3,A,7)","(2,B,8)","(1,A,9)","(2,A,10)");
    test.runScript();
    assertOutput(test, "data2", "(1,{(A,10),(B,2),(C,11)})",
                 "(2,{(A,13),(B,8)})","(3,{(A,11)})");
  }
{code}


> RFE: BagSum
> -----------
>
>                 Key: DATAFU-39
>                 URL: https://issues.apache.org/jira/browse/DATAFU-39
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: Sam
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)
