datafu-dev mailing list archives

From "Matthew Hayes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DATAFU-114) Make FirstTupleFromBag implement Accumulator
Date Fri, 05 Feb 2016 00:01:39 GMT

    [ https://issues.apache.org/jira/browse/DATAFU-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133366#comment-15133366 ]

Matthew Hayes commented on DATAFU-114:
--------------------------------------

Sorry for the late response. The change looks reasonable to me. There should be a test for
this, though (understandably there isn't one, since you couldn't build it). I went ahead and
wrote one below. If this test looks reasonable to you, I'll commit both pieces of code. I'm
taking a look at DATAFU-95.

{code}
  @Test
  public void firstTupleFromBagAccumulateTest() throws Exception
  {
    TupleFactory tf = TupleFactory.getInstance();
    BagFactory bf = BagFactory.getInstance();

    FirstTupleFromBag op = new FirstTupleFromBag();

    Tuple defaultValue = tf.newTuple(1000);

    // Three accumulate calls, one bag each: the tuple from the first call is expected.
    op.accumulate(tf.newTuple(Arrays.asList(bf.newDefaultBag(Arrays.asList(tf.newTuple(4))), defaultValue)));
    op.accumulate(tf.newTuple(Arrays.asList(bf.newDefaultBag(Arrays.asList(tf.newTuple(9))), defaultValue)));
    op.accumulate(tf.newTuple(Arrays.asList(bf.newDefaultBag(Arrays.asList(tf.newTuple(16))), defaultValue)));
    assertEquals(op.getValue(), tf.newTuple(4));
    op.cleanup();

    // After cleanup, state from the previous group must not leak into the next one.
    op.accumulate(tf.newTuple(Arrays.asList(bf.newDefaultBag(Arrays.asList(tf.newTuple(11))), defaultValue)));
    op.accumulate(tf.newTuple(Arrays.asList(bf.newDefaultBag(Arrays.asList(tf.newTuple(17))), defaultValue)));
    op.accumulate(tf.newTuple(Arrays.asList(bf.newDefaultBag(Arrays.asList(tf.newTuple(5))), defaultValue)));
    assertEquals(op.getValue(), tf.newTuple(11));
    op.cleanup();

    // An empty bag should yield the default value.
    op.accumulate(tf.newTuple(Arrays.asList(bf.newDefaultBag(), defaultValue)));
    assertEquals(op.getValue(), defaultValue);
    op.cleanup();
  }
{code}

> Make FirstTupleFromBag implement Accumulator
> --------------------------------------------
>
>                 Key: DATAFU-114
>                 URL: https://issues.apache.org/jira/browse/DATAFU-114
>             Project: DataFu
>          Issue Type: Improvement
>    Affects Versions: 1.3.0
>         Environment: All
>            Reporter: Eyal Allweil
>            Priority: Minor
>              Labels: easyfix, newbie, performance
>         Attachments: FirstTupleFromBag.java
>
>
> FirstTupleFromBag only needs the first tuple from the bag, but because it doesn't implement
> Accumulator the entire bag needs to be passed to it in-memory. The fix is very minor and will
> make the UDF support large bags.
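
To illustrate why an accumulate-style interface helps here, the following is a minimal, dependency-free sketch. It is not the attached FirstTupleFromBag.java and does not use Pig's classes. It assumes the bag is delivered in chunks through repeated accumulate calls and mirrors the accumulate / getValue / cleanup lifecycle exercised by the test in the comment. The class name FirstTupleSketch, the use of List<Object> as a stand-in for a tuple, and the two-argument accumulate signature are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for illustration only; not Pig's or DataFu's actual classes.
final class FirstTupleSketch {
    private List<Object> first;     // first tuple seen so far, or null if none yet
    private List<Object> fallback;  // default value to return if every chunk was empty

    // Called once per chunk of the bag. Only the first tuple is ever retained,
    // so memory use does not grow with the size of the bag.
    void accumulate(Iterable<List<Object>> chunk, List<Object> defaultValue) {
        fallback = defaultValue;
        if (first != null) {
            return; // already have the answer; ignore later chunks
        }
        for (List<Object> tuple : chunk) {
            first = tuple;
            break;
        }
    }

    // Returns the first tuple seen, or the default if no tuple was seen.
    List<Object> getValue() {
        return first != null ? first : fallback;
    }

    // Resets state so the same instance can be reused for the next group.
    void cleanup() {
        first = null;
        fallback = null;
    }
}
```

In this sketch, later chunks are skipped as soon as one tuple has been captured, which is the property that lets the UDF handle bags too large to hold in memory at once.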



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
