hadoop-pig-dev mailing list archives

From "Benjamin Reed (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-26) distinct does not work on Bags that have spilled to disk.
Date Fri, 09 Nov 2007 20:04:50 GMT

    [ https://issues.apache.org/jira/browse/PIG-26?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541424 ]

Benjamin Reed commented on PIG-26:
----------------------------------

Yes, we definitely need an independent sweep. We should probably open another ticket for that.

As far as testing goes, the first patch to the unit tests causes a failure in the current
code base and after the distinct.patch is applied.

Here is the fix without the formatting changes:

     @Override
     public void distinct() {
-        sort(null, true);
+        sort(new StarSpec(), true);
         isSorted = true;
     }
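
To illustrate why the null spec matters once a bag has spilled, here is a minimal, self-contained sketch. It is not the Pig code: KeySpec, WholeTupleSpec and mergeDistinct are hypothetical stand-ins, and it assumes that the merge comparator dereferences the spec on every comparison and that StarSpec presumably means "use the whole tuple as the key". The sketch also pushes every element into one PriorityQueue rather than doing a streaming merge of the spill files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class NullSpecSketch {
    /** Hypothetical stand-in for a sort spec: maps a tuple to its sort key. */
    interface KeySpec {
        String keyOf(String tuple);
    }

    /** Hypothetical stand-in for StarSpec: the whole tuple is the key. */
    static final class WholeTupleSpec implements KeySpec {
        @Override
        public String keyOf(String tuple) {
            return tuple;
        }
    }

    /** Merges the given runs through a heap and drops tuples with equal keys. */
    static List<String> mergeDistinct(List<List<String>> runs, final KeySpec spec) {
        // The comparator dereferences spec, so a null spec throws as soon as
        // the heap has to compare two elements (the second add at the latest).
        Comparator<String> byKey = new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return spec.keyOf(a).compareTo(spec.keyOf(b));
            }
        };
        PriorityQueue<String> heap = new PriorityQueue<String>(11, byKey);
        for (List<String> run : runs) {
            heap.addAll(run);
        }
        List<String> out = new ArrayList<String>();
        String last = null;
        while (!heap.isEmpty()) {
            String next = heap.poll();
            // Equal keys come out of the heap next to each other, so comparing
            // with the previous output is enough to drop duplicates.
            if (last == null || byKey.compare(last, next) != 0) {
                out.add(next);
                last = next;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> runs = Arrays.asList(
                Arrays.asList("a", "c"),
                Arrays.asList("a", "b"));
        System.out.println(mergeDistinct(runs, new WholeTupleSpec())); // prints [a, b, c]
        try {
            mergeDistinct(runs, null);
        } catch (NullPointerException e) {
            System.out.println("null spec fails inside the heap comparator");
        }
    }
}
```

In this sketch the failure surfaces while the heap is being filled, which has the same shape as the PriorityQueue.add frames in the stack trace quoted below.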

> distinct does not work on Bags that have spilled to disk.
> ---------------------------------------------------------
>
>                 Key: PIG-26
>                 URL: https://issues.apache.org/jira/browse/PIG-26
>             Project: Pig
>          Issue Type: Bug
>          Components: data
>    Affects Versions: 0.0.0, 0.1.0, site
>            Reporter: Benjamin Reed
>            Assignee: Benjamin Reed
>         Attachments: distinct-test.patch, distinct.patch
>
>
> If you call distinct on a bag that has spilled to disk, you get the following error:
> java.lang.NullPointerException
>         at org.apache.pig.data.BigDataBag$FileMerger$1.compare(BigDataBag.java:288)
>         at org.apache.pig.data.BigDataBag$FileMerger$1.compare(BigDataBag.java:280)
>         at java.util.PriorityQueue.siftUpUsingComparator(PriorityQueue.java:594)
>         at java.util.PriorityQueue.siftUp(PriorityQueue.java:572)
>         at java.util.PriorityQueue.offer(PriorityQueue.java:274)
>         at java.util.PriorityQueue.add(PriorityQueue.java:251)
>         at org.apache.pig.data.BigDataBag$FileMerger.<init>(BigDataBag.java:304)
>         at org.apache.pig.data.BigDataBag.doSorting(BigDataBag.java:167)
>         at org.apache.pig.data.BigDataBag.content(BigDataBag.java:211)
>         at org.apache.pig.test.TestDataModel.testBigDataBag(TestDataModel.java:343)
>         at org.apache.pig.test.TestDataModel.testBigDataBagOnDisk(TestDataModel.java:210)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

