pig-dev mailing list archives

From "Cheolsoo Park (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-3466) Race Conditions in InternalDistinctBag during proactive spill
Date Wed, 18 Sep 2013 15:11:54 GMT

    [ https://issues.apache.org/jira/browse/PIG-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770855#comment-13770855 ]

Cheolsoo Park commented on PIG-3466:
------------------------------------

The patch includes several whitespace changes. I uploaded it to Review Board for your convenience:
https://reviews.apache.org/r/14206/
                
> Race Conditions in InternalDistinctBag during proactive spill
> -------------------------------------------------------------
>
>                 Key: PIG-3466
>                 URL: https://issues.apache.org/jira/browse/PIG-3466
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.11.1
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.12
>
>         Attachments: PIG-3466-1.patch
>
>
> I have several jobs that use the following pattern:
> {code}
> b = group a by x;
> c = foreach b {
>             dist_y = DISTINCT a.y;
>             generate
>             group,
>             COUNT(dist_y) as y_cnt;
> };
> {code}
> These jobs fail intermittently during proactive spill when the data set is large:
> {code}
> java.util.ConcurrentModificationException
>         at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
>         at java.util.HashMap$KeyIterator.next(HashMap.java:828)
>         at java.util.AbstractCollection.toArray(AbstractCollection.java:171)
>         at org.apache.pig.data.SortedSpillBag.proactive_spill(SortedSpillBag.java:77)
>         at org.apache.pig.data.InternalDistinctBag.spill(InternalDistinctBag.java:464)
>         at org.apache.pig.impl.util.SpillableMemoryManager.handleNotification(SpillableMemoryManager.java:274)
>         at sun.management.NotificationEmitterSupport.sendNotification(NotificationEmitterSupport.java:138)
>         at sun.management.MemoryImpl.createNotification(MemoryImpl.java:171)
>         at sun.management.MemoryPoolImpl$PoolSensor.triggerAction(MemoryPoolImpl.java:272)
>         at sun.management.Sensor.trigger(Sensor.java:120)
> {code}
> PIG-3212 fixed the same issue for *InternalSortedBag* by synchronizing access to the bag's contents, but *InternalDistinctBag* was not fixed, so the issue remains for nested DISTINCT.
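
For illustration only, here is a minimal Java sketch of the kind of synchronization described above. It is not the attached patch and not Pig's actual code: the class DistinctBagSketch and its methods add, spill and runDemo are hypothetical names, and the sorted runs are kept in memory instead of being written to disk. The assumption is that one thread keeps adding tuples to a HashSet while a second thread (standing in for the memory-notification handler) copies the set out with toArray. Without the shared lock, that copy can throw java.util.ConcurrentModificationException, as in the stack trace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a distinct bag; not Pig's InternalDistinctBag. */
class DistinctBagSketch {
    // In-memory distinct contents; every access is guarded by this object's monitor.
    private final Set<String> contents = new HashSet<>();
    // Stands in for sorted runs written to spill files on disk.
    private final List<String[]> spilledRuns = new ArrayList<>();

    /** Called by the thread that fills the bag. */
    void add(String value) {
        synchronized (contents) {
            contents.add(value);
        }
    }

    /**
     * Called by the spilling thread. Copying and clearing happen under the
     * same lock as add(), so the set cannot change while toArray iterates it.
     * Returns the number of values moved out of memory.
     */
    long spill() {
        synchronized (contents) {
            if (contents.isEmpty()) {
                return 0;
            }
            String[] run = contents.toArray(new String[0]);
            Arrays.sort(run);
            spilledRuns.add(run);
            contents.clear();
            return run.length;
        }
    }

    /** Adds n unique values while another thread spills; returns the total spilled. */
    static long runDemo(int n) {
        DistinctBagSketch bag = new DistinctBagSketch();
        AtomicLong spilled = new AtomicLong();
        AtomicBoolean done = new AtomicBoolean(false);

        Thread spiller = new Thread(() -> {
            while (!done.get()) {
                spilled.addAndGet(bag.spill());
            }
        });
        spiller.start();

        for (int i = 0; i < n; i++) {
            bag.add("y" + i);
        }
        done.set(true);
        try {
            spiller.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Final spill flushes whatever is still in memory.
        spilled.addAndGet(bag.spill());
        return spilled.get();
    }

    public static void main(String[] args) {
        System.out.println("values accounted for: " + runDemo(100000));
    }
}
```

Because every value in the demo is unique and each one ends up in exactly one run, the total returned by runDemo should equal the number of values added. Removing the synchronized blocks makes the sketch prone to the same intermittent failure reported here.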

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
