hadoop-pig-dev mailing list archives

From "Sam Pullara (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-40) Memory management in BigDataBag is probably wrong
Date Mon, 03 Dec 2007 17:47:43 GMT

    [ https://issues.apache.org/jira/browse/PIG-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547917 ]

Sam Pullara commented on PIG-40:

The particular notification that I use in my example is evaluated right after a garbage collection
occurs, so it should be very accurate.
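
For readers without the attachment, the post-collection notification in java.lang.management can be sketched roughly as follows. This is not the attached MemoryUsage.java; the class name CollectionThresholdListener, the methods arm and listen, and the choice of a fraction of each pool's maximum are assumptions made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

// Hypothetical helper, not part of Pig or of the attached files.
final class CollectionThresholdListener {

    /**
     * Sets a collection usage threshold (checked by the VM right after a GC)
     * at the given fraction of the maximum size of every heap pool that
     * supports it. Returns the number of pools that were armed.
     */
    static int arm(double fraction) {
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1]");
        }
        int armed = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP
                    || !pool.isCollectionUsageThresholdSupported()) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null || usage.getMax() <= 0) {
                continue; // pool is invalid or its maximum is undefined
            }
            pool.setCollectionUsageThreshold((long) (usage.getMax() * fraction));
            armed++;
        }
        return armed;
    }

    /** Runs the callback whenever a pool is still over its threshold after a GC. */
    static void listen(final Runnable onLowMemory) {
        // The platform MemoryMXBean is documented to be a NotificationEmitter.
        NotificationEmitter emitter =
                (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener(new NotificationListener() {
            public void handleNotification(Notification n, Object handback) {
                if (MemoryNotificationInfo.MEMORY_COLLECTION_THRESHOLD_EXCEEDED
                        .equals(n.getType())) {
                    onLowMemory.run();
                }
            }
        }, null, null);
    }
}
```

A caller would typically invoke arm(0.8) once at startup and pass listen a callback that spills data to disk; the 0.8 figure is only an example value.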

1) If you want to keep the old code as a fallback for a VM that doesn't support this,
that seems fine to me.  That said, it worked on Mac (Sun & soylatte) and Linux (Sun &
Jrockit).  I haven't tested it on Windows, but I can't imagine Sun failed to implement it there.
2) Rechecking periodically doesn't seem that bad to me.  In fact, there is a polling version
of this in the javadocs: https://java.sun.com/j2se/1.5.0/docs/api/java/lang/management/MemoryPoolMXBean.html
though I would always use the 'Collected' thresholds so you don't get false positives.
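
A minimal sketch of the polling approach from point 2, using the 'Collected' (collection usage) threshold rather than the plain usage threshold. The class name CollectedThresholdPoller and its two methods are hypothetical and exist only for this illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of Pig or of the attached files.
final class CollectedThresholdPoller {

    /**
     * Sets the collection usage threshold of every supporting heap pool to
     * the given fraction of its maximum. A threshold of 0 disables checking,
     * which is also what pools with an undefined maximum receive.
     */
    static void setThresholds(double fraction) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP
                    || !pool.isCollectionUsageThresholdSupported()) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            long max = (usage == null) ? -1L : usage.getMax();
            long threshold = (max > 0) ? (long) (max * fraction) : 0L;
            pool.setCollectionUsageThreshold(threshold);
        }
    }

    /**
     * Returns true if any heap pool with an enabled threshold was still over
     * it after the most recent collection. Cheap enough to call periodically.
     */
    static boolean lowMemoryAfterGc() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP
                    && pool.isCollectionUsageThresholdSupported()
                    && pool.getCollectionUsageThreshold() > 0
                    && pool.isCollectionUsageThresholdExceeded()) {
                return true;
            }
        }
        return false;
    }
}
```

Because the check reflects usage measured after a collection, garbage that has not yet been reclaimed does not trigger it, which is the false positive that the plain usage threshold can produce.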

I was actually pleasantly surprised to find such great support for doing this stuff in the
VM -- somehow I had missed it in the release notes or at least forgot to check it out.

> Memory management in BigDataBag is probably wrong
> -------------------------------------------------
>                 Key: PIG-40
>                 URL: https://issues.apache.org/jira/browse/PIG-40
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Sam Pullara
>         Attachments: BigDataBag.java, MemoryUsage.java
> src/org/apache/pig/data/BigDataBag.java
> 1) You should not use finalizers for anything other than external resources -- using them
> here is very dangerous: it could inadvertently lead to deadlocks and object resurrection, and
> it decreases performance without any advantage.
> 2) Using .freeMemory() the way it is used in this class is broken.  freeMemory() is going
> to return a mostly random number between 0 and the real amount.  Adding gc() in here is a
> terrible performance burden.  If you really want to do something like this, you should be using
> soft references and finalization queues.
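
The soft-reference suggestion in point 2 might look roughly like the sketch below. It assumes that "finalization queues" refers to java.lang.ref.ReferenceQueue and that the cached data can be reloaded from elsewhere (for example, from disk) once the VM clears it; the class SoftChunkCache is hypothetical and is not taken from BigDataBag.java.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache: the VM may clear entries under memory pressure,
// so callers must be able to reload a chunk when get() returns null.
final class SoftChunkCache {

    /** A soft reference that remembers its key so the map entry can be removed. */
    private static final class KeyedRef extends SoftReference<byte[]> {
        final String key;

        KeyedRef(String key, byte[] value, ReferenceQueue<byte[]> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<String, KeyedRef> refs = new HashMap<String, KeyedRef>();
    private final ReferenceQueue<byte[]> queue = new ReferenceQueue<byte[]>();

    void put(String key, byte[] chunk) {
        expunge();
        refs.put(key, new KeyedRef(key, chunk, queue));
    }

    /** Returns the chunk, or null if it was never stored or has been cleared. */
    byte[] get(String key) {
        expunge();
        KeyedRef ref = refs.get(key);
        return (ref == null) ? null : ref.get();
    }

    int size() {
        expunge();
        return refs.size();
    }

    /** Drops map entries whose referents the garbage collector has cleared. */
    private void expunge() {
        Reference<? extends byte[]> cleared;
        while ((cleared = queue.poll()) != null) {
            KeyedRef ref = (KeyedRef) cleared;
            // Remove only if the map still holds this exact reference,
            // so a newer value stored under the same key is not lost.
            if (refs.get(ref.key) == ref) {
                refs.remove(ref.key);
            }
        }
    }
}
```

No finalizer and no explicit gc() call is involved: the VM decides when to clear soft references, and the queue is drained lazily on each access.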

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
