hadoop-pig-dev mailing list archives

From "Thejas M Nair (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1519) Stop relying on finalize() to delete files, close filehandles in bag implementations
Date Tue, 27 Jul 2010 00:11:16 GMT

    [ https://issues.apache.org/jira/browse/PIG-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892551#action_12892551 ]

Thejas M Nair commented on PIG-1519:
------------------------------------

As part of these changes, we should consider keeping a (weak?) reference in each bag to
every iterator it has created, and calling clear() (a new method in the iterator
implementation class) on each of them to close the DataInputStreams and invalidate the
iterators.
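
A minimal sketch of that idea follows. It is not Pig code: TrackedBag, BagIterator and
isValid() are hypothetical names, and a byte array stands in for the spill file so the
example runs on its own. The bag remembers each iterator through a WeakReference, and
clear() closes the stream of every iterator that is still reachable.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a spilled bag; this is not Pig's DefaultAbstractBag.
class TrackedBag {
    private final byte[] spilled; // stands in for a spill file on disk
    private final List<WeakReference<BagIterator>> iterators = new ArrayList<>();

    TrackedBag(byte[] spilled) {
        this.spilled = spilled;
    }

    // Hands out an iterator and remembers it weakly, so tracking alone
    // never keeps an abandoned iterator alive.
    synchronized BagIterator iterator() {
        BagIterator it = new BagIterator(new ByteArrayInputStream(spilled));
        iterators.add(new WeakReference<>(it));
        return it;
    }

    // Closes every iterator that is still reachable, then forgets them all.
    synchronized void clear() {
        for (WeakReference<BagIterator> ref : iterators) {
            BagIterator it = ref.get(); // null if already garbage collected
            if (it != null) {
                it.clear();
            }
        }
        iterators.clear();
    }
}

// Hypothetical iterator over 4-byte integers read through a DataInputStream.
class BagIterator implements Iterator<Integer> {
    private DataInputStream in; // null once the iterator has been invalidated

    BagIterator(InputStream source) {
        this.in = new DataInputStream(source);
    }

    boolean isValid() {
        return in != null;
    }

    @Override
    public boolean hasNext() {
        try {
            // available() is exact for the in-memory stream used in this sketch.
            return in != null && in.available() > 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public Integer next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        try {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Closes the stream and invalidates the iterator; safe to call twice.
    void clear() {
        if (in == null) {
            return;
        }
        try {
            in.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            in = null;
        }
    }
}
```

One limitation of this sketch: an iterator that is garbage collected before the bag's
clear() runs is never closed by the bag, so the weak reference alone does not replace
closing the stream when the iterator is exhausted.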


> Stop relying on finalize() to delete files, close filehandles in bag implementations
> ------------------------------------------------------------------------------------
>
>                 Key: PIG-1519
>                 URL: https://issues.apache.org/jira/browse/PIG-1519
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.8.0
>            Reporter: Thejas M Nair
>            Priority: Minor
>
> In DefaultAbstractBag and its subclasses, the files used for spilling to disk are deleted
> using finalize().
> The iterators associated with these bags use DataInputStreams but never call close() on
> them, so the underlying FileInputStream.close() is called only through
> FileInputStream.finalize().
> The use of finalize() has performance implications and also makes it hard to predict when
> the resources will be freed.
> WeakReferences can be used to avoid the use of finalize(). See
> http://java.sun.com/developer/technicalArticles/javase/finalization/
> (look for "An Alternative to Finalization").
> I have marked the priority as minor because objects with finalize() are allocated for
> these resources only for large bags that spill to disk (see related JIRA PIG-1516), so
> the performance impact of using finalize() is not likely to be significant. Also, I
> haven't come across any case where we have run out of these resources because the
> finalizer thread has not freed them yet.
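
For illustration, the WeakReference approach mentioned in the description could look
roughly like the sketch below. This is an assumption about one possible shape, not the
change proposed for Pig: SpillFileCleaner, SpillRef, register() and drain() are all
hypothetical names. Each spill file is tied to its owning object through a WeakReference
registered with a ReferenceQueue; once the owner has been collected, the reference shows
up on the queue and the file can be deleted without any finalize() method.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper; none of these names come from Pig.
class SpillFileCleaner {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Strong references to the trackers themselves, so they stay alive
    // until the garbage collector has enqueued them.
    private final Set<SpillRef> pending = ConcurrentHashMap.newKeySet();

    // Weak reference to the owner (for example a bag) plus the file to delete.
    static final class SpillRef extends WeakReference<Object> {
        final Path file;

        SpillRef(Object owner, Path file, ReferenceQueue<Object> queue) {
            super(owner, queue);
            this.file = file;
        }
    }

    // Associates a spill file with the object whose lifetime it follows.
    SpillRef register(Object owner, Path file) {
        SpillRef ref = new SpillRef(owner, file, queue);
        pending.add(ref);
        return ref;
    }

    // Deletes the files of all owners found on the queue and returns how
    // many files were actually removed. Call it periodically, or from a
    // dedicated thread that blocks on queue.remove() instead of poll().
    int drain() {
        int deleted = 0;
        Reference<?> r;
        while ((r = queue.poll()) != null) {
            SpillRef ref = (SpillRef) r;
            pending.remove(ref);
            try {
                if (Files.deleteIfExists(ref.file)) {
                    deleted++;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return deleted;
    }
}
```

The cleanup still depends on when the garbage collector notices that the owner is
unreachable, so an explicit close or delete call remains the more predictable option
where the code can make one.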

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

