spark-dev mailing list archives

From Josh Rosen <rosenvi...@gmail.com>
Subject Re: Fwd: Accumulator question
Date Thu, 09 Oct 2014 21:26:49 GMT
Hi Nathan,

You’re right, it looks like we don’t currently provide a method to unregister accumulators.
I’ve opened a JIRA to discuss a fix: https://issues.apache.org/jira/browse/SPARK-3885

In the meantime, here’s a possible workaround: accumulators have a public setValue()
method that can be called (only by the driver) to change an accumulator’s value. You might
be able to use this to reset each accumulator’s value to a smaller object once you are done
with it (e.g. the “zero” object of whatever your accumulator type is, or ‘null’ if you’re
sure that the accumulator will never be accessed again).

Hope this helps,
Josh

On October 8, 2014 at 2:54:33 PM, Nathan Kronenfeld (nkronenfeld@oculusinfo.com) wrote:

I notice that accumulators register themselves with a private Accumulators  
object.  

I don't notice any way to unregister them when one is done.  

Am I missing something? If not, is there any plan for how to free up that  
memory?  

I've a case where we're gathering data from repeated queries using some  
relatively sizable accumulators; at the moment, we're creating one per  
query, and running out of memory after far too few queries.  

I've tried methods that don't involve accumulators; they involve a shuffle  
instead, and take 10x as long.  

Thanks,  
-Nathan  




--  
Nathan Kronenfeld  
Senior Visualization Developer  
Oculus Info Inc  
2 Berkeley Street, Suite 600,  
Toronto, Ontario M5A 4J5  
Phone: +1-416-203-3003 x 238  
Email: nkronenfeld@oculusinfo.com  
