groovy-users mailing list archives

From Jacopo Cappellato <jacopo.cappell...@gmail.com>
Subject Re: Groovy Script Memory Management Anti-patterns
Date Tue, 22 Nov 2016 10:47:10 GMT
Hi Daniel,

I think that the problem you are facing is that when your code calls:

List datalist = sourceDB.rows(...)

all the records in the resultset are retrieved by Groovy and added to the
datalist List (implemented by an ArrayList) before the call returns.
If the resultset is large (e.g. 2GB), the JVM heap may not be big enough
to hold the whole ArrayList, and you get the OutOfMemoryError.
A simple solution may be to replace the call to rows(...) with a call to
eachRow:

sourceDB.eachRow('select * from ...') { row ->
    // add your data manipulation here and insert to the other db
}

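With the above code Groovy hands you one row at a time instead of building
the whole list in memory (how many rows the JDBC driver buffers behind the
scenes depends on the driver and its fetch size), so you will be able to
process a large resultset.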
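
To make that a bit more concrete, here is a rough sketch of how the copy
loop might look when eachRow on the source is combined with a batched insert
on the target. It is only an illustration: copyRows, source_table,
target_table, the id and payload columns, the batch size and the fetch size
are all made-up names and values, and targetDB is assumed to be a second,
already-open groovy.sql.Sql instance.

```groovy
import groovy.sql.Sql

// Hypothetical helper: sourceDB and targetDB are already-open Sql instances.
// Table and column names below are placeholders, not taken from your script.
void copyRows(Sql sourceDB, Sql targetDB) {
    // Ask the driver to fetch rows in chunks (a hint; drivers may ignore it)
    sourceDB.withStatement { stmt -> stmt.fetchSize = 1000 }

    // Send the inserts to the target in batches of 500 statements
    targetDB.withBatch(500, 'insert into target_table (id, payload) values (?, ?)') { ps ->
        sourceDB.eachRow('select id, payload from source_table') { row ->
            // Placeholder manipulation: copy the values out of the row right away,
            // since the row is only valid inside this closure
            def payload = row.payload?.toString()?.trim()
            ps.addBatch([row.id, payload])
        }
    }
}
```

The point of the sketch is that nothing accumulates across rows: each row is
transformed, queued in the current batch and then dropped, so heap usage
should stay roughly flat no matter how long the script runs.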

Regards,

Jacopo

On Mon, Nov 21, 2016 at 9:52 PM, Daniel Price <danprice303@gmail.com> wrote:

> Good afternoon, all.  I've a Groovy script with what might be a code
> caused memory leak, but I can't find the cause.  Basically, I'm using a
> script to take 2 GB chunks of data from a SQL Server DB, manipulate it,
> then insert it into a different SQL Server DB.  I've a few scripts that do
> this using Groovy SQL, and the others work well, but this one always ends
> in a Java heap OOME.  The difference, I think, is that this script is
> migrating a lot more data (TBs), so it runs much longer.
>
> I've done heap dump analysis, but I'm not ready to get into the details of
> that yet.  Seems the script might be holding onto every invocation of:
>
> List datalist = sourceDB.rows("${Sql.expand getDataQuery}")
>
> even though I've been careful to use static variables as well as dataList
> = null...
>
> I'm starting to think I'm missing something fundamental, rather than just
> a coding error.
>
> Any advice appreciated...
>
> D
>
