db-derby-dev mailing list archives

From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-6111) OutOfMemoryError using CREATE INDEX on large tables
Date Wed, 11 Dec 2013 14:13:08 GMT

    [ https://issues.apache.org/jira/browse/DERBY-6111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13845415#comment-13845415 ]

Knut Anders Hatlen commented on DERBY-6111:

I took a look at the attached heap dump. I couldn't see any evidence of the page cache growing
beyond 1000 pages. As far as I could see, the page cache in the heap dump had 860 entries
in it.

The top contributors on the heap were:

# byte[] with 4.0MB (which sounds reasonable with close to 1000 4KB pages in the page cache)
# StoredRecordHeader with 2.8MB
# RecordId with 2.3MB

There's a one-to-one correspondence between the StoredRecordHeader instances and the RecordId instances.

We have seen StoredRecordHeader take up an unreasonable amount of space before,
when the page cache is filled with pages that have lots of small rows. Index pages often fit
that description. It happens because the page instances cache the record headers to avoid
unnecessary object allocation when accessing rows on the page. When the rows are small, a
page can hold a larger number of rows, and the number of record headers grows correspondingly.
During a CREATE INDEX operation, the number of index pages in the page cache is probably high,
which may lead to a very high number of record headers even if the page cache size is small.
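To see why small rows inflate the header count, here is a back-of-envelope estimate. The per-row and per-header byte counts below are illustrative assumptions, not Derby's actual numbers:

```java
// Rough estimate of cached record-header memory for a page cache full of
// index pages. All constants are assumptions for illustration only.
public class RecordHeaderEstimate {
    public static void main(String[] args) {
        int cachedPages = 1000;     // page cache size, in pages
        int pageSize = 4096;        // 4KB pages
        int avgIndexRowSize = 24;   // assumed bytes per small index row
        int bytesPerHeader = 40;    // assumed StoredRecordHeader + RecordId footprint

        int rowsPerPage = pageSize / avgIndexRowSize;     // ~170 rows per page
        long headers = (long) cachedPages * rowsPerPage;  // ~170,000 headers
        long headerBytes = headers * bytesPerHeader;      // ~6.8MB of headers

        System.out.println("rows/page=" + rowsPerPage
                + " headers=" + headers
                + " headerBytes=" + headerBytes);
    }
}
```

Under these assumptions the cached headers alone approach the size of the page data itself, which matches the 2.8MB + 2.3MB seen for StoredRecordHeader and RecordId in the heap dump.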

DERBY-3130 improved the situation by making the StoredRecordHeader instances slimmer, but
they still take up much space if the number of instances is high.

FWIW, the OOME doesn't seem to happen if caching of StoredRecordHeaders is disabled in BasePage.
That is, CREATE INDEX has been running for about an hour with 12MB heap, and it hasn't failed
yet. It hasn't completed either. I don't think we'd want to disable the caching of the record
headers completely, though, because of the potential performance impact.
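The trade-off can be sketched as follows; the class and method names are hypothetical and do not match BasePage's actual internals:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the caching trade-off: a page can either memoize its
// record headers (fast access, but memory grows with the row count) or
// re-parse them on every access (constant memory, extra allocations).
class PageSketch {
    private final boolean cacheHeaders;
    private final Map<Integer, int[]> headerCache = new HashMap<>();

    PageSketch(boolean cacheHeaders) {
        this.cacheHeaders = cacheHeaders;
    }

    int[] getRecordHeader(int slot) {
        if (!cacheHeaders) {
            return parseHeader(slot);  // fresh allocation on every call
        }
        // Parse once, then keep the header alive as long as the page is cached.
        return headerCache.computeIfAbsent(slot, this::parseHeader);
    }

    private int[] parseHeader(int slot) {
        // Stand-in for decoding the header bytes stored on the page.
        return new int[] { slot, slot * 24 /* assumed row offset */ };
    }

    int cachedHeaderCount() {
        return headerCache.size();
    }
}
```

With caching enabled, repeated lookups of the same slot return the same instance; with it disabled, every lookup allocates, which is the performance cost the comment above is worried about.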

> OutOfMemoryError using CREATE INDEX on large tables
> ---------------------------------------------------
>                 Key: DERBY-6111
>                 URL: https://issues.apache.org/jira/browse/DERBY-6111
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions:,,
>         Environment: Windows 7, different JREs (1.4.2_09, 1.5.0_11, 7)
>            Reporter: Johannes Stadler
>            Priority: Critical
>              Labels: CREATE, INDEX, OutOfMemory, OutOfMemoryError, derby_triage10_11
>         Attachments: createIndexOOM.zip, java_pid3236.zip
> I'm experiencing OutOfMemoryErrors when performing a simple CREATE INDEX command on tables
with more than 500,000 rows.
> The crashes occurred non-deterministically in our standard environment using 64MByte heap
space. But you can easily reproduce the error using the repro database attached, when
running it with 12MByte heap space.
> Just start ij with the -Xmx12M JVM argument, connect to the sample db and execute
> I've done some investigation and I was able to track down the error. It occurs in SortBuffer.insert(),
but not, as expected, in NodeAllocator.newNode() (there is a handler for the OOME there), but already
in the call of sortObserver.insertDuplicateKey() or .insertNonDuplicateKey() (where the data
value descriptors are cloned).
> Unfortunately this is not the place to fix it. When I made the MergeRun (which spills
the buffer to disk) happen earlier, it did not significantly lower the memory consumption.
Instead it created about 13,000 temp files of only 1KByte each (because of the many files,
performance was unacceptable).
> So I analyzed the heap (using the HeapDumpOnOutOfMemoryError option) and saw that it's not
the sort buffer that consumes most of the memory (just a few KBytes and about 6% of the memory),
but the ConcurrentCache. Even though the maxSize of the ConcurrentCache was set to 1000, the
cache contained about 2,500 elements. I've also attached the heap dump.
> If I'm understanding the concept right, the cache elements are added without regard to
the maxSize, and there's a low-priority worker thread that shrinks the cache from
time to time to 10% of its size.
> I think in this particular case, where memory is getting low, it would be a better idea
to clear the cache synchronously and provide more space to the sort buffer. Maybe that
could be done in ClockPolicy.insertEntry() in case the current size exceeds the
max size by 50%. I'm not very familiar with the code yet, so I failed to do so.
> I hope you got all the information you need, if you require any further information,
please let me know.
> Greetings
> Johannes Stadler
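The synchronous-eviction idea from the description above could look roughly like this. A plain FIFO queue stands in for Derby's actual clock replacement policy, and the class and method names are hypothetical:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the reporter's proposal: evict synchronously inside
// insertEntry() once the cache exceeds maxSize by 50%, instead of
// waiting for a low-priority background cleaner to catch up.
class BoundedCacheSketch<K> {
    private final int maxSize;
    private final Deque<K> entries = new ArrayDeque<>();

    BoundedCacheSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    void insertEntry(K key) {
        entries.addLast(key);
        if (entries.size() > maxSize * 3 / 2) {   // 50% over budget
            while (entries.size() > maxSize) {
                entries.removeFirst();            // synchronous eviction
            }
        }
    }

    int size() {
        return entries.size();
    }
}
```

With this scheme the cache can still overshoot maxSize temporarily, but it is hard-capped at 1.5x maxSize instead of growing to 2.5x as observed in the heap dump; the cost is that the inserting thread occasionally pays for the eviction work itself.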

This message was sent by Atlassian JIRA
