kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ismael Juma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-3933) Kafka OOM During Log Recovery Due to Leaked Native Memory
Date Thu, 07 Jul 2016 18:11:11 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15366526#comment-15366526
] 

Ismael Juma commented on KAFKA-3933:
------------------------------------

Shall I assign the JIRA to you?

> Kafka OOM During Log Recovery Due to Leaked Native Memory
> ---------------------------------------------------------
>
>                 Key: KAFKA-3933
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3933
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 0.8.2.2, 0.9.0.1, 0.10.0.0
>         Environment: Linux, latest oracle java-8
>            Reporter: Tom Crayford
>            Priority: Critical
>             Fix For: 0.10.0.1
>
>
> Hi there. We've been tracking an issue where Kafka hits an java.lang.OutOfMemoryError
during log recovery.
> After a bunch of tracking work, we've realized we've hit an instance of a long known
issue: http://www.evanjones.ca/java-native-leak-bug.html
> TLDR: Kafka breaks the rule "Always close GZIPInputStream and GZIPOutputStream since
they use native memory via zlib" from that article.
> As such, during broker startup, when you're recovering log segments that have been compressed
with gzip, Kafka leaks `GZIPInputStream` all over the place.
> Our crashes during startup have this profile - the JVM heap is empty (a few hundred MB),
but the offheap memory is full of allocations caused by `Java_java_util_zip_Deflater_init`
and `deflatInit2`.
> This leads to broker crashes during startup. The only real mitigation is having *far*
more memory than you need to boot (which I'd guess is why folk haven't noticed this in production
that much yet).
> To dig into the code more (this is based on trunk). Log recovery on unflushed segments
eventually calls `LogSegment.recover`: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogSegment.scala#L172
> On compressed segments, that leads to a call to `deepIterator`: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogSegment.scala#L189
> That leads to a call to `CompressionFactory`: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/message/ByteBufferMessageSet.scala#L95
which creates a `GZIPInputStream`: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/message/CompressionFactory.scala#L46
> That `GZIPInputStream` is never closed anywhere, and the low *heap* pressure means that
the finalizer on `GZIPInputStream` that deallocates the native buffers is never called, because
GC is never triggered. Instead, we just exhaust the offheap memory and then Kafka dies from
an OutOfMemory error.
> Kafka *does* trigger an `inputstream.close()` call, but only when *fully* reading the
whole input stream (see https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/message/ByteBufferMessageSet.scala#L156).
When it's performing log recovery, in https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogSegment.scala#L189
it doesn't read to the end of the stream, but instead reads the first offset and leaves things
alone.
> This issue likely impacts `lz4` and `snappy` compressed topics in exactly the same way.
I think (but haven't 100% verified) that it impacts all versions of Kafka that are supported
(0.8 -> 0.10).
> Fixing this seems relatively annoying, but only because of some "small matters of coding",
nothing hugely problematic.
> The main issue is that `deepIterator` only returns an `Iterator`, which doesn't have
a `close()` method of any kind. We could create a new `ClosableIterator` trait and have it
extend Java's `AutoCloseable` (https://docs.oracle.com/javase/7/docs/api/java/lang/AutoCloseable.html),
then explicitly call `close()` everywhere we use a `deepIterator()` and don't always read
to the end. Scala unfortunately doesn't seem to have a built in version of Java's `try-with-resources`
statement, but we can explicitly call close everywhere perfectly happily.
> Another (but much more hacky) solution would be to always read to the end of the iterator
in `LogSegment.recover`, but that seems pretty bad, using far more resources than is needed
during recovery.
> I can't think of any other reasonable solutions for now, but would love to hear input
from the community.
> We're happy doing the work of developing a patch, but thought we'd report the issue before
starting down that path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message