kafka-dev mailing list archives

From "Shuai Lin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-5010) Log cleaner crashed with BufferOverflowException when writing to the writeBuffer
Date Thu, 06 Apr 2017 00:26:41 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958095#comment-15958095 ]

Shuai Lin commented on KAFKA-5010:
----------------------------------

For now I can think of a quick fix that may help: always keep the capacity of the write buffer at twice that of the read buffer, as I did in [this commit|https://github.com/scrapinghub/kafka/commit/66b0315681b1cbefae941ba68face7fc7f7baa78]. It doesn't fix the root cause, but I think it can work around the write buffer overflow exception for now.
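
A minimal sketch of that idea (illustrative only, not the linked commit itself; the class and method names below are hypothetical, not Kafka's actual LogCleaner internals):

{code}
import java.nio.ByteBuffer

// Sketch: keep the write buffer at twice the capacity of the read buffer so
// that filtered records which re-encode slightly larger than their source
// still fit. Names here are made up for illustration.
class CleanerBuffers(initialReadSize: Int) {
  var readBuffer: ByteBuffer = ByteBuffer.allocate(initialReadSize)
  var writeBuffer: ByteBuffer = ByteBuffer.allocate(2 * initialReadSize)

  // When the read buffer has to grow (e.g. to hold one oversized message),
  // preserve the 2x invariant for the write buffer.
  def growBuffers(maxMessageSize: Int): Unit = {
    val newReadSize = math.max(readBuffer.capacity * 2, maxMessageSize)
    readBuffer = ByteBuffer.allocate(newReadSize)
    writeBuffer = ByteBuffer.allocate(2 * newReadSize)
  }
}
{code}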

> Log cleaner crashed with BufferOverflowException when writing to the writeBuffer
> --------------------------------------------------------------------------------
>
>                 Key: KAFKA-5010
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5010
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 0.10.2.0
>            Reporter: Shuai Lin
>            Priority: Critical
>              Labels: reliability
>             Fix For: 0.11.0.0
>
>
> After upgrading from 0.10.0.1 to 0.10.2.0, the log cleaner thread crashed with a BufferOverflowException when writing the filtered records into the writeBuffer:
> {code}
> [2017-03-24 10:41:03,926] INFO [kafka-log-cleaner-thread-0], Starting  (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,177] INFO Cleaner 0: Beginning cleaning of log app-topic-20170317-20. (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,177] INFO Cleaner 0: Building offset map for app-topic-20170317-20... (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,387] INFO Cleaner 0: Building offset map for log app-topic-20170317-20 for 1 segments in offset range [9737795, 9887707). (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,101] INFO Cleaner 0: Offset map for log app-topic-20170317-20 complete. (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,106] INFO Cleaner 0: Cleaning log app-topic-20170317-20 (cleaning prior to Fri Mar 24 10:36:06 GMT 2017, discarding tombstones prior to Thu Mar 23 10:18:02 GMT 2017)... (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,110] INFO Cleaner 0: Cleaning segment 0 in log app-topic-20170317-20 (largest timestamp Fri Mar 24 09:58:25 GMT 2017) into 0, retaining deletes. (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,372] ERROR [kafka-log-cleaner-thread-0], Error due to  (kafka.log.LogCleaner)
> java.nio.BufferOverflowException
>         at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:206)
>         at org.apache.kafka.common.record.LogEntry.writeTo(LogEntry.java:98)
>         at org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:158)
>         at org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:111)
>         at kafka.log.Cleaner.cleanInto(LogCleaner.scala:468)
>         at kafka.log.Cleaner.$anonfun$cleanSegments$1(LogCleaner.scala:405)
>         at kafka.log.Cleaner.$anonfun$cleanSegments$1$adapted(LogCleaner.scala:401)
>         at scala.collection.immutable.List.foreach(List.scala:378)
>         at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401)
>         at kafka.log.Cleaner.$anonfun$clean$6(LogCleaner.scala:363)
>         at kafka.log.Cleaner.$anonfun$clean$6$adapted(LogCleaner.scala:362)
>         at scala.collection.immutable.List.foreach(List.scala:378)
>         at kafka.log.Cleaner.clean(LogCleaner.scala:362)
>         at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241)
>         at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> [2017-03-24 10:41:07,375] INFO [kafka-log-cleaner-thread-0], Stopped  (kafka.log.LogCleaner)
> {code}
> I tried different values of log.cleaner.io.buffer.size, from 512K to 2M to 10M to 128M, all with no luck: the log cleaner thread crashed immediately after the broker was restarted. But setting it to 256MB fixed the problem!
> Here are the settings for the cluster:
> {code}
> - log.message.format.version = 0.9.0.0 (we use the 0.9 format because we have old consumers)
> - log.cleaner.enable = 'true'
> - log.cleaner.min.cleanable.ratio = '0.1'
> - log.cleaner.threads = '1'
> - log.cleaner.io.buffer.load.factor = '0.98'
> - log.roll.hours = '24'
> - log.cleaner.dedupe.buffer.size = 2GB 
> - log.segment.bytes = 256MB (global is 512MB, but we have been using 256MB for this topic)
> - message.max.bytes = 10MB
> {code}
> Given that the readBuffer and the writeBuffer are exactly the same size (each half of log.cleaner.io.buffer.size), why would the cleaner throw a BufferOverflowException when writing the filtered records into the writeBuffer? IIUC that should never happen, because the size of the filtered records should be no greater than the size of the readBuffer, and thus no greater than the size of the writeBuffer.
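
For reference, a small sketch of the buffer-split reasoning above, assuming the cleaner divides log.cleaner.io.buffer.size evenly between the read and write buffers as described (illustrative only, not the actual LogCleaner code):

{code}
import java.nio.ByteBuffer

object BufferSplitSketch {
  def main(args: Array[String]): Unit = {
    // log.cleaner.io.buffer.size, e.g. 1 MB; each side gets half.
    val ioBufferSize = 1024 * 1024
    val readBuffer  = ByteBuffer.allocate(ioBufferSize / 2)
    val writeBuffer = ByteBuffer.allocate(ioBufferSize / 2)

    // If retained records were copied byte-for-byte, the filtered output could
    // never exceed what was read, so the write buffer (same capacity) could
    // never overflow. The exception above implies the retained records were
    // re-encoded into more bytes than were read.
    assert(writeBuffer.capacity == readBuffer.capacity)
    println(s"readBuffer=${readBuffer.capacity} writeBuffer=${writeBuffer.capacity}")
  }
}
{code}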



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
