kafka-dev mailing list archives

From "Dustin Cote (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-4169) Calculation of message size is too conservative for compressed messages
Date Wed, 14 Sep 2016 13:04:20 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15490387#comment-15490387 ]

Dustin Cote commented on KAFKA-4169:

Ah, good point.  The user reporting this issue suggested pushing the check into the RecordAccumulator.  That might be a better option.
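For illustration only, the difference between checking before and after compression can be sketched in a few lines of plain Python (this is not Kafka's actual RecordAccumulator code; the gzip stand-in and the helper names are hypothetical, and 1048576 is the producer's default {{max.request.size}}):

```python
import gzip

# Producer default for max.request.size (1 MiB).
MAX_REQUEST_SIZE = 1_048_576

def rejected_before_compression(payload: bytes) -> bool:
    # Current behaviour: the serialized, uncompressed size is checked.
    return len(payload) > MAX_REQUEST_SIZE

def rejected_after_compression(payload: bytes) -> bool:
    # Sketched alternative: check the size that would actually go on the wire.
    return len(gzip.compress(payload)) > MAX_REQUEST_SIZE

# A payload just over the limit uncompressed, but tiny once gzipped.
msg = b"\x00" * (MAX_REQUEST_SIZE + 64)
print(rejected_before_compression(msg))  # True
print(rejected_after_compression(msg))   # False
```

A check that runs after batching and compression would accept this message, since only the compressed bytes are sent to the broker.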

> Calculation of message size is too conservative for compressed messages
> -----------------------------------------------------------------------
>                 Key: KAFKA-4169
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4169
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer 
>    Affects Versions:
>            Reporter: Dustin Cote
> Currently the producer uses the uncompressed message size to check against {{max.request.size}} even if a {{compression.type}} is defined.  This can be reproduced as follows:
> {code}
> # dd if=/dev/zero of=/tmp/out.dat bs=1024 count=1024
> # cat /tmp/out.dat | bin/kafka-console-producer --broker-list localhost:9092 --topic tester --producer-property compression.type=gzip
> {code}
> The above code creates a file that is the same size as the default for {{max.request.size}}, and the added overhead of the message pushes the uncompressed size over the limit.  Compressing the message ahead of time allows the message to go through.  When the message is blocked, the following exception is produced:
> {code}
> [2016-09-14 08:56:19,558] ERROR Error when sending message to topic tester with key: null, value: 1048576 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
> org.apache.kafka.common.errors.RecordTooLargeException: The message is 1048610 bytes when serialized which is larger than the maximum request size you have configured with the max.request.size configuration.
> {code}
> For completeness, I have confirmed by enabling DEBUG logging that the console producer is setting {{compression.type}} properly, so this appears to be a problem in the size estimate of the message itself.  I would suggest we compress before we serialize, instead of the other way around, to avoid this.
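The compressibility claim in the report can be checked independently of Kafka: a zero-filled buffer the size of {{max.request.size}} gzips down to roughly a kilobyte, which is why rejecting on the uncompressed size blocks messages that would be tiny on the wire. A minimal sketch in plain Python (1048576 being the producer's default {{max.request.size}}):

```python
import gzip

MAX_REQUEST_SIZE = 1_048_576  # producer default for max.request.size

# Same payload the dd command above produces: 1 MiB of zero bytes.
payload = b"\x00" * MAX_REQUEST_SIZE
compressed = gzip.compress(payload)

print(len(payload))     # 1048576: at the limit before any record overhead
print(len(compressed))  # a few KB at most: far under the limit once gzipped
```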

