kafka-jira mailing list archives

From "Igor Calabria (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-6603) Kafka streams off heap memory usage does not match expected values from configuration
Date Fri, 02 Mar 2018 13:38:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-6603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383572#comment-16383572 ]

Igor Calabria commented on KAFKA-6603:

Hey, thanks for the quick reply. I just tested your code in production and it didn't reduce
memory usage for simple aggregations. What really helped was your pull request: I had some
code that used the same RocksDB iterator and replaced it with something similar to what you
did in the aggregation code. The improvement was significant (especially for throughput).
I'm attributing the extra memory usage to this inefficient iterator, but I still need to do
more testing.

> Kafka streams off heap memory usage does not match expected values from configuration
> -------------------------------------------------------------------------------------
>                 Key: KAFKA-6603
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6603
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 1.0.0
>            Reporter: Igor Calabria
>            Priority: Minor
> Hi, I have a simple aggregation pipeline that's backed by the default state store (RocksDB).
> The pipeline works fine, except that off-heap memory usage is way higher than expected.
> Following the [documentation|https://docs.confluent.io/current/streams/developer-guide/config-streams.html#streams-developer-guide-rocksdb-config]
> has some effect (memory usage is reduced), but the values don't match at all.
> The Java process is set to run with just `-Xmx300m -Xms300m`, and the RocksDB config looks
> like this:
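> {code:java}
> // Assumed context: the body of RocksDBConfigSetter#setConfig(storeName, options, configs)
> BlockBasedTableConfig tableConfig = new BlockBasedTableConfig();
> tableConfig.setCacheIndexAndFilterBlocks(true);
> tableConfig.setBlockCacheSize(1048576); // 1MB
> tableConfig.setBlockSize(16 * 1024); // 16KB
> options.setTableFormatConfig(tableConfig);
> options.setMaxWriteBufferNumber(2);
> options.setWriteBufferSize(8 * 1024); // 8KB
> {code}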
> To estimate memory usage, I'm using this formula:
> {noformat}
> (block_cache_size + write_buffer_size * write_buffer_number) * segments * partitions
> {noformat}
> Since my topic has 25 partitions with 3 segments each (it's a windowed store), off-heap
> memory usage should be about 76MB. What I'm seeing in production is upwards of 300MB. Even
> taking into consideration the extra overhead from RocksDB compaction threads, this seems a bit
> high (especially when the disk usage for all files is just 1GB).
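
For reference, the 76MB figure can be reproduced by plugging the configured values into the reporter's formula. The sketch below is only an illustration of that arithmetic: the class and method names (`OffHeapEstimate`, `estimateBytes`) are hypothetical and not part of Kafka or RocksDB, and the formula is the reporter's own estimate rather than an exact accounting of RocksDB memory.

```java
// Sketch: evaluates the reporter's estimate formula with the settings quoted above.
// OffHeapEstimate and estimateBytes are illustrative names, not a Kafka or RocksDB API.
final class OffHeapEstimate {

    // (block_cache_size + write_buffer_size * write_buffer_number) * segments * partitions
    static long estimateBytes(long blockCacheBytes, long writeBufferBytes,
                              int writeBufferNumber, int segments, int partitions) {
        long perStoreBytes = blockCacheBytes + writeBufferBytes * writeBufferNumber;
        return perStoreBytes * segments * partitions;
    }

    public static void main(String[] args) {
        // 1MB block cache, 8KB write buffers (2 of them), 3 segments, 25 partitions
        long bytes = estimateBytes(1_048_576L, 8 * 1024L, 2, 3, 25);
        System.out.println("estimated off-heap bytes: " + bytes);
        System.out.println("estimated off-heap MiB:   " + bytes / (1024.0 * 1024.0));
    }
}
```

With these inputs the estimate comes to 79,872,000 bytes, or roughly 76 MiB, which matches the expected value stated in the report and is well below the 300MB observed.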

This message was sent by Atlassian JIRA
