kafka-dev mailing list archives

From "Bill Bejeck (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-3973) Investigate feasibility of caching bytes vs. records
Date Tue, 26 Jul 2016 23:23:20 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394761#comment-15394761 ]

Bill Bejeck commented on KAFKA-3973:
------------------------------------

[~ijuma]  

I re-ran the tests with no instrumentation using the FALLBACK_UNSAFE enum; the results were
the same, if not slower.  The benchmark can now be run with no instrumentation.

> Investigate feasibility of caching bytes vs. records
> ----------------------------------------------------
>
>                 Key: KAFKA-3973
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3973
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: streams
>            Reporter: Eno Thereska
>            Assignee: Bill Bejeck
>             Fix For: 0.10.1.0
>
>         Attachments: CachingPerformanceBenchmarks.java, MemoryLRUCache.java
>
>
> Currently the cache stores and accounts for records, not bytes or objects. This investigation
> would measure any performance overhead that comes from storing bytes or objects.
> As an outcome we should know whether 1) we should store bytes or 2) we should store objects.

> If we store objects, the cache still needs to know their size (so that it can know whether
> the object fits in the allocated cache space, e.g., if the cache is 100MB and the object is
> 10MB, we'd have space for 10 such objects). The investigation needs to determine how to find
> the size of an object efficiently in Java.
> If we store bytes, then we are serialising an object into bytes before caching it, i.e.,
> we take a serialisation cost. The investigation needs to measure how bad this cost can be, especially
> for the case when all objects fit in cache (and thus any extra serialisation cost would show).
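
To illustrate the "store bytes" option described above, here is a minimal sketch of an LRU cache whose capacity is expressed in bytes rather than in entries. It is not the attached MemoryLRUCache.java, whose contents are not shown here. The class name ByteBoundedLruCache and all of its methods are hypothetical. The sketch assumes that values arrive already serialised as byte[], and that an entry's size is approximated as its UTF-8 key bytes plus its value bytes, ignoring JVM object overhead.

```java
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: an LRU cache bounded by total bytes rather than by entry count. */
class ByteBoundedLruCache {
    private final long maxBytes;
    private long currentBytes = 0;
    // accessOrder=true makes iteration order least-recently-used first.
    private final LinkedHashMap<String, byte[]> entries =
            new LinkedHashMap<>(16, 0.75f, true);

    ByteBoundedLruCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    // Size charged for one entry: UTF-8 key bytes plus value bytes.
    // Assumption: JVM object headers and map overhead are ignored.
    private static long sizeOf(String key, byte[] value) {
        return key.getBytes(StandardCharsets.UTF_8).length + value.length;
    }

    /** Stores an already-serialised value; returns false if it could never fit. */
    boolean put(String key, byte[] value) {
        long size = sizeOf(key, value);
        if (size > maxBytes) {
            return false; // larger than the whole cache; existing entries are untouched
        }
        // Replacing a key: release the bytes charged for the old value first.
        byte[] old = entries.remove(key);
        if (old != null) {
            currentBytes -= sizeOf(key, old);
        }
        // Evict least-recently-used entries until the new one fits.
        Iterator<Map.Entry<String, byte[]>> it = entries.entrySet().iterator();
        while (currentBytes + size > maxBytes && it.hasNext()) {
            Map.Entry<String, byte[]> eldest = it.next();
            currentBytes -= sizeOf(eldest.getKey(), eldest.getValue());
            it.remove();
        }
        entries.put(key, value);
        currentBytes += size;
        return true;
    }

    /** Returns the cached bytes, or null; a hit marks the entry as most recently used. */
    byte[] get(String key) {
        return entries.get(key);
    }

    long sizeInBytes() {
        return currentBytes;
    }

    int entryCount() {
        return entries.size();
    }
}
```

With byte[] values the size accounting is a simple length lookup, which is the appeal of this option. The trade-off is the one the description names: every put and get pays a serialisation or deserialisation cost outside the cache, even when everything fits.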



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
