cassandra-commits mailing list archives

From "Robert Coli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5348) Remove on-heap row cache
Date Mon, 14 Oct 2013 23:36:43 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794634#comment-13794634 ]

Robert Coli commented on CASSANDRA-5348:
----------------------------------------

I understand and agree with the idea of removing the Row Cache as likely to be hazardous to
most users.

I do not, however, understand removing the on-heap Row Cache and keeping the off-heap one.

Problems with on-heap cache:

1) if you make it too big, you consume too much heap

Problems with off-heap cache:

1) still consumes heap despite being off-heap, including marginal heap on each read/write
2) serialize-deserialize penalty on read/write
3) invalidates on write
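
To make the trade-offs listed above concrete, here is a toy sketch in Java. It is not Cassandra's actual cache code: the class names (RowCacheSketch, OnHeapStyleCache, OffHeapStyleCache) and the modelling of a row as a plain String are assumptions made purely for illustration. It shows an on-heap style cache that holds live objects and is updated on write, next to an off-heap style cache that holds serialized bytes in direct buffers, pays a decode on every read, and drops the entry on write. Note that even the off-heap variant keeps its key map and buffer handles on the heap.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class RowCacheSketch {

    /** Hypothetical on-heap style cache: stores live objects. */
    static class OnHeapStyleCache {
        private final Map<String, String> rows = new HashMap<>();

        void load(String key, String row) {
            rows.put(key, row);
        }

        // A write updates the cached entry if present, so it stays warm.
        void onWrite(String key, String newRow) {
            if (rows.containsKey(key)) {
                rows.put(key, newRow);
            }
        }

        // A read hands back the cached object directly; nothing to decode.
        String read(String key) {
            return rows.get(key);
        }
    }

    /** Hypothetical off-heap style cache: stores serialized bytes in direct buffers. */
    static class OffHeapStyleCache {
        // The map, the keys, and the ByteBuffer handles still live on the heap.
        private final Map<String, ByteBuffer> rows = new HashMap<>();
        int deserializations = 0;

        // Serialize the row and copy it into off-heap memory.
        void load(String key, String row) {
            byte[] bytes = row.getBytes(StandardCharsets.UTF_8);
            ByteBuffer buf = ByteBuffer.allocateDirect(bytes.length);
            buf.put(bytes);
            buf.flip();
            rows.put(key, buf);
        }

        // A write simply invalidates the entry; the next read is a miss.
        void onWrite(String key, String newRow) {
            rows.remove(key);
        }

        // Every hit pays a deserialization back onto the heap.
        String read(String key) {
            ByteBuffer buf = rows.get(key);
            if (buf == null) {
                return null;
            }
            deserializations++;
            return StandardCharsets.UTF_8.decode(buf.duplicate()).toString();
        }
    }

    public static void main(String[] args) {
        OnHeapStyleCache onHeap = new OnHeapStyleCache();
        OffHeapStyleCache offHeap = new OffHeapStyleCache();

        onHeap.load("k", "v1");
        offHeap.load("k", "v1");

        System.out.println("on-heap read:  " + onHeap.read("k"));
        System.out.println("off-heap read: " + offHeap.read("k"));

        onHeap.onWrite("k", "v2");
        offHeap.onWrite("k", "v2");

        System.out.println("on-heap after write:  " + onHeap.read("k"));
        System.out.println("off-heap after write: " + offHeap.read("k"));
        System.out.println("off-heap deserializations: " + offHeap.deserializations);
    }
}
```

In this sketch the on-heap style cache still serves the key after the write, while the off-heap style cache misses until the row is loaded again. The only cost the on-heap variant models is the one named above: everything it holds counts against the heap.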

Other than the fact that the on-heap is more likely to cause you problems by running out of
heap if it is too large, it seems on its face to be a better implementation of the row cache
concept than the off-heap row cache. If we already accept that the Row Cache is for use by
people who know what they are doing... aren't those users likely to actually prefer the on-heap
cache, especially in 2.0 where heap pressure is the least severe it has ever been? Is there
something I'm missing about what makes the on-heap cache so bad?

tl;dr: I +1 Sylvain's comments above, but with some questions re on-heap vs. off-heap.

> Remove on-heap row cache
> ------------------------
>
>                 Key: CASSANDRA-5348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5348
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 2.0 beta 1
>
>         Attachments: 5348.txt
>
>
> The row (partition) cache easily does more harm than good.  People expect it to act like
> a query cache but it is very different than that, especially for the wide partitions that
> are so common in Cassandra data models.
> Making it off-heap by default only helped a little; we still have to deserialize the
> partition to the heap to query it.
> Ultimately we can add a better cache based on the ideas in CASSANDRA-1956 or CASSANDRA-2864,
> but even if we don't get to that until 2.1, removing the old row cache for 2.0 is a good idea.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
