cassandra-commits mailing list archives

From "Vijay (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-5357) Query cache
Date Mon, 23 Sep 2013 04:50:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774291#comment-13774291 ]

Vijay edited comment on CASSANDRA-5357 at 9/23/13 4:48 AM:
-----------------------------------------------------------

Hi Jonathan, I have pushed a version with a sentinel (I might have made it a little hacky, but
it works): https://github.com/Vijay2win/cassandra/commits/query_cache_v2

{quote}
Serializing the entire QueryCacheValue for each lookup is going to kill performance on hot
partitions.
{quote}
It is required because we need to know which query populated the cache. For example, a named
query for columns A and Z can be followed by a slice query from A to Z, and we would not
return the right response, since B to Y is not in the cache.
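
To make the A, Z example concrete, here is a minimal sketch of a cache entry that remembers the filter that populated it. All of the names here (CoverageSketch, CachedFilter, NamesFilter, SliceFilter) are hypothetical and are not taken from the query_cache_v2 branch; column names are plain strings and slices are assumed to have no count limit.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: these types are illustrative and do not exist in Cassandra.
final class CoverageSketch {
    /** What a cache entry knows about the query that populated it. */
    interface CachedFilter {
        boolean coversNames(Set<String> requested);
        boolean coversSlice(String start, String end);
    }

    /** Entry populated by a named-column query such as (A, Z). */
    static final class NamesFilter implements CachedFilter {
        private final Set<String> names;

        NamesFilter(Set<String> names) { this.names = new TreeSet<>(names); }

        public boolean coversNames(Set<String> requested) {
            return names.containsAll(requested);
        }

        // A names query says nothing about the columns between the names it fetched,
        // so it can never answer a slice.
        public boolean coversSlice(String start, String end) {
            return false;
        }
    }

    /** Entry populated by a slice query over [start, end], assumed to have no count limit. */
    static final class SliceFilter implements CachedFilter {
        private final String start;
        private final String end;

        SliceFilter(String start, String end) { this.start = start; this.end = end; }

        public boolean coversNames(Set<String> requested) {
            for (String name : requested)
                if (name.compareTo(start) < 0 || name.compareTo(end) > 0)
                    return false;
            return true;
        }

        public boolean coversSlice(String s, String e) {
            return start.compareTo(s) <= 0 && end.compareTo(e) >= 0;
        }
    }
}
```

With this shape, an entry built from the named query (A, Z) refuses the slice A to Z, while an entry built from the slice A to Z can serve both the named query and any narrower slice.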

In a separate ticket we can also optimize the above case (and more) by being smarter about
which queries are stored, if that's OK. Example: if a slice with a count of 250 is stored, we
might not need to store a slice with a count of 50 over the same range; we can also merge
overlapping slices, etc.
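
A rough sketch of those two optimizations, under stated assumptions: SliceMergeSketch and its members are hypothetical names, slice bounds are plain strings with inclusive ends, both slices start at the same point and read in the same direction for the count check, and the slices being merged are assumed to be fully populated (no count limit).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: not code from the query_cache_v2 branch.
final class SliceMergeSketch {
    static final class Slice {
        final String start;
        final String end;

        Slice(String start, String end) { this.start = start; this.end = end; }
    }

    /**
     * For two slices over the same range and direction, a cached count of 250
     * already contains the first 50 columns, so the smaller one need not be stored.
     */
    static boolean subsumesCount(int cachedCount, int requestedCount) {
        return cachedCount >= requestedCount;
    }

    /** Merge slices whose ranges overlap; assumes each slice is fully populated. */
    static List<Slice> mergeOverlapping(List<Slice> slices) {
        List<Slice> sorted = new ArrayList<>(slices);
        sorted.sort(Comparator.comparing((Slice s) -> s.start));

        List<Slice> merged = new ArrayList<>();
        for (Slice s : sorted) {
            if (!merged.isEmpty()) {
                Slice last = merged.get(merged.size() - 1);
                if (s.start.compareTo(last.end) <= 0) {
                    // Overlaps the previous slice: extend it instead of storing both.
                    String end = last.end.compareTo(s.end) >= 0 ? last.end : s.end;
                    merged.set(merged.size() - 1, new Slice(last.start, end));
                    continue;
                }
            }
            merged.add(s);
        }
        return merged;
    }
}
```

For example, the slices M to T, A to N and X to Z would collapse into A to T and X to Z.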

{quote}
if there's room, that's fine, but exceeding the configured memory budget is Bad
{quote}
Can we do that in a separate ticket? I believe we can achieve this by implementing an Iterator,
similar to SSTableIterator, that streams the columns rather than constructing the ColumnFamily
all at once.
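
One possible reading of that idea, as a sketch only: wrap the column source in an iterator that hands every column to the caller, while buffering for the cache only as long as a byte budget allows. BudgetedIteratorSketch, budgetBytes and cacheableResult are hypothetical names, columns are modelled as raw byte arrays, and this does not use SSTableIterator or any real Cassandra API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: streams columns to the caller and stops buffering
// for the cache once the configured budget would be exceeded.
final class BudgetedIteratorSketch implements Iterator<byte[]> {
    private final Iterator<byte[]> source;
    private final long budgetBytes;
    private long usedBytes = 0;
    private List<byte[]> buffered = new ArrayList<>(); // null once the budget is exceeded

    BudgetedIteratorSketch(Iterator<byte[]> source, long budgetBytes) {
        this.source = source;
        this.budgetBytes = budgetBytes;
    }

    @Override
    public boolean hasNext() {
        return source.hasNext();
    }

    @Override
    public byte[] next() {
        byte[] column = source.next();
        if (buffered != null) {
            usedBytes += column.length;
            if (usedBytes > budgetBytes)
                buffered = null;      // give up on caching, keep serving the read
            else
                buffered.add(column);
        }
        return column;
    }

    /** Columns to cache, or null if the source is not exhausted or the budget was exceeded. */
    List<byte[]> cacheableResult() {
        return (buffered != null && !source.hasNext()) ? buffered : null;
    }
}
```

The read itself is never truncated; only the decision to populate the cache depends on the budget.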

Thanks!
                
      was (Author: vijay2win@yahoo.com):
    Hi Jonathan, I have pushed a version with a sentinel (I might have made it a little hacky,
but it works): https://github.com/Vijay2win/cassandra/commits/query_cache_v2

{quote}
Serializing the entire QueryCacheValue for each lookup is going to kill performance on hot
partitions.
{quote}
It is required because we need to know which query populated the cache. For example, a named
query for columns A and Z can be followed by a slice query from A to Z, and we would not
return the right response, since B to Y is not in the cache.

In a separate ticket we can also optimize the above case (and more) by being smarter about
which queries are stored, if that's OK. Example: if the slice with 250 is stored, why also
store the slice with 50 in the same range? We can also merge overlapping slices, etc.

{quote}
if there's room, that's fine, but exceeding the configured memory budget is Bad
{quote}
Can we do that in a separate ticket? I believe we can achieve this by implementing an Iterator,
similar to SSTableIterator, that streams the columns rather than constructing the ColumnFamily
all at once.

Thanks!
                  
> Query cache
> -----------
>
>                 Key: CASSANDRA-5357
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5357
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Vijay
>
> I think that most people expect the row cache to act like a query cache, because that's
> a reasonable model.  Caching the entire partition is, in retrospect, not really reasonable,
> so it's not surprising that it catches people off guard, especially given the confusion we've
> inflicted on ourselves as to what a "row" constitutes.
> I propose replacing it with a true query cache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
