cassandra-commits mailing list archives

From "Vijay (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-1956) Convert row cache to row+filter cache
Date Fri, 10 Feb 2012 19:32:59 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205673#comment-13205673 ]

Vijay commented on CASSANDRA-1956:
----------------------------------

Alright, what I was trying to do here is to get feedback from everyone on all the use cases
and try to fit them into one cache.

I did a fair amount of research to see if there is any better option and there wasn't one;
the closest concept I got to was something like a block cache or the Linux page cache.
When there are updates to those blocks, we can find them and update them.
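
To make that concrete, here is a rough, purely illustrative sketch in Java (all names are
made up, not taken from the attached patches): the cache is keyed by row key plus block
index, so a write only has to find and drop the block that owns the updated column instead
of throwing away the whole cached row.

    import java.nio.ByteBuffer;
    import java.util.List;
    import java.util.Objects;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative block cache: columns of a row are cached in fixed-size blocks,
    // keyed by (row key, block index), so an update only touches one block.
    public class BlockCacheSketch<C>
    {
        private final int blockSize; // columns per cached block
        private final ConcurrentHashMap<BlockKey, List<C>> blocks = new ConcurrentHashMap<>();

        public BlockCacheSketch(int blockSize)
        {
            this.blockSize = blockSize;
        }

        static final class BlockKey
        {
            final ByteBuffer rowKey;
            final int blockIndex;

            BlockKey(ByteBuffer rowKey, int blockIndex)
            {
                this.rowKey = rowKey;
                this.blockIndex = blockIndex;
            }

            @Override
            public boolean equals(Object o)
            {
                if (!(o instanceof BlockKey))
                    return false;
                BlockKey k = (BlockKey) o;
                return blockIndex == k.blockIndex && rowKey.equals(k.rowKey);
            }

            @Override
            public int hashCode()
            {
                return Objects.hash(rowKey, blockIndex);
            }
        }

        public List<C> getBlock(ByteBuffer rowKey, int blockIndex)
        {
            return blocks.get(new BlockKey(rowKey, blockIndex));
        }

        public void putBlock(ByteBuffer rowKey, int blockIndex, List<C> columns)
        {
            blocks.put(new BlockKey(rowKey, blockIndex), columns);
        }

        // A write to the Nth column of a row invalidates only the block that owns it
        // (assuming the write path can tell which block the column falls into);
        // the rest of the row's cached blocks stay warm.
        public void invalidateColumn(ByteBuffer rowKey, int columnPosition)
        {
            blocks.remove(new BlockKey(rowKey, columnPosition / blockSize));
        }
    }

That is essentially the page-cache analogy: the block, not the whole row, is the unit of
caching and invalidation.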

1) The problem shows up only when you have a wide row, which most probably means the user
is doing range queries.
2) If the user has a wide row, then most probably there is a large number of writes into that
row; if we invalidate the row cache on every update the cache might not be useful, and the
first read afterwards will have to read multiple SSTables.
3) Let's say the user has 100 columns to query (especially with composite-type columns, where
the column names can be larger than the values); then we can possibly run into memory pressure.
4) Having the whole row in memory is an absolutely required case and we are supporting it
(setting the min and max number of columns in a block will help there).
5) The above solution works seamlessly for narrow rows when the block size is reasonably
big (the sketch after this list makes the block arithmetic concrete).
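
To put rough numbers on points 3 and 5, a tiny illustrative helper (again, made-up names)
showing which blocks a forward slice of "count" columns would need: a 100-column query
against 500-column blocks touches at most two blocks, and a narrow row fits entirely in
block 0.

    // Illustrative only: which cached blocks a forward slice needs,
    // given a fixed number of columns per block.
    public final class SliceBlocks
    {
        // Inclusive range [firstBlock, lastBlock] of block indices
        // covering columns [start, start + count).
        static int[] blocksForSlice(int start, int count, int blockSize)
        {
            int firstBlock = start / blockSize;
            int lastBlock = (start + count - 1) / blockSize;
            return new int[] { firstBlock, lastBlock };
        }

        public static void main(String[] args)
        {
            // 100-column query, 500-column blocks: at most two blocks in memory.
            int[] wide = blocksForSlice(450, 100, 500);
            System.out.println("wide row slice -> blocks " + wide[0] + ".." + wide[1]);   // 0..1

            // Narrow row of 60 columns with the same block size: the whole row is one block.
            int[] narrow = blocksForSlice(0, 60, 500);
            System.out.println("narrow row -> blocks " + narrow[0] + ".." + narrow[1]);   // 0..0
        }
    }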

Head and Tail is basically an optimization for the reverse/forward queries; it pays off when,
say, you have 1M rows, your block size is 500, your count is 100, and you are reading in
reverse.
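
A similarly rough sketch of the head/tail case, assuming the 1M figure refers to the columns
of one very wide row: reading the last 100 columns in reverse with a block size of 500 only
ever needs the tail block, so the query never walks the rest of the row.

    // Rough sketch: a reversed slice of count columns out of totalColumns
    // only needs the block(s) at the tail of the row.
    public final class ReverseSlice
    {
        static int[] blocksForReverseSlice(int totalColumns, int count, int blockSize)
        {
            int lastBlock = (totalColumns - 1) / blockSize;        // tail block of the row
            int firstNeeded = (totalColumns - count) / blockSize;  // block holding the oldest wanted column
            return new int[] { firstNeeded, lastBlock };
        }

        public static void main(String[] args)
        {
            // 1,000,000 columns, block size 500, reading the last 100 in reverse:
            // only the final block (index 1999) has to be cached or read.
            int[] r = blocksForReverseSlice(1000000, 100, 500);
            System.out.println("reverse read -> blocks " + r[0] + ".." + r[1]);   // 1999..1999
        }
    }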
                
> Convert row cache to row+filter cache
> -------------------------------------
>
>                 Key: CASSANDRA-1956
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1956
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.2
>
>         Attachments: 0001-1956-cache-updates-v0.patch, 0001-commiting-block-cache.patch,
0001-re-factor-row-cache.patch, 0001-row-cache-filter.patch, 0002-1956-updates-to-thrift-and-avro-v0.patch,
0002-add-query-cache.patch
>
>
> Changing the row cache to a row+filter cache would make it much more useful. We currently
have to warn against using the row cache with wide rows, where the read pattern is typically
a peek at the head, but this use case would be perfectly supported by a cache that stored only
the columns matching the filter.
> Possible implementations:
> * (cop-out) Cache a single filter per row, and leave the cache key as is
> * Cache a list of filters per row, leaving the cache key as is: this is likely to have
some gotchas for weird usage patterns, and it requires the list overhead
> * Change the cache key to "rowkey+filterid": basically ideal, but you need a secondary
index to look up cache entries by rowkey so that you can keep them in sync with the memtable
(a rough sketch of this option follows the list)
> * others?
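
For what it's worth, here is one rough way the third option could look (purely a sketch with
made-up names, not taken from the attached patches): the cache key carries a filter id
alongside the row key, and a secondary index from row key to cached filter ids lets a
memtable write find every entry for that row and invalidate it.

    import java.nio.ByteBuffer;
    import java.util.Map;
    import java.util.Objects;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch of the "rowkey+filterid" option: the cache key carries the filter id,
    // and a secondary index (row key -> filter ids) lets a write to the row find
    // every cached entry for it.
    public class FilterCacheSketch<V>
    {
        static final class FilterKey
        {
            final ByteBuffer rowKey;
            final int filterId; // stand-in for a hash/id of the query filter

            FilterKey(ByteBuffer rowKey, int filterId)
            {
                this.rowKey = rowKey;
                this.filterId = filterId;
            }

            @Override
            public boolean equals(Object o)
            {
                if (!(o instanceof FilterKey))
                    return false;
                FilterKey k = (FilterKey) o;
                return filterId == k.filterId && rowKey.equals(k.rowKey);
            }

            @Override
            public int hashCode()
            {
                return Objects.hash(rowKey, filterId);
            }
        }

        private final Map<FilterKey, V> cache = new ConcurrentHashMap<>();
        private final Map<ByteBuffer, Set<Integer>> filtersByRow = new ConcurrentHashMap<>();

        public void put(ByteBuffer rowKey, int filterId, V result)
        {
            cache.put(new FilterKey(rowKey, filterId), result);
            filtersByRow.computeIfAbsent(rowKey, k -> ConcurrentHashMap.newKeySet()).add(filterId);
        }

        public V get(ByteBuffer rowKey, int filterId)
        {
            return cache.get(new FilterKey(rowKey, filterId));
        }

        // Called on a write to the row: drop every cached filter result for it.
        public void invalidateRow(ByteBuffer rowKey)
        {
            Set<Integer> ids = filtersByRow.remove(rowKey);
            if (ids != null)
                for (int id : ids)
                    cache.remove(new FilterKey(rowKey, id));
        }
    }

Dropping every cached filter result on a row write is the simple policy here; updating the
entries in place instead would rely on the same secondary index.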

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
