cassandra-commits mailing list archives

From "Pavel Yaskevich (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-3143) Global caches (key/row)
Date Wed, 14 Dec 2011 21:35:31 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169719#comment-13169719 ]

Pavel Yaskevich commented on CASSANDRA-3143:
--------------------------------------------

{quote}
At the very least, one easy win would be to save only the keyspace, columnFamily, version
and generation part of the filename, rather than the whole path to the sstable. But otherwise,
when I talked about a descriptor -> id relationship, I was thinking of something simple.
Like saving two files instead of one: one would be the keys with the descriptor replaced by
compact ids, the other would be the metadata, i.e., the descriptor -> id map. That would
really just be some internal detail of the save function. But that's really just an idea.
{quote}

The problem with using only keyspace/cf/generation is that this information is not sufficient to
rebuild the descriptor in readSaved. On the other hand, if we use a descriptor ->
id relationship, wouldn't it create the same amount of additional I/O (plus the cost of maintaining
such a cache) as just having Descriptor as the cache key?
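
For illustration only, the descriptor -> id idea from the quote above could be sketched roughly as below. This is not code from the attached patches: the class name CompactDescriptorIds, its methods, the "id:key" line format, and the placeholder descriptor strings are all hypothetical, and the sketch works on in-memory strings rather than real sstable Descriptor objects or files.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names exist in Cassandra.
public final class CompactDescriptorIds {

    /**
     * Builds the "metadata" side: each distinct descriptor string gets a small
     * integer id, assigned in first-seen order. Each entry is {descriptor, key}.
     */
    public static Map<String, Integer> assignIds(List<String[]> entries) {
        Map<String, Integer> ids = new LinkedHashMap<>();
        for (String[] entry : entries) {
            String descriptor = entry[0];
            if (!ids.containsKey(descriptor)) {
                ids.put(descriptor, ids.size());
            }
        }
        return ids;
    }

    /**
     * Builds the "keys" side: each entry is rewritten as "id:key", so the
     * descriptor text is stored once in the map instead of once per key.
     */
    public static List<String> encodeKeys(List<String[]> entries, Map<String, Integer> ids) {
        List<String> lines = new ArrayList<>();
        for (String[] entry : entries) {
            lines.add(ids.get(entry[0]) + ":" + entry[1]);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String[]> entries = new ArrayList<>();
        entries.add(new String[] {"descriptor-A", "k1"}); // placeholder values
        entries.add(new String[] {"descriptor-B", "k2"});
        entries.add(new String[] {"descriptor-A", "k3"});

        Map<String, Integer> ids = assignIds(entries);
        System.out.println(ids);
        System.out.println(encodeKeys(entries, ids));
    }
}
```

The sketch only shows the space saving on the saved form; it says nothing about the extra I/O or maintenance cost raised in the question above.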

{quote}
Yeah, I know. But for the key cache, we use a constant weigher, counting 8 bytes for each
"entry". Figured we could use some higher constant to get closer to the actual size taken
by each entry in-memory, even if we don't account for the exact size of the key. Typically,
the KeyCacheKey structure will take "at least" 32 bytes in memory (it's more than that but
given there is at least the DK token and a bunch of pointers...), so typically if we were
to consider each entry to be like 40 or 48 bytes, I think we would be closer to the actual
in-memory size. I just want to avoid people configuring 100MB for the key cache (ok, that
would be a huge one) and actually having it being more like 1GB.
{quote}

Sure, I will just change that to 40 bytes and update the doc for key_cache_size_in_mb with something
like "please note that the actual number of entries for a given amount of space is calculated using
the following formula: key_cache_size_in_mb * 1024 * 1024 / 48, where 48 = 8 bytes (size of the value)
+ 40 bytes (average size of the key)".
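
As a worked example of the formula above, the following minimal sketch computes the entry count for a given key_cache_size_in_mb. The class and method names are hypothetical and not part of the patches; the only inputs taken from this thread are the 8-byte value weight and the 40-byte average key estimate.

```java
// Hypothetical helper, for illustration only; not part of Cassandra.
public final class KeyCacheCapacity {

    // Weights quoted in the comment above.
    public static final long VALUE_BYTES = 8;
    public static final long AVERAGE_KEY_BYTES = 40;

    /** key_cache_size_in_mb * 1024 * 1024 / (8 + 40), using integer division. */
    public static long estimatedEntries(long keyCacheSizeInMb) {
        return keyCacheSizeInMb * 1024 * 1024 / (VALUE_BYTES + AVERAGE_KEY_BYTES);
    }

    public static void main(String[] args) {
        // 100 MB = 104857600 bytes; 104857600 / 48 = 2184533 (rounded down)
        System.out.println(estimatedEntries(100));
    }
}
```

By comparison, weighing each entry at only 8 bytes would admit 104857600 / 8 = 13107200 entries into the same 100 MB setting, which is the over-commitment the quoted concern describes.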
                
> Global caches (key/row)
> -----------------------
>
>                 Key: CASSANDRA-3143
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3143
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>            Priority: Minor
>              Labels: Core
>             Fix For: 1.1
>
>         Attachments: 0001-global-key-cache.patch, 0002-global-row-cache-and-ASC.readSaved-changed-to-abstra.patch,
> 0003-CacheServiceMBean-and-correct-key-cache-loading.patch, 0004-key-row-cache-tests-and-tweaks.patch,
> 0005-cleanup-of-the-CFMetaData-and-thrift-avro-CfDef-and-.patch, 0006-row-key-cache-improvements-according-to-Sylvain-s-co.patch
>
>
> Caches are difficult to configure well as ColumnFamilies are added, similar to how memtables
> were difficult pre-CASSANDRA-2006.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
