cassandra-commits mailing list archives

From "Yuki Morishita (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6356) Proposal: Statistics.db (SSTableMetadata) format change
Date Fri, 15 Nov 2013 22:27:21 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13824176#comment-13824176 ]

Yuki Morishita commented on CASSANDRA-6356:
-------------------------------------------

Pushed proposal to: https://github.com/yukim/cassandra/tree/6356-v1

The new format has sections separated by type and size (see https://github.com/yukim/cassandra/blob/6356-v1/src/java/org/apache/cassandra/io/sstable/metadata/MetadataSerializer.java).
Initially, I created 3 metadata sections (or components): Validation, Compaction and Stats.

* ValidationMetadata: properties only used to validate an SSTable before opening it (partitioner name and bloom filter fp chance).
* CompactionMetadata: properties meant to be accessed only during compaction (ancestors).
* StatsMetadata: everything else, which is kept in memory.

Note that CompactionMetadata is loaded for "compacting" SSTables. The tombstone drop time histogram
and the SSTable level are frequently used to determine compaction candidates, so those are kept
in memory as part of StatsMetadata.
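
To illustrate the idea of sections separated by type and size, here is a minimal sketch. It is not the code in the 6356-v1 branch: the class name SectionedMetadataSketch, the table-of-contents layout (count, then one type/size pair per section, then the payloads) and the use of raw byte arrays as payloads are all assumptions made for illustration. It only shows how a reader can pick out a single component, such as Compaction, without deserializing the others.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.EnumMap;

// Hypothetical sketch; names and on-disk layout are assumptions, not the patch itself.
final class SectionedMetadataSketch {
    enum Type { VALIDATION, COMPACTION, STATS }

    // Layout: [count][type, size]*count, followed by the payloads in the same order.
    static byte[] serialize(EnumMap<Type, byte[]> sections) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(sections.size());
            for (var entry : sections.entrySet()) {
                out.writeInt(entry.getKey().ordinal());
                out.writeInt(entry.getValue().length);
            }
            for (byte[] payload : sections.values()) {
                out.write(payload);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the table of contents, then skips straight to the wanted section.
    // Returns null if the section is not present.
    static byte[] load(byte[] file, Type wanted) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
            int count = in.readInt();
            int bytesBefore = 0; // total size of the sections preceding the wanted one
            int length = -1;     // size of the wanted section, -1 until found
            for (int i = 0; i < count; i++) {
                int type = in.readInt();
                int size = in.readInt();
                if (type == wanted.ordinal()) {
                    length = size;
                } else if (length < 0) {
                    bytesBefore += size;
                }
            }
            if (length < 0) {
                return null;
            }
            in.skipBytes(bytesBefore);
            byte[] payload = new byte[length];
            in.readFully(payload);
            return payload;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a layout along these lines, opening an SSTable would only need load(file, Type.VALIDATION) and load(file, Type.STATS), while load(file, Type.COMPACTION) could be deferred until the SSTable is picked for compaction.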


> Proposal: Statistics.db (SSTableMetadata) format change
> -------------------------------------------------------
>
>                 Key: CASSANDRA-6356
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6356
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Yuki Morishita
>            Assignee: Yuki Morishita
>            Priority: Minor
>             Fix For: 2.1
>
>
> We started to distinguish what's loaded to heap from what's not in Statistics.db. For
> now, ancestors are loaded as needed.
> The current serialization format is so ad hoc that adding new metadata that is not permanently
> held in memory is somewhat difficult and messy. I propose to change the serialization format
> so that a group of stats can be loaded as needed.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
