cassandra-commits mailing list archives

From "Pavel Yaskevich (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2392) Saving IndexSummaries to disk
Date Thu, 19 Jan 2012 10:34:42 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189046#comment-13189046
] 

Pavel Yaskevich commented on CASSANDRA-2392:
--------------------------------------------

Thanks for the patch! Here is my review:

- Loading index summaries in SSTableReader.load(boolean, Set<DecoratedKey>) breaks key
cache pre-load.

- The IndexSummary deserialize(...) method should be made static and return an IndexSummary object.
This would also allow dropping the IndexSummary argument from SSTableReader.loadSummaries(...).

- To avoid any seeks in the PRIMARY_INDEX file upon IndexSummary.deserialize, I suggest
saving the key (only the BB part) as well as the index position in IndexSummary.serialize.

- I would also suggest saving dataPosition from the primary index into the summaries file to
avoid adding serialization to SegmentedFile, because SegmentedFile serialize(...)/deserialize(...)
are not really serialize/deserialize methods: they just save/read boundaries. This way you would
be able to do deserialization and boundary loading at the same time, without saving/reading additional
information to/from the disk, because only ibuilder needs indexPosition and only dbuilder needs dataPosition.

- loadSummaries should be renamed to something more appropriate, because that method does not
only load index summaries, it also loads the index and data builders. Strictly speaking, it does
not really load them but rather deserializes boundaries into an existing object, which is not
good practice.

- can you please explain this chunk of code to me?
{code}
+            // don't rename summaries as it is not created yet and created while it is loaded.
+            for (Component component : Sets.difference(components, Sets.newHashSet(Component.DATA, Component.SUMMARIES)))
                  FBUtilities.renameWithConfirm(tmpdesc.filenameFor(component), newdesc.filenameFor(component));
{code}
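
To make the deserialize and position-saving points above more concrete, here is a rough sketch of
the kind of layout I have in mind. All names in it (SummarySketch, Entry and its fields) are made up
for illustration and are not taken from the patch or from the existing classes. It only shows a
static deserialize that returns a new object, with the raw key bytes, index position and data
position written per entry so that no seek into the primary index is needed on load.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an index summary; NOT the real IndexSummary class.
final class SummarySketch {
    // One sampled entry: raw key bytes plus both positions (assumed layout).
    static final class Entry {
        final byte[] key;
        final long indexPosition; // offset in the primary index file
        final long dataPosition;  // offset in the data file

        Entry(byte[] key, long indexPosition, long dataPosition) {
            this.key = key;
            this.indexPosition = indexPosition;
            this.dataPosition = dataPosition;
        }
    }

    final List<Entry> entries;

    SummarySketch(List<Entry> entries) {
        this.entries = entries;
    }

    // Writes an entry count, then length-prefixed key bytes and the two positions per entry.
    void serialize(DataOutputStream out) throws IOException {
        out.writeInt(entries.size());
        for (Entry e : entries) {
            out.writeInt(e.key.length);
            out.write(e.key);
            out.writeLong(e.indexPosition);
            out.writeLong(e.dataPosition);
        }
    }

    // Static: builds and returns a new object instead of filling in an existing one.
    static SummarySketch deserialize(DataInputStream in) throws IOException {
        int count = in.readInt();
        List<Entry> entries = new ArrayList<Entry>(count);
        for (int i = 0; i < count; i++) {
            byte[] key = new byte[in.readInt()];
            in.readFully(key);
            long indexPosition = in.readLong();
            long dataPosition = in.readLong();
            entries.add(new Entry(key, indexPosition, dataPosition));
        }
        return new SummarySketch(entries);
    }
}
```

With something like this, the caller could feed indexPosition to ibuilder and dataPosition to
dbuilder while iterating over the deserialized entries, instead of serializing SegmentedFile
separately.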


                
> Saving IndexSummaries to disk
> -----------------------------
>
>                 Key: CASSANDRA-2392
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2392
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Goffinet
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.1
>
>         Attachments: 0001-re-factor-first-and-last.patch, 0001-save-summaries-to-disk.patch,
>                      0002-save-summaries-to-disk.patch
>
>
> For nodes with millions of keys, doing rolling restarts that take over 10 minutes per
> node can be painful if you have 100 node cluster. All of our time is spent on doing index
> summary computations on startup. It would be great if we could save those to disk as well.
> Our indexes are quite large.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
