cassandra-commits mailing list archives

From "Pavel Yaskevich (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2392) Saving IndexSummaries to disk
Date Sun, 22 Jan 2012 19:52:40 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190751#comment-13190751 ]

Pavel Yaskevich commented on CASSANDRA-2392:
--------------------------------------------

bq. But the main idea is to reduce the code and the checks which we have to do just to populate
the first and last variables. IMO it is better served in Index Summary, which already has the
needed checks. By using maybeAddEntry() and marking the others private everywhere, we don't need
extra checks elsewhere to populate the fields... first and last in an index is also a summary
:)

Correct me if I'm wrong, but as far as I can see in SSTableReader.load(...), the condition "SSTable.last
== IndexSummary.last" is not guaranteed, which means that IndexSummary.last has different
semantics from SSTable.last. As for the checks, I don't see many of those, and IndexSummary
in its current state does not have anything to do with SSTable's last/first variables, so
I don't really understand which checks you are talking about. If you really want to be pedantic
about the domain of first/last, I agree that they could belong to the summary of the SSTable,
but certainly not to the "index" one :)
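
To illustrate the point about differing semantics, here is a minimal sketch, assuming the summary keeps only every N-th key of the primary index. The class name, the helper methods and the interval of 4 are hypothetical and are not taken from the patch or from Cassandra's actual IndexSummary code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a sampled "summary" over a sorted key list.
class SummarySamplingSketch {
    // Keep every `interval`-th key, starting with the first one (assumed sampling rule).
    static List<String> sample(List<String> sortedKeys, int interval) {
        List<String> sampled = new ArrayList<>();
        for (int i = 0; i < sortedKeys.size(); i += interval) {
            sampled.add(sortedKeys.get(i));
        }
        return sampled;
    }

    // Last element of a non-empty list.
    static String last(List<String> keys) {
        return keys.get(keys.size() - 1);
    }

    public static void main(String[] args) {
        // Ten sorted keys standing in for the keys of one SSTable.
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            keys.add("key0" + i);
        }
        List<String> summary = sample(keys, 4); // samples indices 0, 4, 8

        System.out.println("SSTable last: " + last(keys));    // key09
        System.out.println("summary last: " + last(summary)); // key08
    }
}
```

Under this assumption the two "last" values agree only when the final key happens to fall on a sampling boundary, so a sampled summary cannot in general stand in for SSTable.last.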

bq. Because we read from the disk to populate the Index Summary? If yes, I can make sure that
both the patches go into the same release.

Because we would end up reading more data (e.g. some of the keys and all index and data positions
would be read twice) from different files - primary_index and summary.
                
> Saving IndexSummaries to disk
> -----------------------------
>
>                 Key: CASSANDRA-2392
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2392
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Goffinet
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.1
>
>         Attachments: 0001-re-factor-first-and-last.patch, 0001-save-summaries-to-disk.patch,
>                      0002-save-summaries-to-disk-v2.patch, 0002-save-summaries-to-disk-v3.patch,
>                      0002-save-summaries-to-disk.patch
>
>
> For nodes with millions of keys, doing rolling restarts that take over 10 minutes per
> node can be painful if you have a 100-node cluster. All of our time is spent on doing index
> summary computations on startup. It would be great if we could save those to disk as well.
> Our indexes are quite large.

