cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-1117) Clean up MMAP support
Date Thu, 27 May 2010 19:31:36 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872334#action_12872334 ]

Stu Hood commented on CASSANDRA-1117:
-------------------------------------

I got to thinking about Jonathan's 2-level binary search idea, and realized that a multi-level
binary search would be handled really well by a tree.

The tree I'm imagining would have K+2 levels, where K is the number of index/data files
(2 in our current situation). The 0th level would be the root. At each of the K levels after
the root, you would have inner nodes representing the segments of the index/data file at that level.
The 1st level would contain the segments for the smallest file, the 2nd level would contain
the segments for the second smallest, and the Kth would contain the segments for the data
file. The (K+1)th level would contain leaf nodes, which would be equivalent to the contents of
the IndexSummary class.
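
To make the shape of that tree concrete, here is a rough sketch in Python (chosen for brevity; it is
not Cassandra code). The names Leaf, Inner and floor_leaf, and the "position" field, are made up
for illustration and are not the actual IndexSummary or SegmentedFile API. It assumes every inner
node is keyed by the first key it covers, so a lookup is one binary search per level.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Leaf:
    """Hypothetical stand-in for one IndexSummary entry."""
    key: str
    position: int  # assumed: offset of this key in the index file


@dataclass
class Inner:
    """Hypothetical inner node: one segment of an index/data file."""
    first_key: str
    children: List[Union["Inner", Leaf]] = field(default_factory=list)  # sorted by key


def _key_of(node: Union[Inner, Leaf]) -> str:
    return node.key if isinstance(node, Leaf) else node.first_key


def floor_leaf(root: Inner, key: str) -> Optional[Leaf]:
    """Descend one level at a time, binary searching the children of each node.

    Returns the leaf with the greatest key <= `key`, or None if `key`
    sorts before everything in the tree.
    """
    node: Union[Inner, Leaf] = root
    while isinstance(node, Inner):
        keys = [_key_of(child) for child in node.children]
        i = bisect_right(keys, key) - 1
        if i < 0:
            return None
        node = node.children[i]
    return node


# K = 2: root -> index file segments -> data file segments -> summary leaves.
root = Inner("a", [
    Inner("a", [
        Inner("a", [Leaf("a", 0), Leaf("c", 128)]),
        Inner("e", [Leaf("e", 256), Leaf("g", 384)]),
    ]),
    Inner("i", [
        Inner("i", [Leaf("i", 512), Leaf("k", 640)]),
    ]),
])

hit = floor_leaf(root, "f")  # lands on the leaf for "e"
```

A real version would keep the child keys in a pre-built array per node rather than rebuilding
the list on every lookup; the sketch only shows the level-by-level descent.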

I thiiink I can implement this structure over the weekend if it sounds worthwhile?

Also, generalizing to multiple levels of indexing means that at some point in the future,
we could write out multiple index files at progressively higher resolution, giving you a balanced
tree on disk. Our INDEX_INTERVAL is intended to represent the ratio between RAM and disk,
so theoretically you should always have enough memory to summarize the index in memory, but in
most cases, a lot of that memory would be better spent on row cache.
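
As a rough illustration of the "progressively higher resolution" idea, the Python sketch below
samples every interval-th entry of each level to build the next coarser one, then looks a key up
by binary searching a window of at most `interval` entries per level. The function names, the
in-memory lists standing in for index files, and the interval of 10 are all assumptions made for
the example; this is not how Cassandra lays out its index today.

```python
from bisect import bisect_right
from typing import List, Optional


def build_levels(keys: List[str], interval: int, max_top: int) -> List[List[str]]:
    """Return levels ordered coarsest first; the last level is the full key list.

    Each coarser level keeps every `interval`-th entry of the level below it,
    until the coarsest level has at most `max_top` entries.
    """
    if interval < 2:
        raise ValueError("interval must be at least 2")
    levels = [list(keys)]
    while len(levels[-1]) > max_top:
        levels.append(levels[-1][::interval])
    return levels[::-1]


def floor_index(levels: List[List[str]], interval: int, key: str) -> Optional[int]:
    """Index in the finest level of the greatest entry <= key, or None."""
    lo, hi = 0, len(levels[0])
    i = -1
    for depth, level in enumerate(levels):
        i = bisect_right(level, key, lo, hi) - 1
        if i < 0:
            return None
        if depth + 1 < len(levels):
            # Entry i at this level is entry i * interval one level down.
            lo = i * interval
            hi = min(lo + interval, len(levels[depth + 1]))
    return i


keys = [f"k{n:04d}" for n in range(1000)]  # sorted sample keys
levels = build_levels(keys, interval=10, max_top=16)
sizes = [len(level) for level in levels]
found = floor_index(levels, 10, "k0537")
```

Only the coarsest level would need to live in memory; each finer level costs one bounded search,
which is the trade-off between summary memory and row cache described above.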

> Clean up MMAP support
> ---------------------
>
>                 Key: CASSANDRA-1117
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1117
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Stu Hood
>            Assignee: Gary Dusbabek
>             Fix For: 0.7
>
>         Attachments: 0001-Use-factory-functions-for-RowIndexedReader.patch,
>                      0002-Add-SegmentedFile-to-abstract-opening-FileDataInputs.patch,
>                      0003-Replace-mmap-file-abstraction-with-SegmentedFile.patch,
>                      0004-Rename-SSTableReaderTest-to-SegmentedFileTest.patch,
>                      0005-Remove-filename-munging.patch
>
>
> Awareness of MMAP is currently embedded into the SSTableReader implementation and IndexSummary.
> A good number of bugs experienced recently have been due to this lack of separation, so it
> is ripe for abstraction. Additionally, the current implementation does not provide a good
> method for iterating over the segments of a file, which is useful for range queries and would
> lay more stable groundwork for #998.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

