cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (CASSANDRA-847) Make the reading half of compactions memory-efficient
Date Tue, 06 Apr 2010 05:13:33 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853711#action_12853711 ]

Stu Hood edited comment on CASSANDRA-847 at 4/6/10 5:12 AM:
------------------------------------------------------------

Alright, after the hiatus to implement byte[] keys, I'm back on this horse.

> 2. Replace ColumnFamily and SuperColumn with ColumnGroup, implementing IColumn,
> and deleting IColumnContainer.
I don't think that nested structures, each with their own iterators, are a good idea... especially
when they may be hiding the fact that they are fetching columns from disk. And if they are
not fetching transparently from disk, how do we make this any more memory efficient than the
current approach?

The beauty in the Slice approach is that a List<Slice> can represent any arbitrarily
nested structure you can think of, and yet the Slices are still autonomous.
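
A minimal sketch of how a flat list might stand in for a nested structure, assuming each slice carries the full path of its parent. The names here (SliceSketch, Slice, parentPath, columnNames) are hypothetical illustrations and are not taken from the attached patches, whose Slice may look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names or fields come from the CASSANDRA-847 patches.
final class SliceSketch {
    /** A run of columns sharing one parent path, e.g. [row] or [row, superColumn]. */
    static final class Slice {
        final List<String> parentPath;
        final List<String> columnNames;

        Slice(List<String> parentPath, List<String> columnNames) {
            this.parentPath = parentPath;
            this.columnNames = columnNames;
        }
    }

    /** Flattens one row with two super columns, plus one standard row, into slices. */
    static List<Slice> example() {
        List<Slice> slices = new ArrayList<>();
        slices.add(new Slice(Arrays.asList("row1", "super1"), Arrays.asList("a", "b")));
        slices.add(new Slice(Arrays.asList("row1", "super2"), Arrays.asList("c")));
        slices.add(new Slice(Arrays.asList("row2"), Arrays.asList("x", "y")));
        return slices;
    }

    /** Counts columns under a path prefix with one flat scan and no nested iterators. */
    static int countColumnsUnder(List<Slice> slices, List<String> prefix) {
        int count = 0;
        for (Slice s : slices) {
            List<String> p = s.parentPath;
            boolean matches = p.size() >= prefix.size()
                    && p.subList(0, prefix.size()).equals(prefix);
            if (matches) {
                count += s.columnNames.size();
            }
        }
        return count;
    }
}
```

Because every slice names its own parent path, each one can be read, merged, or discarded without consulting an enclosing container, which is the sense of "autonomous" assumed in this sketch.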

EDIT: Erased an offtopic point.

> 3. Implement new disk format, read + write, but no compaction yet.
I'm not sure how this is supposed to work: is the idea that we would break backwards compatibility
in trunk, and then restore it later on in your steps 4, 5, and 6?

      was (Author: stuhood):
    Alright, after the hiatus to implement byte[] keys, I'm back on this horse.

> 2. Replace ColumnFamily and SuperColumn with ColumnGroup, implementing IColumn,
> and deleting IColumnContainer.
I don't think that nested structures, each with their own iterators, are a good idea... especially
when they may be hiding the fact that they are fetching columns from disk. And if they are
not fetching transparently from disk, how do we make this any more memory efficient than the
current approach?

The beauty in the Slice approach is that a List<Slice> can represent any arbitrarily
nested structure you can think of, and yet the Slices are still autonomous. In the very long
term I could imagine a Memtable being implemented as a SortedMap<ColumnKey,Slice>, where
mutations are resolved into existing Slices, and then each is atomically swapped in order.

> 3. Implement new disk format, read + write, but no compaction yet.
I'm not sure how this is supposed to work: is the idea that we would break backwards compatibility
in trunk, and then restore it later on in your steps 4,5,6?
  
> Make the reading half of compactions memory-efficient
> -----------------------------------------------------
>
>                 Key: CASSANDRA-847
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-847
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Stu Hood
>            Priority: Critical
>             Fix For: 0.7
>
>         Attachments: 0001-Add-structures-that-were-important-to-the-SSTableSca.patch,
>                      0002-Implement-most-of-the-new-SSTableScanner-interface.patch,
>                      0003-Rename-RowIndexedReader-specific-test.patch,
>                      0004-Improve-Scanner-tests-and-separate-SuperCF-handling-.patch,
>                      0005-Add-Scanner-interface-and-a-Filtered-implementation-.patch,
>                      0006-Add-support-for-compaction-of-super-CFs-and-some-tes.patch,
>                      0007-Remove-ColumnKey-bloom-filter-maintenance.patch,
>                      0008-Make-Scanner-extend-Iterator-again.patch,
>                      0009-Make-CompactionIterator-a-ReducingIterator-subclass-.patch,
>                      0010-Alternative-to-ReducingIterator-that-can-return-mult.patch,
>                      compaction-bench-847.txt, compaction-bench-trunk.txt, compaction-bench.patch
>
>
> This issue is the next on the road to finally fixing CASSANDRA-16. To make compactions
> memory-efficient, we have to be able to perform the compaction process on the smallest possible
> chunks that might intersect and contend with one another, meaning that we need a better abstraction
> for reading from SSTables.
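
One way to picture compacting on small chunks is a k-way merge that buffers a single entry per sorted source. The sketch below assumes plain string keys and in-memory iterators as stand-ins for SSTable readers; MergeSketch and Head are hypothetical names, and this is not the Scanner or CompactionIterator from the attached patches.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch; not the Scanner or CompactionIterator from the attached patches.
final class MergeSketch {
    /** One buffered key plus the sorted source it came from. */
    private static final class Head implements Comparable<Head> {
        final String key;
        final Iterator<String> source;

        Head(String key, Iterator<String> source) {
            this.key = key;
            this.source = source;
        }

        @Override
        public int compareTo(Head other) {
            return key.compareTo(other.key);
        }
    }

    /** Merges sorted sources, collapsing equal keys, buffering one key per source. */
    static List<String> merge(List<Iterator<String>> sources) {
        PriorityQueue<Head> heap = new PriorityQueue<>();
        for (Iterator<String> it : sources) {
            if (it.hasNext()) {
                heap.add(new Head(it.next(), it));
            }
        }
        // Collected in a list only for illustration; a real compaction would stream out.
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Head head = heap.poll();
            // Keys present in several sources are resolved into a single output entry.
            if (out.isEmpty() || !out.get(out.size() - 1).equals(head.key)) {
                out.add(head.key);
            }
            if (head.source.hasNext()) {
                heap.add(new Head(head.source.next(), head.source));
            }
        }
        return out;
    }
}
```

The heap never holds more than one entry per source, so the read side's memory use in this sketch grows with the number of inputs rather than with the size of any row.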

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

