cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-847) Make the reading half of compactions memory-efficient
Date Mon, 29 Mar 2010 14:27:27 GMT


Jonathan Ellis commented on CASSANDRA-847:

Why do we need to change FDI first?  Treating keys as bytes shouldn't require a data format
change.  Remember we already have a method for interpreting on-disk keys differently in the
IPartitioner code.  In fact, treating it only as a format change is wrong: we need to
preserve the old ordering, not just read the data once and rewrite it in a different order,
because changing the comparison order could change which nodes the data is supposed to be on.
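
To illustrate why the ordering matters, here is a minimal sketch (not Cassandra code; the
class and method names are hypothetical). It assumes the old order is Java's String.compareTo,
which compares UTF-16 code units, and the new order is an unsigned comparison of the UTF-8
bytes. The two disagree for some keys, for example U+FF5E versus U+1F600.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class KeyOrderingSketch {
    /** Assumed "old" order: keys compared as Java Strings (UTF-16 code units). */
    static int compareAsStrings(String a, String b) {
        return a.compareTo(b);
    }

    /** Assumed "new" order: UTF-8 encodings compared as unsigned bytes (Java 9+). */
    static int compareAsBytes(String a, String b) {
        return Arrays.compareUnsigned(
                a.getBytes(StandardCharsets.UTF_8),
                b.getBytes(StandardCharsets.UTF_8));
    }

    /** True when the two orders place a and b on opposite sides of each other. */
    static boolean ordersDisagree(String a, String b) {
        return Integer.signum(compareAsStrings(a, b)) != Integer.signum(compareAsBytes(a, b));
    }
}
```

Calling ordersDisagree("\uFF5E", "\uD83D\uDE00") returns true: the first key sorts after the
second as a String (0xFF5E > 0xD83D) but before it as UTF-8 bytes (0xEF < 0xF0). In an
order-preserving cluster, a key whose position changes like this could fall into a different
node's range.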

ISTM that the easiest way to make the String -> byte[] change is to make a new OPP that
is strictly byte-oriented, update old OPP to preserve the old utf8-based ordering, add a
getFilterBytes to DecoratedKey, and have BloomFilter.add take a DK instead of a String (the
only caller of add(String) already has a DK object so no problem there).

So, DK will change to (Token, byte[]) instead of (Token, String), COPP will become (BytesToken,
byte[]), decorating from bytes to bytes w/ the different collation order, old OPP will become
(StringToken, byte[]), RP will become (BigIntToken, byte[]), and we add a new BytesOPP (BytesToken,
byte[]) where the decoration is a no-op, the way current OPP behaves.
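
A rough sketch of the shapes described above, under stated assumptions: this is not the
actual patch, and every name here is illustrative. String stands in for StringToken and
BigInteger for BigIntToken; RP is assumed to hash the key with MD5; getFilterBytes is assumed
to return the raw key; the tie-break on key bytes is also an assumption. COPP is left out
because its collation rules are not spelled out here.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical sketch; the names echo the comment but this is not Cassandra's code.
final class DecoratedKeySketch {
    /** (Token, byte[]) pair: ordering comes from the token, the key stays raw bytes. */
    static final class DecoratedKey<T extends Comparable<T>>
            implements Comparable<DecoratedKey<T>> {
        final T token;
        final byte[] key;

        DecoratedKey(T token, byte[] key) {
            this.token = token;
            this.key = key;
        }

        /** Bytes a bloom filter would hash (assumed here to be the raw key). */
        byte[] getFilterBytes() {
            return key;
        }

        @Override
        public int compareTo(DecoratedKey<T> other) {
            int c = token.compareTo(other.token);
            // Assumption: ties on the token are broken by unsigned key bytes.
            return c != 0 ? c : Arrays.compareUnsigned(key, other.key);
        }
    }

    /** Token that orders by unsigned byte comparison. */
    static final class BytesToken implements Comparable<BytesToken> {
        final byte[] bytes;

        BytesToken(byte[] bytes) {
            this.bytes = bytes;
        }

        @Override
        public int compareTo(BytesToken other) {
            return Arrays.compareUnsigned(bytes, other.bytes);
        }
    }

    /** New BytesOPP: decoration is a no-op, the token is the key bytes themselves. */
    static DecoratedKey<BytesToken> decorateBytesOpp(byte[] key) {
        return new DecoratedKey<>(new BytesToken(key), key);
    }

    /** Old OPP: token is the key decoded as a String, keeping String ordering. */
    static DecoratedKey<String> decorateOldOpp(byte[] key) {
        return new DecoratedKey<>(new String(key, StandardCharsets.UTF_8), key);
    }

    /** RP: token is a non-negative integer derived from an MD5 hash (assumed). */
    static DecoratedKey<BigInteger> decorateRandom(byte[] key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(key);
            return new DecoratedKey<>(new BigInteger(1, digest), key);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }
}
```

With this shape, a bloom filter's add could accept a DecoratedKey and call getFilterBytes
without caring which partitioner produced it, while each partitioner keeps its own
comparison order through its token type.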

> Make the reading half of compactions memory-efficient
> -----------------------------------------------------
>                 Key: CASSANDRA-847
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Stu Hood
>            Priority: Critical
>             Fix For: 0.7
>         Attachments: 0001-Add-structures-that-were-important-to-the-SSTableSca.patch,
0002-Implement-most-of-the-new-SSTableScanner-interface.patch, 0003-Rename-RowIndexedReader-specific-test.patch,
0004-Improve-Scanner-tests-and-separate-SuperCF-handling-.patch, 0005-Add-Scanner-interface-and-a-Filtered-implementation-.patch,
0006-Add-support-for-compaction-of-super-CFs-and-some-tes.patch, 0007-Remove-ColumnKey-bloom-filter-maintenance.patch,
0008-Make-Scanner-extend-Iterator-again.patch, 0009-Make-CompactionIterator-a-ReducingIterator-subclass-.patch,
0010-Alternative-to-ReducingIterator-that-can-return-mult.patch, compaction-bench-847.txt,
compaction-bench-trunk.txt, compaction-bench.patch
> This issue is the next on the road to finally fixing CASSANDRA-16. To make compactions
memory-efficient, we have to be able to perform the compaction process on the smallest possible
chunks that might intersect and contend with one another, meaning that we need a better abstraction
for reading from SSTables.

