cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-1117) Clean up MMAP support
Date Tue, 25 May 2010 20:45:33 GMT


Jonathan Ellis commented on CASSANDRA-1117:

In particular, while getting rid of the "spanned entry" index logic is a real improvement,
this patch adds complexity (and costs performance -- creating BRAFs, i.e. BufferedRandomAccessFiles,
is not free) to the non-mmap'd case by creating BRAF segments in place of mmap'd ones, rather
than treating the file as One Big BRAF.  Non-mmap'd is supposed to be our stable, fall-back
path, so I'm ambivalent about that.

It's also not clear to me whether this can handle the wide rows I'm introducing for CASSANDRA-16.
It looks like a single row has to fit within one 2GB segment here, which isn't going to be
acceptable.  (Otherwise, you have to add an if statement on each byte read to see whether you
need to skip to the next segment. I tried that approach way back when, and it is too slow.)
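
For illustration only, here is a minimal sketch of the per-byte boundary check described
above. It is not code from the patch: the class name SegmentedReader is hypothetical, and
heap ByteBuffers stand in for mmap'd segments. (A single Java ByteBuffer is indexed by int,
which is where the roughly 2GB per-segment ceiling comes from.)

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not from the CASSANDRA-1117 patch: reads bytes across
// several buffers, paying for a boundary check on every single read.
final class SegmentedReader {
    private final ByteBuffer[] segments;
    private int current = 0; // index of the segment currently being read

    SegmentedReader(ByteBuffer[] segments) {
        this.segments = segments;
    }

    // Returns the next byte as an unsigned value, or -1 once all segments are exhausted.
    int read() {
        // This is the branch that runs on each byte read: skip to the next
        // segment whenever the current one has no bytes left.
        while (current < segments.length && !segments[current].hasRemaining()) {
            current++;
        }
        if (current == segments.length) {
            return -1;
        }
        return segments[current].get() & 0xFF;
    }
}
```

With this shape a row may span segments, but every byte pays for the extra check, which is
the cost being objected to.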

Re performance, at a guess, I'd say looking up the index segment via a NavigableMap instead
of with % is the cause of most of the lost performance (in mmap mode).
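
To make the comparison concrete, here is a rough sketch of the two lookup styles. It reads
"%" as plain division/remainder arithmetic over fixed-size segments, which is an assumption;
the class SegmentLookup, its method names, and SEGMENT_SIZE are all hypothetical and not
taken from the patch or from Cassandra.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch, not from the CASSANDRA-1117 patch.
final class SegmentLookup {
    // Assumed fixed segment size for the arithmetic variant.
    static final long SEGMENT_SIZE = 1L << 20;

    // Fixed-size segments: the segment index is a single division.
    static int indexByArithmetic(long position) {
        return (int) (position / SEGMENT_SIZE);
    }

    // Fixed-size segments: the offset within the segment is a single remainder.
    static long offsetByArithmetic(long position) {
        return position % SEGMENT_SIZE;
    }

    // Arbitrary segment boundaries: map each segment's start offset to its index.
    static NavigableMap<Long, Integer> buildMap(long[] segmentStarts) {
        NavigableMap<Long, Integer> map = new TreeMap<>();
        for (int i = 0; i < segmentStarts.length; i++) {
            map.put(segmentStarts[i], i);
        }
        return map;
    }

    // Finds the segment whose start is the greatest one <= position.
    // On a TreeMap this is an O(log n) tree walk with boxed Long keys.
    static int indexByMap(NavigableMap<Long, Integer> map, long position) {
        Map.Entry<Long, Integer> entry = map.floorEntry(position);
        if (entry == null) {
            throw new IllegalArgumentException("position before first segment: " + position);
        }
        return entry.getValue();
    }
}
```

The map version supports segments of uneven size, but each lookup walks a tree and compares
boxed keys, whereas the arithmetic version is a couple of integer instructions.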

> Clean up MMAP support
> ---------------------
>                 Key: CASSANDRA-1117
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Stu Hood
>            Assignee: Gary Dusbabek
>             Fix For: 0.7
>         Attachments: 0001-Use-factory-functions-for-RowIndexedReader.patch, 0002-Add-SegmentedFile-to-abstract-opening-FileDataInputs.patch,
> 0003-Replace-mmap-file-abstraction-with-SegmentedFile.patch, 0004-Rename-SSTableReaderTest-to-SegmentedFileTest.patch,
> Awareness of MMAP is currently embedded into the SSTableReader implementation and IndexSummary.
> A good number of bugs experienced recently have been due to this lack of separation, so it
> is ripe for abstraction. Additionally, the current implementation does not provide a good
> method for iterating over the segments of a file, which is useful for range queries, and lays
> more stable groundwork for #998.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
