cassandra-user mailing list archives

From Robert Coli <>
Subject Re: map reduce for Cassandra
Date Tue, 22 Jul 2014 01:09:24 GMT
On Mon, Jul 21, 2014 at 5:45 PM, Marcelo Elias Del Valle <> wrote:

> Although several sstables (disk fragments) may have the same row key,
> inside a single sstable row keys and column keys are indexed, right?
> Otherwise, doing a GET in Cassandra would take some time.
> From the M/R perspective, I was referring to the mem table, as I am trying
> to compare the time to insert in Cassandra against the time of sorting in
> hadoop.

I was confused because, unless you are using the new "in-memory"
columnfamilies, which I believe are only available in DSE, there is no way
to ensure that any given row stays in a memtable. It is very rare to take a
view of the memtable that cares only about its own properties and not the
closely related properties of SSTables. Yours is one of those views, though,
and I now see why your question makes sense: you only care about how quickly
the memtable sorts.
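
To make the memtable/SSTable relationship concrete, here is a toy sketch in
Python. It is not Cassandra's implementation; the class name, the
flush_threshold parameter, and the data layout are all hypothetical and only
illustrate the idea that writes are kept sorted in memory and are then
flushed to immutable sorted runs, so a row cannot be kept in the memtable.

```python
import bisect


class ToyMemtable:
    """Toy model only: sorted in-memory writes flushed to immutable runs."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical size limit
        self.keys = []      # row keys, kept in sorted order
        self.rows = {}      # row key -> latest value in memory
        self.sstables = []  # flushed runs, each an immutable sorted tuple

    def put(self, key, value):
        # Insert the key at its sorted position the first time it is seen.
        if key not in self.rows:
            bisect.insort(self.keys, key)
        self.rows[key] = value
        if len(self.keys) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the sorted contents out as an immutable run and start over.
        run = tuple((k, self.rows[k]) for k in self.keys)
        self.sstables.append(run)
        self.keys, self.rows = [], {}

    def get(self, key):
        # Check memory first, then the flushed runs from newest to oldest.
        if key in self.rows:
            return self.rows[key]
        for run in reversed(self.sstables):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

In this sketch the same row key can end up in several flushed runs as well as
in memory, which is why a read has to consult more than just the memtable.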

But if you are only relying on memtables to sort writes, that seems like a
pretty heavyweight reason to use Cassandra?

I'm certainly not an expert in this area of Cassandra... but Cassandra, as
a datastore with immutable data files, is not typically a good choice for
short lived intermediate result sets... are you planning to use DSE?

