incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: A few questions on row caching and read consistency ONE
Date Thu, 18 Aug 2011 11:23:22 GMT
> Q. If we're using read consistency ONE does the read request get sent to all nodes in
the replica set and the first to reply is returned (i.e. all replica nodes will then have
that row in their cache), OR does the request only get sent to a single node in the replica
set? If it's the latter would the same node generally be used for all requests to the same
key or would it always be a random node in the replica set? (i.e. if we have multiple reads
for one key in quick succession would this entail potentially multiple disk lookups until
all nodes in the set have been hit?). 
At CL ONE, all replica nodes will be involved in the request if Read Repair is active for the
request (this is true for all CLs). If RR is not active for the request, only 1 node will be involved.
See read_repair_chance in the yaml file. 
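
As a rough illustration of the point above, the sketch below estimates how many replicas a
CL ONE read touches on average. It is a simplified model, not Cassandra code: the function
name and the assumption that a read hits either every replica or exactly one are illustrative.

```python
def expected_replicas_contacted(replication_factor, read_repair_chance):
    """Average number of replicas involved in one CL ONE read.

    Simplified model (an assumption, not Cassandra's implementation):
    with probability read_repair_chance the read involves every replica,
    otherwise it involves a single replica.
    """
    if not 0.0 <= read_repair_chance <= 1.0:
        raise ValueError("read_repair_chance must be between 0 and 1")
    return read_repair_chance * replication_factor + (1.0 - read_repair_chance)


if __name__ == "__main__":
    # Compare no read repair, occasional read repair, and read repair on every read.
    for chance in (0.0, 0.1, 1.0):
        print(chance, expected_replicas_contacted(3, chance))
```

Under this model, a lower read_repair_chance means fewer replicas end up caching the same row,
which leaves more of the cluster's total row cache for distinct rows.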

Under the simple placement strategy the node used will be the first node in the replica set,
unless the proximity of nodes has been modified by the dynamic snitch based on recent latency.
See the badness_threshold in the yaml file for info on how to stick requests to a node to improve
cache utilization. 
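
The idea of sticking requests to one node can be sketched as follows. This is a hypothetical,
simplified model and not the dynamic snitch's actual code: pick_replica, the latency scores and
the comparison rule are assumptions chosen to show the effect of a badness threshold.

```python
def pick_replica(replicas, latency_scores, badness_threshold):
    """Choose which replica serves a read.

    replicas: node names in static order; the first one is the "pinned" node.
    latency_scores: dict of node name -> recent latency score (lower is better).
    badness_threshold: fraction by which the pinned node may be worse than
    the fastest node before reads move away from it (assumed model).
    """
    pinned = replicas[0]
    fastest = min(replicas, key=lambda node: latency_scores[node])
    # Stay on the pinned node while it is within the allowed margin.
    if latency_scores[pinned] <= latency_scores[fastest] * (1.0 + badness_threshold):
        return pinned
    return fastest


if __name__ == "__main__":
    nodes = ["a", "b", "c"]
    # "a" is 10% slower than "b": it stays pinned with a 20% threshold.
    print(pick_replica(nodes, {"a": 1.1, "b": 1.0, "c": 2.0}, 0.2))
    # "a" is 50% slower than "b": reads move to "b".
    print(pick_replica(nodes, {"a": 1.5, "b": 1.0, "c": 2.0}, 0.2))
```

Keeping reads for a key on the same node means only that node needs the row in its cache.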

> Q. Related to the above, if only one node receives the request would the client (hector
in this case) know which node to send the request to directly or would there be potentially
one extra network hop involved (client -> random node -> node with key).
It's possible, by adding "fat client" nodes to the cluster which do not participate in storage
but can work out where things are. I would try several other optimizations before this one
though. 

> 
> Q. Is it possible to do a warm cache load of the most recently accessed keys on node
startup or would we have to do this with a client app?
See the cache save period settings for a CF described in the help for create column family
in the CLI. 

> Q. With write consistency ANY is it correct that following a write request all nodes
in the replica set will end up with that row in their cache, as well as on disk, once they
receive the write? i.e. total cache size is (cache_memory_per_node * num_nodes) / num_replicas.
ANY means that if all of the natural replicas for a row are unavailable, the coordinator node
can store the row itself with hints to send it on. When using ANY you will not know if the
row ended up in one of the natural endpoints or the coordinator. I'd stay away until you know
you can function with a low level of consistency. 
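
To make the ANY behaviour concrete, here is a toy model of where a row lands. It is only a
sketch under stated assumptions: write_at_any and its return shape are hypothetical, and it
ignores details such as hints stored for individual replicas that are down.

```python
def write_at_any(natural_replicas, alive_nodes, coordinator):
    """Toy model of a write at consistency level ANY.

    Returns where the row was stored and whether it was stored only as a
    hint. In both cases the client just sees a successful write, so it
    cannot tell the two outcomes apart.
    """
    alive_replicas = [node for node in natural_replicas if node in alive_nodes]
    if alive_replicas:
        # At least one natural replica accepted the write.
        return {"stored_on": alive_replicas, "hinted": False}
    # No natural replica is reachable: the coordinator keeps the row with hints.
    return {"stored_on": [coordinator], "hinted": True}


if __name__ == "__main__":
    replicas = ["n1", "n2", "n3"]
    print(write_at_any(replicas, {"n2", "n4"}, "n4"))  # one natural replica is up
    print(write_at_any(replicas, {"n4"}, "n4"))        # all natural replicas are down
```

In the second case the row is not readable from its natural replicas until the hint is delivered.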

> Q. If the cluster only has a single column family, random partitioning and no secondary
indexes, is there a good metric for estimating how much heap space we would need to leave
aside for everything that isn't the row-cache? Would it be proportional to the row-cache size
or fairly constant?
You can use the old skool pre 0.8 calculations... http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/

FYI Netflix run big (96G ?) memory machines and use a custom cache provider to store the rows
in a node local memcache. This avoids the problems with GC'ing a very big heap. There is
also a pre-built native memory row cache provider that stores data off the JVM heap, see row_cache_provider
in the CLI help for create column family.

See the talk from Adrian Cockcroft here http://www.datastax.com/events/cassandrasf2011/presentations
(you may need to watch the video for the part about using memcache). 

If you are sensitive to read latency, also take care with the data model to reduce row fragmentation
across SSTables. (Not an issue for a full row cache.)

Hope that helps. If you can provide some numbers on your scale and latency requirements it
would be handy. 

Cheers

 

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 18/08/2011, at 9:01 PM, Stephen Henderson wrote:

> Hi,
> 
> We're currently in the planning stage of a new project which needs a low latency, persistent
key/value store with a roughly 60:40 read/write split. We're trying to establish if Cassandra
is a good fit for this and in particular what the hardware requirements would be to have the
majority of rows cached in memory (other nosql platforms like Couchbase/Membase seem like
a more natural fit but we're already reasonably familiar with cassandra and would rather stick
with what we know if it can work). 
> 
> If anyone could help answer/clarify the following questions it would be a great help
(all assume that row-caching is enabled for the column family).
> 
> Q. If we're using read consistency ONE does the read request get sent to all nodes in
the replica set and the first to reply is returned (i.e. all replica nodes will then have
that row in their cache), OR does the request only get sent to a single node in the replica
set? If it's the latter would the same node generally be used for all requests to the same
key or would it always be a random node in the replica set? (i.e. if we have multiple reads
for one key in quick succession would this entail potentially multiple disk lookups until
all nodes in the set have been hit?). 
> 
> Q. Related to the above, if only one node receives the request would the client (hector
in this case) know which node to send the request to directly or would there be potentially
one extra network hop involved (client -> random node -> node with key).
> 
> Q. Is it possible to do a warm cache load of the most recently accessed keys on node
startup or would we have to do this with a client app?
> 
> Q. With write consistency ANY is it correct that following a write request all nodes
in the replica set will end up with that row in their cache, as well as on disk, once they
receive the write? i.e. total cache size is (cache_memory_per_node * num_nodes) / num_replicas.
> 
> Q. If the cluster only has a single column family, random partitioning and no secondary
indexes, is there a good metric for estimating how much heap space we would need to leave
aside for everything that isn't the row-cache? Would it be proportional to the row-cache size
or fairly constant?
> 
> 
> Thanks,
> Stephen
> 
> 
> Stephen Henderson - Lead Developer (Onsite), Cognitive Match
> stephen.henderson@cognitivematch.com | http://www.cognitivematch.com
> 

