cassandra-user mailing list archives

From "Takayuki Tsunakawa" <>
Subject Re: [Q] MapReduce behavior and Cassandra's scalability for petabytes of data
Date Tue, 26 Oct 2010 00:49:25 GMT
Hello, Jonathan,

From: "Jonathan Ellis" <>
> There is no reason Cassandra cannot scale to 1000s or more nodes
> with the current architecture.

Oh, really? I got the impression that gossip exchanges limit the
number of nodes in a cluster when I read the Dynamo paper and
"Cassandra - A Decentralized Structured Storage System" written by
Avinash Lakshman. As I quoted in my first mail, Amazon says that
Dynamo is designed to scale to a couple of hundred nodes, not
thousands. In addition, the previously mentioned paper on Cassandra
states the following (though this does not directly say that Cassandra
does not scale to more than a thousand nodes):

"Cassandra aims to run on top of an infrastructure of hundreds of
nodes (possibly spread across different data centers)."

Thank you so much for taking your precious time for me. I would
appreciate it if you could share your thoughts on any technical
challenges you can think of that could cause difficulties in a cluster
with petabytes of data and thousands of nodes.

Takayuki Tsunakawa
