cassandra-user mailing list archives

From Marcelo Elias Del Valle
Subject Re: Is Cassandra right for me?
Date Tue, 18 Sep 2012 13:52:45 GMT
I will have just 6 columns in my CF, but I will have about a billion writes
per hour. In that case, I think Cassandra applies, from what you are saying.
This answer helped a lot too, thanks!
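
For scale, the write volume mentioned above can be converted to a per-second rate. This is only a back-of-the-envelope sketch: the node counts are hypothetical, and the per-node figure assumes perfectly even distribution and ignores replication, so real per-node load would be higher.

```python
WRITES_PER_HOUR = 1_000_000_000  # figure quoted in the message above
SECONDS_PER_HOUR = 3600


def writes_per_second(writes_per_hour: int) -> float:
    """Average write rate per second for a given hourly volume."""
    return writes_per_hour / SECONDS_PER_HOUR


def per_node_rate(writes_per_hour: int, nodes: int) -> float:
    """Average per-node rate, assuming even distribution and no replication."""
    return writes_per_second(writes_per_hour) / nodes


if __name__ == "__main__":
    print(f"cluster-wide: {writes_per_second(WRITES_PER_HOUR):,.0f} writes/s")
    # Hypothetical cluster sizes, for illustration only.
    for nodes in (10, 50, 100):
        rate = per_node_rate(WRITES_PER_HOUR, nodes)
        print(f"{nodes:>3} nodes: {rate:,.0f} writes/s per node")
```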

2012/9/18 Hiller, Dean

> I wanted to clarify where that statement on wide rows comes from.
> Some people claim that if you don't have 1000's of
> columns in "some" rows in Cassandra you are doing something wrong.  This is
> not true, BUT it comes from the fact that people are setting up indexes.
>  This is what leads to the very wide row effect.  playOrm is one such
> library using wide rows like this BUT it is NOT necessary for all
> applications.
> You can easily use map/reduce on a Cassandra cluster.  If you make a
> mistake and don't get your model right the first time, you can also
> map/reduce your dataset into a new model.  This wide row effect is used
> for indexing 80% of the time.  I draw on playOrm examples a lot: one table
> may be partitioned by time so that each month of data is in a partition,
> and you can then have indexes on each partition, allowing you to do quick
> queries into partitions.
> Later,
> Dean
> From: Marcelo Elias Del Valle
> Date: Monday, September 17, 2012 4:28 PM
> Subject: Is Cassandra right for me?
> Hello,
>      I am new to Cassandra and I am not sure whether Cassandra is the right
> technology to use in the architecture I am defining. Also, I saw a
> presentation which said that if I don't have rows with more than a hundred
> columns in Cassandra, either I am doing something wrong or I shouldn't be
> using Cassandra. Therefore, it might be the case that I am doing something
> wrong. If you could help me find the answers to these questions by
> giving any feedback, it would be highly appreciated.
>      Here is my need and what I am thinking of using Cassandra for:
>  *   I need to support a high volume of writes per second. I might have a
> billion writes per hour
>  *   I need to write unstructured data that will be processed later by
> Hadoop processes to generate structured data from it. Later, I index the
> structured data using SOLR or SOLANDRA, so the data can be queried by my
> end user application. Is Cassandra recommended for that, or should I be
> thinking of writing directly to HDFS files, for instance? What's the main
> advantage I get from storing data in a NoSQL service like Cassandra,
> compared to storing files in HDFS?
>  *   Usually I will write JSON data associated with an ID and my Hadoop
> processes will process this data to write data to a database. I have two
> questions here:
>     *   If I don't need to perform complicated queries in Cassandra,
> should I store the JSON-like data just as a column value? I am afraid of
> doing something wrong here, as I would just need to store the JSON file and
> 5 or 6 more fields to query the files later.
>     *   Does it make sense to you to use hadoop to process data from
> Cassandra and store the results in a database, like HBase? Once I have
> structured data, is there any reason I should use Cassandra instead of
> HBase?
>      I am sorry if the questions are too basic. I have been watching a lot
> of videos and reading a lot of documentation about Cassandra, but honestly,
> the more I read, the more questions I have.
> Thanks in advance.
> Best regards,
> --
> Marcelo Elias Del Valle
> - @mvallebr

Marcelo Elias Del Valle - @mvallebr
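
The layout discussed in this thread (a JSON document stored as a single column value next to a handful of queryable fields, with data partitioned by month and an index kept per partition) can be sketched in plain Python. This is an in-memory illustration of the idea only, not playOrm's or Cassandra's actual API; `MonthPartitionedStore`, `partition_key`, and the field names are hypothetical.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(ts: datetime) -> str:
    """Map a timestamp to its month partition, e.g. '2012-09'."""
    return ts.strftime("%Y-%m")


class MonthPartitionedStore:
    """Toy model: narrow data rows plus one wide index per month partition."""

    def __init__(self):
        # row_id -> columns (the JSON blob is just one column value)
        self.rows = {}
        # partition -> (field, value) -> list of row ids (the "wide row")
        self.index = defaultdict(lambda: defaultdict(list))

    def write(self, row_id, ts, payload, **fields):
        part = partition_key(ts)
        self.rows[row_id] = {
            "ts": ts.isoformat(),
            "json": json.dumps(payload),
            **fields,
        }
        # Each indexed field adds one entry to that month's index row.
        for name, value in fields.items():
            self.index[part][(name, value)].append(row_id)

    def query(self, part, name, value):
        """Look up rows in a single partition by an indexed field."""
        ids = self.index.get(part, {}).get((name, value), [])
        return [json.loads(self.rows[i]["json"]) for i in ids]


if __name__ == "__main__":
    store = MonthPartitionedStore()
    sep = datetime(2012, 9, 17, tzinfo=timezone.utc)
    octo = datetime(2012, 10, 1, tzinfo=timezone.utc)
    store.write("a", sep, {"v": 1}, source="web")
    store.write("b", octo, {"v": 2}, source="web")
    print(store.query("2012-09", "source", "web"))
```

The data rows stay narrow (the JSON value plus a few fields), while only the per-partition index rows grow wide, which matches the point above that wide rows mostly come from indexing.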
