incubator-cassandra-user mailing list archives

From Marcelo Elias Del Valle <mvall...@gmail.com>
Subject Re: Is Cassandra right for me?
Date Tue, 18 Sep 2012 16:50:54 GMT
You're talking about this project, right?
https://github.com/deanhiller/playorm
I will take a look. However, I don't think using Cassandra's model itself
(with CFs / key-values) would be a problem; I just need to know where the
advantage lies. From your answer, my guess is that it lies in better
performance and more control.

I also saw that if I plan to use DataStax Enterprise to get real-time
analytics, my data would need to be stored in Cassandra's usual format. It
would be harder for me to use playOrm if I am planning to use advanced
DataStax features, like Solr indexing data in Cassandra in real time without
copying columns, wouldn't it? I don't know much about this Solr feature yet,
but my understanding today is that it wouldn't be aware of the tables I create
with playOrm, just of the column families this framework uses to store the
data, right?




2012/9/18 Hiller, Dean <Dean.Hiller@nrel.gov>

> Until Aaron replies, here are my thoughts on the relational piece…
>
>            If everything in my model fits into a relational database, if
> my data is structured, would it still be a good idea to use Cassandra? Why?
>
> The playOrm project explores exactly this issue. A query on 1,000,000 rows
> in a single partition took only 60ms, AND you can do joins with its S-SQL
> language.  The answer is a resounding YES, you can put relational data in
> Cassandra.  The writes are way faster than a DBMS, and joins and SQL can be
> just as fast, and in many cases FASTER, on NoSQL IF you partition your data
> properly.  An S-SQL statement in playOrm looks like this:
>
> PARTITIONS t(:partitionId) SELECT t FROM Trades as t where t.numShares > 10
>
> You can have as many partitions as you want, and a single partition can
> have millions of rows, though I would probably not exceed 10 million.
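
To make the partitioning idea above concrete, here is a minimal sketch in plain Python. It is not playOrm's API and it does not parse S-SQL; the `Trade` class, the `build_partitions` and `query_partition` helpers, and the account-style partition keys are all hypothetical names chosen for illustration. It only shows that a query scoped to one partition reads that partition's rows and nothing else. A real store would presumably use an index inside the partition rather than the linear scan used here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trade:
    trade_id: int
    partition_id: str  # hypothetical partition key, e.g. an account
    num_shares: int


def build_partitions(trades):
    """Group rows by partition key so each partition can be scanned alone."""
    partitions = defaultdict(list)
    for trade in trades:
        partitions[trade.partition_id].append(trade)
    return partitions


def query_partition(partitions, partition_id, min_shares):
    """Rough analogue of the S-SQL statement quoted above, restricted to
    a single partition: keep rows whose num_shares exceeds min_shares."""
    rows = partitions.get(partition_id, [])  # only this partition is read
    return [t for t in rows if t.num_shares > min_shares]


trades = [
    Trade(1, "acct-a", 5),
    Trade(2, "acct-a", 50),
    Trade(3, "acct-b", 500),
]
trades_by_partition = build_partitions(trades)
result = query_partition(trades_by_partition, "acct-a", 10)
print([t.trade_id for t in result])
```

The cost of the query depends on the size of the chosen partition, not on the total number of rows, which is why the choice of partition key matters so much.
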
>
> Later,
> Dean
>
> 2012/9/18 aaron morton <aaron@thelastpickle.com>
> Also, I saw a presentation which said that if I don't have rows with more
> than a hundred rows in Cassandra, either I am doing something wrong or I
> shouldn't be using Cassandra.
> I do not agree with that statement. (I read that as rows with more than a
> hundred _columns_)
>
>
>  *   I need to support a high volume of writes per second. I might have a
> billion writes per hour
>
> That's about 280K/sec. Netflix did a benchmark that shows 1.1M/sec:
> http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
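
The conversion behind that figure can be checked in a few lines. This is only a back-of-the-envelope sketch; the variable names are illustrative, and the 1.1M/sec value is the benchmark number quoted above, measured on Netflix's cluster rather than on any particular setup of mine.

```python
# One billion writes per hour, converted to writes per second.
writes_per_hour = 1_000_000_000
seconds_per_hour = 60 * 60

writes_per_second = writes_per_hour / seconds_per_hour

# Benchmark throughput quoted in the thread (writes per second).
benchmark_writes_per_second = 1_100_000

# How many times the required rate fits into the benchmarked rate.
headroom = benchmark_writes_per_second / writes_per_second

print(round(writes_per_second))
print(round(headroom, 2))
```

So the required rate is a bit under 280K writes per second, roughly a quarter of the benchmarked throughput.
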
>
>
>  *   I need to write non-structured data that will be processed later by
> hadoop processes to generate structured data from it. Later, I index the
> structured data using SOLR or SOLANDRA, so the data can be consulted by my
> end user application. Is Cassandra recommended for that, or should I be
> thinking of writing directly to HDFS files, for instance? What's the main
> advantage I get from storing data in a nosql service like Cassandra, when
> compared to storing files in HDFS?
>
> You can query your data using Hadoop easily enough. You may want to take a
> look at DSE from http://datastax.com/; it makes using Hadoop and Solr
> with Cassandra easier.
>
>
>  *   If I don't need to perform complicated queries in Cassandra, should I
> store the json-like data just as a column value? I am afraid of doing
> something wrong here, as I would just need to store the json file and 5 or
> 6 more fields to query the files later.
>
> Store the data in the way that best supports the read queries you want to
> make. If you always read all the fields, or it's a canonical record of
> events, storing as JSON may be best. If you often get a few fields, and
> maybe they are updated, storing each field as a column value may be best.
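
The two layouts described above can be sketched with plain Python dictionaries standing in for rows. This is an illustration only, not a Cassandra client call; the `event` record, its field names, and the helper functions are hypothetical.

```python
import json

# Hypothetical record: a JSON document plus a few fields used for lookups.
event = {"id": "evt-1", "source": "web", "payload": {"clicks": 3, "tags": ["a", "b"]}}

# Layout A: a single column holding the whole document as a JSON string.
row_blob = {"body": json.dumps(event)}

# Layout B: one column per top-level field (nested values kept as JSON text).
row_columns = {
    name: json.dumps(value) if isinstance(value, (dict, list)) else value
    for name, value in event.items()
}


def read_field_from_blob(row, field):
    """Reading one field means fetching and parsing the whole document."""
    return json.loads(row["body"])[field]


def read_field_from_columns(row, field):
    """Reading one field touches only that column."""
    return row[field]


def update_field_in_blob(row, field, value):
    """Updating one field means rewriting the whole document."""
    document = json.loads(row["body"])
    document[field] = value
    return {"body": json.dumps(document)}


print(read_field_from_blob(row_blob, "source"))
print(read_field_from_columns(row_columns, "source"))
```

Layout A suits records that are always read whole and never change; layout B suits reads and updates that touch a few fields at a time. A mix of the two, with the JSON document in one column and the handful of query fields in their own columns, is also possible.
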
>
>
>  *   Does it make sense to you to use hadoop to process data from
> Cassandra and store the results in a database, like HBase? Once I have
> structured data, is there any reason I should use Cassandra instead of
> HBase?
>
> It depends on how many moving parts you are comfortable with. Same for the
> questions about HDFS etc. Start with the smallest amount of infrastructure.
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 18/09/2012, at 10:28 AM, Marcelo Elias Del Valle <mvallebr@gmail.com>
> wrote:
>
> Hello,
>
>      I am new to Cassandra and I am not sure whether Cassandra is the right
> technology to use in the architecture I am defining. Also, I saw a
> presentation which said that if I don't have rows with more than a hundred
> rows in Cassandra, either I am doing something wrong or I shouldn't be
> using Cassandra. Therefore, it might be the case that I am doing something
> wrong. If you could help me find the answers to these questions by
> giving any feedback, it would be highly appreciated.
>      Here is my need and what I am thinking of using Cassandra for:
>
>  *   I need to support a high volume of writes per second. I might have a
> billion writes per hour
>  *   I need to write non-structured data that will be processed later by
> hadoop processes to generate structured data from it. Later, I index the
> structured data using SOLR or SOLANDRA, so the data can be consulted by my
> end user application. Is Cassandra recommended for that, or should I be
> thinking of writing directly to HDFS files, for instance? What's the main
> advantage I get from storing data in a nosql service like Cassandra, when
> compared to storing files in HDFS?
>  *   Usually I will write json data associated with an ID and my hadoop
> processes will process this data to write data to a database. I have two
> questions here:
>     *   If I don't need to perform complicated queries in Cassandra,
> should I store the json-like data just as a column value? I am afraid of
> doing something wrong here, as I would just need to store the json file and
> 5 or 6 more fields to query the files later.
>     *   Does it make sense to you to use hadoop to process data from
> Cassandra and store the results in a database, like HBase? Once I have
> structured data, is there any reason I should use Cassandra instead of
> HBase?
>
>      I am sorry if the questions are too basic. I have been watching a lot
> of videos and reading a lot of documentation about Cassandra, but honestly,
> the more I read, the more questions I have.
>
> Thanks in advance.
>
> Best regards,
> --
> Marcelo Elias Del Valle
> http://mvalle.com - @mvallebr
>



-- 
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
