cassandra-user mailing list archives

From Elliott Sims <elli...@backblaze.com>
Subject Re: Mongo DB vs Cassandra
Date Sat, 02 Jun 2018 00:38:39 GMT
I'd say for a large write-heavy workload like this, Cassandra is a pretty clear
winner over MongoDB.  I agree with the commenters about understanding your
query patterns a bit better before choosing, though.  Cassandra's queries
are a bit limited, and if you're loading all new data every day and
discarding the old you might run into some significant tombstone issues.
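To make the tombstone concern concrete: one commonly used mitigation is to include a time bucket in the partition key, so that a day's data can later be removed with partition-level deletes rather than one delete per row. The sketch below only shows how such a key might be derived. The names `day_bucket`, `partition_key`, and `machine_id` are hypothetical, and the bucketing scheme is an assumption for illustration, not something proposed in this thread.

```python
from datetime import datetime, timezone
from typing import Tuple


def day_bucket(ts: datetime) -> str:
    """Return a UTC day bucket such as '20180602' for a timezone-aware timestamp."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).strftime("%Y%m%d")


def partition_key(machine_id: str, ts: datetime) -> Tuple[str, str]:
    """Hypothetical composite partition key: (machine, day bucket)."""
    return (machine_id, day_bucket(ts))
```

With a key like this, all rows a machine produces on a given day share one partition. Whether that partition size is reasonable depends on the per-machine row volume, which the thread does not specify.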

It's worth looking into various other storage systems depending on your
exact needs, like S3, B2 (OK, I'm biased there), or possibly Spark or
Hadoop.  Cassandra's phenomenal at scaling to large write workloads, but
the data and query model isn't well-suited to all applications. It can also
be a bit... administration-intensive, though the same can be said about
MongoDB and Hadoop.
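For a sense of scale, the file count and size range quoted further down in this thread can be turned into rough rates. This is back-of-the-envelope arithmetic only: it assumes an even spread over 24 hours and decimal units (1 KB = 1000 bytes), neither of which is stated by the original poster.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the thread.
FILES_PER_DAY = 13_889_660_134
MIN_FILE_BYTES = 20 * 1000          # 20 KB (decimal units assumed)
MAX_FILE_BYTES = 600 * 1000 ** 2    # 600 MB (decimal units assumed)


def files_per_second(files_per_day: int) -> float:
    """Average ingest rate, assuming files arrive evenly across the day."""
    return files_per_day / SECONDS_PER_DAY


def daily_bytes(files_per_day: int, file_bytes: int) -> int:
    """Total bytes per day if every file had the given size."""
    return files_per_day * file_bytes


if __name__ == "__main__":
    print(f"files/sec: {files_per_second(FILES_PER_DAY):,.0f}")
    print(f"TB/day at minimum size: {daily_bytes(FILES_PER_DAY, MIN_FILE_BYTES) / 1e12:,.1f}")
    print(f"TB/day at maximum size: {daily_bytes(FILES_PER_DAY, MAX_FILE_BYTES) / 1e12:,.1f}")
```

Even at the minimum file size this works out to hundreds of terabytes per day, so it is worth double-checking the quoted file count before sizing anything.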

On Thu, May 31, 2018 at 11:17 AM, Joseph Arriola <jcarriolaa@gmail.com>
wrote:

> Based on the metrics you give, I think the big data architecture could be
> Cassandra with Spark. You mention high availability. The APIs could use
> Node.js. This combination is powerful; the challenge is in the data model.
>
> On the other hand, if you are willing to sacrifice high availability and
> accept slower response times, MongoDB can be easier to implement.
>
>
>
> On Thu, May 31, 2018 at 10:01 AM, Sudhakar Ganesan <
> sudhakar.ganesan@flex.com.invalid> wrote:
>
>> At a high level, machines on the production line will provide data as CSV
>> files at intervals ranging from every second to every minute to once a day
>> (depending on the machine type used in the line operations). I need to parse
>> those files, load them into a DB, and build an API layer to expose the data
>> to downstream systems.
>>
>>
>>
>> *Number of files to be processed: 13,889,660,134 per day*
>>
>> *Each file could range from 20 KB to 600 MB, which translates into a few
>> hundred rows to millions of rows.*
>>
>> *High availability with high write volume. Reads are fewer than writes.*
>>
>> *While extracting the rows, a few validations are to be performed.*
>>
>> *Build an API layer on top of the data to be persisted in the DB.*
>>
>>
>>
>> Now, tell me what would be the best choice…
>>
>>
>>
>> *From:* Russell Bateman [mailto:russ@windofkeltia.com]
>> *Sent:* Thursday, May 31, 2018 7:36 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Mongo DB vs Cassandra
>>
>>
>>
>> Sudhakar,
>>
>> MongoDB will accommodate loading CSV without regard to schema while still
>> creating identifiable "columns" in the database, but you'll have to predict
>> or back-impose some schema later if you're going to create indices for fast
>> searching of the data. You can perform searching of data without indexing
>> in MongoDB, but it's slower.
>>
>> Cassandra will require you to understand the schema, i.e., what the
>> columns are, up front, unless you're just going to store the data without
>> schema and, therefore, without the ability to search effectively.
>>
>> As suggested already, you should share more detail if you want good
>> advice. Both DBs are excellent. Both do different things in different ways.
>>
>> Hope this helps,
>> Russ
>>
>> On 05/31/2018 05:49 AM, Sudhakar Ganesan wrote:
>>
>> Team,
>>
>>
>>
>> I need to make a decision on MongoDB vs Cassandra for loading CSV file
>> data and storing the CSV files as well. If any of you have done such a study
>> in the last couple of months, please share your analysis or observations.
>>
>>
>>
>> Regards,
>>
>> Sudhakar
>>
>>
>>
>>
>
