cassandra-user mailing list archives

From Goutham reddy <goutham.chiru...@gmail.com>
Subject Re: [EXTERNAL] Re: Good way of configuring Apache spark with Apache Cassandra
Date Wed, 09 Jan 2019 16:28:55 GMT
Thanks Sean. But what if I want to have both Spark and Elasticsearch with
Cassandra, each as a separate data center? Does that cause any overhead?

On Wed, Jan 9, 2019 at 7:28 AM Durity, Sean R <SEAN_R_DURITY@homedepot.com>
wrote:

> I think you could consider option C: Create a (new) analytics DC in
> Cassandra and run your Spark nodes there. Then you can address the scaling
> just on that DC. You can also use fewer vnodes, replicate only certain
> keyspaces, etc. in order to perform the analytics more efficiently.
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* Dor Laor <dor@scylladb.com>
> *Sent:* Friday, January 04, 2019 4:21 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Re: Good way of configuring Apache spark with
> Apache Cassandra
>
>
>
> I strongly recommend option B, separate clusters. Reasons:
>
>  - Networking of node-node is negligible compared to networking within the
> node
>
>  - Different scaling considerations
>
>    Your workload may require 10 Spark nodes and 20 database nodes, so why
> bundle them?
>
>    This ratio may also change over time as your application evolves and
> amount of data changes.
>
>  - Isolation - If Spark has a spike in CPU/IO utilization, you wouldn't
> want it to affect Cassandra, or vice versa.
>
>    If you isolate it with cgroups, you may have too much idle time when
> the above doesn't happen.
>
>
>
>
>
> On Fri, Jan 4, 2019 at 12:47 PM Goutham reddy <goutham.chirutha@gmail.com>
> wrote:
>
> Hi,
>
> We have a requirement for heavy data lifting and analytics and have
> decided to go with Apache Spark. In the process we have come up with two
> patterns:
>
> a. Apache Spark and Apache Cassandra co-located on the same nodes.
>
> b. Apache Spark on one independent cluster and Apache Cassandra on
> another independent cluster.
>
>
>
> We need a good pattern for using the analytics engine with Cassandra.
> Thanks in advance.
>
>
>
> Regards
>
> Goutham.
>
>
>
-- 
Regards
Goutham Reddy
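
For reference, the "replicate only certain keyspaces" part of option C quoted
above can be sketched as a per-keyspace replication change. This is only a
minimal illustration, not something taken from the thread: the keyspace name
"sales", the data center names "operational" and "analytics", the replication
factors, and the helper alter_keyspace_cql are all hypothetical. It just
builds the CQL text; it does not connect to a cluster.

```python
def alter_keyspace_cql(keyspace, dc_replication):
    """Build an ALTER KEYSPACE statement using NetworkTopologyStrategy.

    dc_replication maps a data center name to its replication factor.
    A keyspace is only replicated to the data centers listed here, so
    leaving the analytics DC out keeps that keyspace off the Spark nodes.
    """
    if not dc_replication:
        raise ValueError("at least one data center is required")
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {int(rf)}" for dc, rf in dc_replication.items()]
    options = ", ".join(parts)
    return f"ALTER KEYSPACE {keyspace} WITH replication = {{{options}}};"


# Hypothetical names: replicate "sales" to both DCs so Spark can read it
# from the analytics DC without touching the operational nodes.
statement = alter_keyspace_cql("sales", {"operational": 3, "analytics": 2})
print(statement)
```

Existing data is not copied to the new data center by the ALTER statement
alone; it still has to be streamed there afterwards (for example with
nodetool rebuild on the new nodes).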
