cassandra-user mailing list archives

From DuyHai Doan <doanduy...@gmail.com>
Subject Re: cassandra + spark / pyspark
Date Wed, 10 Sep 2014 18:26:26 GMT
Stupid question: do you really need both Storm & Spark? Can't you
implement the Storm jobs in Spark? It will be operationally simpler to
have fewer moving parts. I'm not saying that Storm is not the right fit; it
may be totally suitable for some use cases.

But if you want to avoid the SPOF thing and don't want to bring in
resource management frameworks, the Spark/Cassandra integration is an
interesting alternative.


On Wed, Sep 10, 2014 at 8:20 PM, Oleg Ruchovets <oruchovets@gmail.com>
wrote:

> Interesting things actually:
>    We have Hadoop in our ecosystem. It has a single point of failure and I
> am not sure about inter-data-center replication.
>  The plan is to use Cassandra: no single point of failure, and there is data
> center replication.
> For aggregation/transformation we would use Spark. BUT storm requires mesos,
> which has a SINGLE POINT of failure (and it will require the same maintenance
> as the secondary name node with Hadoop) :-) :-).
>
> Question: is there a way to have storage and processing without a single
> point of failure and with inter-data-center replication?
>
> Thanks
> Oleg.
>
> On Thu, Sep 11, 2014 at 2:09 AM, DuyHai Doan <doanduyhai@gmail.com> wrote:
>
>> "As far as I know, the Datastax connector uses thrift to connect Spark
>> with Cassandra although thrift is already deprecated, could someone confirm
>> this point?"
>>
>> --> the Scala connector is using the latest Java driver, so no, there is
>> no Thrift there.
>>
>>  For the Java version, I'm not sure; I have not looked into it, but I think
>> it also uses the new Java driver.
>>
>>
>> On Wed, Sep 10, 2014 at 7:27 PM, Francisco Madrid-Salvador <
>> pmadrid@stratio.com> wrote:
>>
>>> Hi Oleg,
>>>
>>> Stratio Deep is just a library you must include in your Spark deployment,
>>> so it doesn't guarantee any high availability at all. To achieve HA you
>>> must use Mesos or another third-party resource manager.
>>>
>>> Stratio doesn't currently support PySpark, just Scala and Java. Perhaps
>>> in the future...
>>>
>>> It should be ready for production use, but as always, please test it
>>> in a testing environment first ;-)
>>>
>>> As far as I know, the Datastax connector uses thrift to connect Spark
>>> with Cassandra although thrift is already deprecated, could someone confirm
>>> this point?
>>>
>>> Paco
>>>
>>
>>
>
