cassandra-user mailing list archives

From Gaspar Muñoz <gmu...@stratio.com>
Subject Re: best supported spark connector for Cassandra
Date Fri, 13 Feb 2015 14:00:15 GMT
Of course, Stratio Deep and Stratio Cassandra are both licensed under Apache 2.0.

Regarding Cassandra support, I can introduce you to someone at Stratio
who can help you.

2015-02-12 15:05 GMT+01:00 Marcelo Valle (BLOOMBERG/ LONDON) <
mvallemilita@bloomberg.net>:

> Thanks for the hint Gaspar.
> Do you know if Stratio Deep / Stratio Cassandra are also licensed Apache
> 2.0?
>
> I was interested in learning more about Stratio when I was working at a
> startup. Now, at a blue-chip company, it seems one of the hardest obstacles
> to using Cassandra in a project is the need for a team to support it, and it
> seems people are especially concerned about how many vendors offer support
> for an open source solution.
>
> This seems to be an advantage of HBase, as there are many vendors
> supporting it, but I wonder if Stratio can be considered an alternative to
> Datastax regarding Cassandra support?
>
> It's not my call to decide anything here, but as part of the community it
> helps to have this business scenario clear. I could say Cassandra is the
> best technical fit for some projects, but sometimes non-technical factors
> come into play, like this need for having more than one vendor available...
>
>
> From: gmunoz@stratio.com
> Subject: Re: best supported spark connector for Cassandra
>
> My suggestion is to use Java or Scala instead of Python. For Java/Scala,
> both the Datastax and Stratio drivers are valid and similar options. As far
> as I know, they both take care of data locality and are not based on the
> Hadoop interface. The advantage of Stratio Deep is that it allows you to
> integrate Spark not only with Cassandra but also with MongoDB,
> Elasticsearch, Aerospike and others.
> Stratio maintains a fork of Cassandra that includes some additional
> features, such as Lucene-based secondary indexes. The Stratio driver works
> fine with both Apache Cassandra and their fork.
>
> You can find some examples of using Deep here:
> https://github.com/Stratio/deep-examples  If you need any help
> with Stratio Deep, please do not hesitate to contact us.
>
>
> 2015-02-11 17:18 GMT+01:00 shahab <shahab.mokari@gmail.com>:
>
>> I am using the Calliope cassandra-spark connector (
>> http://tuplejump.github.io/calliope/), which is quite handy and easy to
>> use!
>> The only problem is that it is a bit outdated: it works with Spark 1.1.0.
>> Hopefully a new version comes soon.
>>
>> best,
>> /Shahab
>>
>> On Wed, Feb 11, 2015 at 2:51 PM, Marcelo Valle (BLOOMBERG/ LONDON) <
>> mvallemilita@bloomberg.net> wrote:
>>
>>> I just finished a Scala course, so this is a nice exercise to check what I learned :D
>>>
>>> Thanks for the answer!
>>>
>>> From: user@cassandra.apache.org
>>> Subject: Re: best supported spark connector for Cassandra
>>>
>>> Start looking at the Spark/Cassandra connector here (in Scala):
>>> https://github.com/datastax/spark-cassandra-connector/tree/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector
>>>
>>> Data locality is provided by this method:
>>> https://github.com/datastax/spark-cassandra-connector/blob/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/rdd/CassandraRDD.scala#L329-L336
>>>
>>> Start digging from this all the way down the code.
>>>
>>> As for Stratio Deep, I can't tell how they did the integration with
>>> Spark. Take some time to dig into their code to understand the logic.
>>>
>>>
>>>
>>> On Wed, Feb 11, 2015 at 2:25 PM, Marcelo Valle (BLOOMBERG/ LONDON) <
>>> mvallemilita@bloomberg.net> wrote:
>>>
>>>> Since Spark was being discussed in another thread, I decided to start a
>>>> new one, as I am interested in using Spark + Cassandra in the future.
>>>>
>>>> About 3 years ago, Spark was not yet an option and we tried to use
>>>> Hadoop to process Cassandra data. My experience was horrible and we reached
>>>> the conclusion it was faster to develop an internal tool than to insist on
>>>> Hadoop _for our specific case_.
>>>>
>>>> As far as I can see, Spark is starting to be known as a "better Hadoop"
>>>> and the market seems to be going this way now. I can also see that the
>>>> Spark RDD concept gives me many more options for integrating Cassandra
>>>> than the ColumnFamilyInputFormat did.
>>>>
>>>> I have found this Java driver made by Datastax:
>>>> https://github.com/datastax/spark-cassandra-connector
>>>>
>>>> I have also found Python Cassandra support in Spark's repo, but it
>>>> still seems experimental:
>>>> https://github.com/apache/spark/tree/master/examples/src/main/python
>>>>
>>>> Finally, I have found Stratio Deep:
>>>> https://github.com/Stratio/deep-spark
>>>> It seems the Stratio guys have also forked Cassandra; I am still a little
>>>> confused about that.
>>>>
>>>> Question: which driver should I use if I want to use Java? And which
>>>> if I want to use Python?
>>>> Judging from my past experience, I think the way Spark integrates with
>>>> Cassandra makes all the difference in the world, so I would like to know
>>>> more about it, but I don't even know which source code I should start
>>>> looking at...
>>>> I would like to integrate using Python and/or C++, but I wonder whether
>>>> it would pay off better to use the Java driver instead.
>>>>
>>>> Thanks in advance
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>
>
> --
>
> Gaspar Muñoz
> @gmunozsoria
>
>
> <http://www.stratio.com/>
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 352 59 42 // *@stratiobd <https://twitter.com/StratioBD>*
>
>
>
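
For anyone following the data-locality point in the quoted messages, here is a rough, self-contained sketch of the general idea: each Spark partition covers a Cassandra token range, and the scheduler is told which nodes hold replicas of that range so the task can run next to the data. This is not code from the Datastax or Stratio connectors. The ring layout, the node addresses, the replication factor and the function names are all made up for illustration, and replica placement is simplified to "owner plus the next nodes clockwise".

```python
from bisect import bisect_left

# Hypothetical four-node ring: each node owns the token range that ends
# at its own token (previous_token, token]. Values are illustrative only.
RING = [
    (-6000, "10.0.0.1"),
    (-2000, "10.0.0.2"),
    (2000, "10.0.0.3"),
    (6000, "10.0.0.4"),
]


def replicas_for_token(token, ring=RING, replication_factor=2):
    """Return the nodes assumed to hold `token`.

    Simplified placement: the primary owner is the first node whose token
    is >= `token` (wrapping around the ring), and the remaining replicas
    are the next nodes clockwise.
    """
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(replication_factor)]


def plan_partitions(ring=RING, replication_factor=2):
    """Build one partition per token range with its preferred hosts.

    A connector would hand these hosts to the scheduler as the preferred
    locations of the partition, so the task runs where the data lives.
    """
    plan = []
    for i, (end, _) in enumerate(ring):
        start = ring[i - 1][0]  # index -1 wraps to the last node's token
        hosts = replicas_for_token(end, ring, replication_factor)
        plan.append(((start, end), hosts))
    return plan


if __name__ == "__main__":
    for token_range, hosts in plan_partitions():
        print(token_range, "->", hosts)
```

In Spark itself, the equivalent hook is the RDD's preferred-locations method, which is what the CassandraRDD.scala link quoted above points at.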


