cassandra-user mailing list archives

From Oleg Ruchovets <oruchov...@gmail.com>
Subject Re: cassandra + spark / pyspark
Date Thu, 11 Sep 2014 16:12:00 GMT
Thank you Rohit.
   I sent the email to you.

Thanks
Oleg.

On Thu, Sep 11, 2014 at 10:51 PM, Rohit Rai <rohit@tuplejump.com> wrote:

> Hi Oleg,
>
> I am the creator of Calliope. Calliope doesn't force any deployment
> model, which means you can run it with Mesos, Hadoop, or Standalone. To
> be fair, I don't think the other libs mentioned here force one either.
>
> The Spark cluster HA can be provided using ZooKeeper even in the
> standalone deployment mode.
>
>
> Can you explain what you mean by "in memory aggregations" not being
> possible? With Calliope able to use the secondary indexes and also our
> Stargate indexes (distributed Lucene indexing for C*), I am sure we can
> handle any scenario. Calliope is used in production at many large
> organizations over very big data.
>
> Feel free to mail me directly, and we can work with you to get you started.
>
> Regards,
> Rohit
>
>
> *Founder & CEO, **Tuplejump, Inc.*
> ____________________________
> www.tuplejump.com
> *The Data Engineering Platform*
>
> On Thu, Sep 11, 2014 at 8:09 PM, Oleg Ruchovets <oruchovets@gmail.com>
> wrote:
>
>> Ok.
>>    DataStax and Stratio require Mesos, Hadoop YARN, or another third
>> party to get Spark cluster HA.
>>
>> What about Calliope?
>> Is it sufficient to have Cassandra + Calliope + Spark to be able to
>> process aggregations?
>> In my case we have quite a lot of data, so doing aggregations only in
>> memory is impossible.
>>
>> Does Calliope support a non-in-memory mode for Spark?
>>
>> Thanks
>> Oleg.
>>
>> On Thu, Sep 11, 2014 at 9:23 PM, abhinav chowdary <
>> abhinav.chowdary@gmail.com> wrote:
>>
>>> Adding to the conversation...
>>>
>>> There are 3 great open source options available:
>>>
>>> 1. Calliope http://tuplejump.github.io/calliope/
>>>     This was the first library out, some time late last year (as far
>>> as I can recall), and I have been using it for a while. It is mostly very
>>> stable and uses the Hadoop I/O in Cassandra (note that it doesn't require
>>> Hadoop).
>>>
>>> 2. DataStax Spark Cassandra Connector
>>> https://github.com/datastax/spark-cassandra-connector: The main
>>> difference is that this uses CQL3. Again a great library, but it has a few
>>> issues. It is by far the most actively developed, and it still uses Thrift
>>> for minor stuff, but all the heavy lifting is done in CQL3.
>>>
>>> 3. Stratio Deep https://github.com/Stratio/stratio-deep: Has a lot more
>>> to offer if you use the whole Stratio stack. Deep is for Spark, Stratio
>>> Streaming is built on top of Spark Streaming, Stratio Meta is something
>>> similar to Shark or Spark SQL, and finally Stratio Cassandra is a fork of
>>> Cassandra with advanced Lucene-based indexing.
>>>
>>>
>>>
>>
>
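
Rohit's point that Spark cluster HA can be provided with ZooKeeper in standalone deployment mode can be sketched as the spark-env.sh fragment below, set on each standalone master. The three spark.deploy.* property names are the ones documented for Spark standalone mode. The ZooKeeper host names and the /spark directory are placeholders assumed for illustration, not values from this thread.

```shell
# conf/spark-env.sh on every standalone master (a sketch, not a tested setup).
# ZOOKEEPER recovery mode lets a standby master take over through leader election.
# zk1/zk2/zk3.example.com and /spark are hypothetical placeholders.
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
-Dspark.deploy.zookeeper.url=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181 \
-Dspark.deploy.zookeeper.dir=/spark"
```

With more than one master started this way, applications would normally be pointed at the full list of masters (for example spark://host1:7077,host2:7077, again with hypothetical host names) so they can find whichever one is currently the leader.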
