cassandra-user mailing list archives

From Tomas Bartalos <tomas.barta...@gmail.com>
Subject Re: Cassandra and Apache Arrow
Date Wed, 09 Jan 2019 23:47:50 GMT
There is a diagram on the Arrow homepage showing Cassandra (along with
other storage systems) as a source of data.
https://arrow.apache.org/img/shared.png

Which made me think there should be some integration...

On Thu, 10 Jan 2019, 12:38 am Jonathan Haddad <jon@jonhaddad.com> wrote:

> Where are you seeing that it works with Cassandra?  There's no mention of
> it under https://arrow.apache.org/powered_by/, and on the homepage it
> only says that a Cassandra developer worked on it.
>
> We (unfortunately) don't do anything with it at the moment.
>
> On Wed, Jan 9, 2019 at 3:24 PM Tomas Bartalos <tomas.bartalos@gmail.com>
> wrote:
>
>> I’ve read a lot of nice things about the Apache Arrow in-memory columnar
>> format. On their homepage they mention Cassandra as a possible storage
>> which could interoperate with Arrow. Unfortunately I was not able to find
>> any working example which would demonstrate their cooperation.
>>
>> *My use case:* I’m doing OLAP processing of data stored in Cassandra
>> with Spark. I need to deduplicate data using Cassandra’s upserts, so other
>> (more suitable) storage options like HDFS with Parquet or ORC didn’t seem
>> viable.
>> *What I’d like to achieve:* speed up Spark’s data ingestion from
>> Cassandra.
>>
>> Is it possible to query data from Cassandra in Arrow format?
>>
>
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>
