cassandra-user mailing list archives

From Tomas Bartalos
Subject Cassandra and Apache Arrow
Date Wed, 09 Jan 2019 23:23:51 GMT
I’ve read a lot of nice things about the Apache Arrow in-memory columnar format. The Arrow
homepage mentions Cassandra as a storage system that could interoperate with Arrow. Unfortunately,
I was not able to find any working example that demonstrates the two working together.

My use case: I’m doing OLAP processing with Spark on data stored in Cassandra. I rely on
Cassandra’s upserts to deduplicate the data, so other (more suitable) storage options such as
HDFS with Parquet or ORC didn’t seem like an option.
What I’d like to achieve: speed up Spark’s data ingestion from Cassandra.

Is it possible to query data from Cassandra in Arrow format?