cassandra-user mailing list archives

From "Florian Hockmann" ...@florian-hockmann.de>
Subject Supporting CQL for Spark in JanusGraph
Date Wed, 16 Jan 2019 09:50:40 GMT
Hi,

JanusGraph currently supports Thrift and CQL to communicate with Cassandra,
but CQL is not yet supported for OLAP jobs executed on Spark [1].

If you're not familiar with JanusGraph: JanusGraph[2] is a scalable graph
database that uses different storage and index backends to store the data
and support advanced index queries. Cassandra is one of these storage
backends (and probably the most used one).

So, I wanted to add support for a CQL Hadoop input format for JanusGraph
which basically wraps the CQL input format included in cassandra-all.
Unfortunately, JanusGraph uses version 2.1.20 of that dependency which seems
to still rely on Thrift, even for the CQL input format. That of course kind
of defeats the purpose of implementing a CQL input format in JanusGraph that
doesn't rely on Thrift any more. There were also some problems with the
Hadoop input format in that version of cassandra-all and the version of the
DataStax Cassandra driver JanusGraph is using.

For these reasons, it seems necessary to update the cassandra-all dependency
to major version 3 in JanusGraph. This comes with two new problems, however:

1.	There is no Hadoop input format any more for Thrift. It was removed
in CASSANDRA-9353 [3]. So, JanusGraph probably has to copy the Thrift code
from a pre-3 version of cassandra-all to continue supporting Thrift for some
time until we can completely move to CQL. (There are other options, but this
seems to be the best one for JanusGraph [4].)
2.	The code changed a lot with major version 3, which is understandable
of course, considering that the storage engine was largely rewritten. This
however breaks a lot of code in the Thrift adapter of JanusGraph which needs
to be updated accordingly.

Now, why am I writing all this? I hope that you can confirm whether my
assertions / assumptions here are correct and whether this approach actually
makes sense or whether you would suggest another way forward.

Thanks,

Florian

[1]: https://github.com/JanusGraph/janusgraph/issues/985

[2]: http://janusgraph.org/

[3]: https://issues.apache.org/jira/browse/CASSANDRA-9353

[4]: https://groups.google.com/forum/#!topic/janusgraph-dev/7IU77lHwptw

 

