cassandra-user mailing list archives

From Matthew Johnson
Subject Code review - Spark SQL command-line client for Cassandra
Date Fri, 19 Jun 2015 09:21:17 GMT
Hi all,

I have been struggling with Cassandra’s lack of ad hoc query support. I know
ad hoc querying is an anti-pattern in Cassandra, but sometimes management
come over and ask me to run something, and it’s hard to explain why it will
take me a while when it would take about 10 seconds in MySQL. So I have put
together the following code snippet, which bundles DataStax’s Spark Cassandra
Connector and lets you submit Spark SQL to it, writing the results to a text
file.

Does anyone spot any obvious flaws in this plan? (I have a lot more error
handling etc. in my code, but removed it here for brevity.)

    private void run(String sqlQuery) {
        SparkContext scc = new SparkContext(conf);
        CassandraSQLContext csql = new CassandraSQLContext(scc);
        DataFrame sql = csql.sql(sqlQuery);
        String folderName = "/tmp/output_" + System.currentTimeMillis();
        LOG.info("Attempting to save SQL results in folder: " +

        LOG.info("SQL results saved");


    public static void main(String[] args) {
        String sparkMasterUrl = args[0];
        String sparkHost = args[1];
        String sqlQuery = args[2];

        SparkConf conf = new SparkConf();
        conf.setAppName("Java Spark SQL");

        conf.set("", sparkHost);

        JavaSparkSQL app = new JavaSparkSQL(conf);

, printToConsole);
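
As an illustration of the kind of error handling that was trimmed from main above, here is a minimal sketch that validates the three positional arguments before any Spark objects are created. The class name ClientArgs, its fields, and the usage message are hypothetical and not part of the original code; it uses only the Java standard library.

```java
/**
 * Hypothetical holder for the three positional arguments read in main:
 * Spark master URL, host, and SQL query. Not part of the original code.
 */
final class ClientArgs {
    final String sparkMasterUrl;
    final String sparkHost;
    final String sqlQuery;

    private ClientArgs(String sparkMasterUrl, String sparkHost, String sqlQuery) {
        this.sparkMasterUrl = sparkMasterUrl;
        this.sparkHost = sparkHost;
        this.sqlQuery = sqlQuery;
    }

    /** Fails fast with a usage message instead of an ArrayIndexOutOfBoundsException. */
    static ClientArgs parse(String[] args) {
        if (args == null || args.length < 3) {
            throw new IllegalArgumentException(
                "Usage: <sparkMasterUrl> <sparkHost> <sqlQuery>");
        }
        String master = args[0].trim();
        String host = args[1].trim();
        String query = args[2].trim();
        if (master.isEmpty() || host.isEmpty() || query.isEmpty()) {
            throw new IllegalArgumentException("Arguments must not be blank");
        }
        return new ClientArgs(master, host, query);
    }

    public static void main(String[] args) {
        ClientArgs parsed = parse(args);
        System.out.println("master=" + parsed.sparkMasterUrl
            + " host=" + parsed.sparkHost
            + " query=" + parsed.sqlQuery);
    }
}
```

Quoting the query on the command line matters here, since an unquoted query would be split into several arguments and only its first word would end up in args[2].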


I can then submit this to Spark with ‘spark-submit’:

    ./spark-submit --class com.algomi.spark.JavaSparkSQL --master spark://sales3:7077 sales3 "select * from mykeyspace.operationlog"
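
For reading the results back once the job has finished, here is a minimal sketch in plain Java. It assumes the results were written with something like Spark's saveAsTextFile, which produces a directory of part-* files rather than a single file, and that the output folder sits on a filesystem readable from wherever this runs (which a /tmp path may not be on a multi-node cluster). The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: gathers the lines of every part-* file in an output folder. */
final class PartFileReader {

    static List<String> readAllLines(Path outputFolder) throws IOException {
        // Collect only the data files; markers such as _SUCCESS do not match the glob.
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(outputFolder, "part-*")) {
            for (Path part : stream) {
                parts.add(part);
            }
        }
        // Directory listing order is unspecified, so sort to get part-00000, part-00001, ...
        Collections.sort(parts);

        List<String> lines = new ArrayList<>();
        for (Path part : parts) {
            lines.addAll(Files.readAllLines(part, StandardCharsets.UTF_8));
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the output folder, e.g. the folder name logged by the job.
        for (String line : readAllLines(Paths.get(args[0]))) {
            System.out.println(line);
        }
    }
}
```

This loads everything into memory, which is fine for a quick look at a small result set but not for the output of a large table scan.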

It seems to work pretty well, so I’m pretty happy, but I’m wondering why this
isn’t common practice (at least I haven’t been able to find much about it
on Google). Is there something terrible that I’m missing?


