cassandra-user mailing list archives

From Rajesh Radhakrishnan <Rajesh.Radhakrish...@phe.gov.uk>
Subject UUID coming as int while using SPARK SQL
Date Tue, 24 May 2016 10:23:13 GMT
Hi,


I have a Cassandra keyspace, but reading the data (especially UUID columns) via Spark SQL
from Python does not return the correct values.

Cassandra:
--------------
My table 'sam' is described below:

CREATE TABLE ks.sam (id uuid, dept text, workflow text, type double, PRIMARY KEY (id, dept));

SELECT id, workflow FROM sam WHERE dept='blah';

The example CQL above gives me the following:

 id                                   | workflow
--------------------------------------+----------
 9547v26c-f528-12e5-da8b-001a4q3dac10 |   testWK


Spark/Python:
------------------
from pyspark import SparkConf
from pyspark.sql import SQLContext
import pyspark_cassandra
from pyspark_cassandra import CassandraSparkContext

....
conf = (SparkConf()
        .set("spark.cassandra.connection.host", IP_ADDRESS)
        .set("spark.cassandra.connection.native.port", PORT_NUMBER))
sparkContext = CassandraSparkContext(conf=conf)
sqlContext = SQLContext(sparkContext)

samTable = sparkContext.cassandraTable("ks", "sam").select('id', 'dept', 'workflow')
samTable.cache()

# samdf is the DataFrame built from samTable (that step is elided here)
samdf.registerTempTable("samd")

sparkSQLl = "SELECT DISTINCT id, dept, workflow FROM samd WHERE workflow='testWK'"
new_df = sqlContext.sql(sparkSQLl)
results = new_df.collect()
for row in results:
    print("dept=%s" % row.dept)
    print("wk=%s" % row.workflow)
    print("id=%s" % row.id)
...
The Python code above prints the following:
dept=Biology
wk=testWK
id=293946894141093607334963674332192894528


You can see that the id (uuid), whose correct value in Cassandra is '9547v26c-f528-12e5-da8b-001a4q3dac10',
comes back via Spark as the int '293946894141093607334963674332192894528'.
What am I doing wrong here? How can I get the correct UUID value from Cassandra via Spark/Python?
Please help me.
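
For reference, the integer printed above looks like the 128-bit integer form of a UUID. A minimal sketch of converting between the two forms, assuming only Python's standard uuid module (the helper names int_to_uuid and uuid_to_int and the sample UUID are hypothetical, chosen for illustration, and this is not necessarily how the connector is meant to be used):

```python
import uuid


def int_to_uuid(value):
    """Interpret a 128-bit integer as a UUID object."""
    return uuid.UUID(int=value)


def uuid_to_int(text):
    """Return the 128-bit integer form of a canonical UUID string."""
    return uuid.UUID(text).int


# Illustrative UUID, not taken from the table above.
sample_text = "12345678-1234-5678-1234-567812345678"
sample_int = uuid_to_int(sample_text)

# Converting the integer back yields the original canonical string.
round_trip = str(int_to_uuid(sample_int))
print(round_trip)

# The integer printed by the Spark job fits in 128 bits, so it converts as well.
from_spark = int_to_uuid(293946894141093607334963674332192894528)
print(from_spark)
```

If the rows really do carry the UUID as a plain integer, applying something like int_to_uuid to row.id before printing might recover the familiar hyphenated form.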

Thank you
Rajesh R
