spark-user mailing list archives

From "jon.g.massey" <jon.g.mas...@gmail.com>
Subject distributing Scala Map datatypes to RDD
Date Mon, 13 Oct 2014 21:02:07 GMT
Hi guys,
Just starting out with Spark and working through a few tutorials, it seems
the easiest way to get one's source data into an RDD is the
sc.parallelize function. Unfortunately, my local data lives in multiple
instances of Map[K, V], and parallelize only accepts types with the
Seq trait; the RDD it produces seemingly has no notion of keys and
values, which I need for joins (amongst other functions).

Is there a way of using a SparkContext to create a distributed RDD from a
local Map, rather than from a Hadoop or text file source?
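[For context, the conversion in question can be sketched in plain Scala, with no Spark needed: a Map's toSeq method yields a Seq[(K, V)], which is the shape sc.parallelize accepts, and in Spark an RDD of pairs picks up key/value operations through the implicit PairRDDFunctions. The sample data below is hypothetical.]

```scala
// A local Map of the kind described above (hypothetical sample data).
val ages: Map[String, Int] = Map("alice" -> 30, "bob" -> 25)

// Map.toSeq produces a Seq[(K, V)] -- the shape sc.parallelize accepts.
val pairs: Seq[(String, Int)] = ages.toSeq

// In Spark this would read: val rdd = sc.parallelize(ages.toSeq)
// The result is an RDD[(String, Int)], which gains key/value operations
// (join, reduceByKey, ...) via the implicit PairRDDFunctions.
println(pairs.sortBy(_._1))
```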

Thanks,
Jon



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/distributing-Scala-Map-datatypes-to-RDD-tp16320.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

