spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: FlatMapValues
Date Wed, 31 Dec 2014 15:46:07 GMT
From the clarification below, the problem is that you are calling
flatMapValues, which is only available on an RDD of key-value tuples.
Your map function returns a tuple in one case but a String in the
other, so your RDD is a bunch of Any, which is not at all what you
want. You need to return a tuple in both cases, which is what Kapil
pointed out.

However it's still not quite what you want. Your input is basically
[key value1 value2 value3] so you want to flatMap that to (key,value1)
(key,value2) (key,value3). flatMapValues does not come into play.
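
A minimal sketch of that flatMap, assuming each line is comma-separated
with the key in the first field. The helper name toPairs is hypothetical,
and plain Scala collections stand in for the RDD so the snippet runs in a
REPL or as a script without a SparkContext:

```scala
// Hypothetical helper: turn "key,v1,v2,v3" into (key,v1), (key,v2), (key,v3).
// Lines with no values after the key yield nothing instead of a
// placeholder "", so every element produced is a tuple.
def toPairs(line: String): Seq[(String, String)] = {
  val fields = line.split(',')
  if (fields.length >= 2) fields.tail.map(v => (fields.head, v)).toSeq
  else Seq.empty
}

// Sample input taken from the SUMMARY in the quoted message.
val lines = Seq("1,red,blue,green", "2,yellow,violet,pink")

// flatMap flattens the per-line sequences into one sequence of pairs.
val pairs = lines.flatMap(toPairs)

// Format each pair as "key,value" for output.
val formatted = pairs.map { case (k, v) => s"$k,$v" }
formatted.foreach(println)

// On the RDD itself the same function should apply unchanged, e.g.
// (untested against your data, and the header/length checks from your
// code would still need to be added to toPairs):
//   reacRdd.flatMap(toPairs).map { case (k, v) => s"$k,$v" }
//     .saveAsTextFile("/data/vaers/msfx/reac/" + outFile)
```

Because bad lines map to an empty sequence, flatMap drops them and no
separate filter step is needed.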

On Wed, Dec 31, 2014 at 3:25 PM, Sanjay Subramanian
<sanjaysubramanian@yahoo.com> wrote:
> My understanding is as follows
>
> STEP 1 (This would create a pair RDD)
> =======
>
> reacRdd.map(line => line.split(',')).map(fields => {
>   if (fields.length >= 11 && !fields(0).contains("VAERS_ID")) {
>     (fields(0),(fields(1)+"\t"+fields(3)+"\t"+fields(5)+"\t"+fields(7)+"\t"+fields(9)))
>   }
>   else {
>     ""
>   }
>   })
>
> STEP 2
> =======
> Since previous step created a pair RDD, I thought flatMapValues method will
> be applicable.
> But the code does not even compile saying that flatMapValues is not
> applicable to RDD :-(
>
>
> reacRdd.map(line => line.split(',')).map(fields => {
>   if (fields.length >= 11 && !fields(0).contains("VAERS_ID")) {
>     (fields(0),(fields(1)+"\t"+fields(3)+"\t"+fields(5)+"\t"+fields(7)+"\t"+fields(9)))
>   }
>   else {
>     ""
>   }
>   }).flatMapValues(skus =>
> skus.split('\t')).saveAsTextFile("/data/vaers/msfx/reac/" + outFile)
>
>
> SUMMARY
> =======
> when a dataset looks like the following
>
> 1,red,blue,green
> 2,yellow,violet,pink
>
> I want to output the following and I am asking how do I do that ? Perhaps my
> code is 100% wrong. Please correct me and educate me :-)
>
> 1,red
> 1,blue
> 1,green
> 2,yellow
> 2,violet
> 2,pink
