spark-user mailing list archives

From Apurva Nandan <>
Subject RDD generated from Dataframes
Date Thu, 21 Apr 2016 13:49:39 GMT
Hello everyone,

Generally speaking, I gather it is well known that DataFrames usually
perform much better than RDDs.
My question is how you handle transforming a DataFrame using map.
Calling map converts the DataFrame into an RDD, so do you then convert
that RDD back into a new DataFrame for better performance?
Further, if you have a process that involves a series of transformations,
i.e. from one RDD to another, do you convert each intermediate RDD to a
DataFrame first, every time?

I may also be missing something here, so please share your experiences.

Thanks and Regards,
