I would not rely on the operations preserving order, because that is not guaranteed. Instead, I would attach an explicit index to each sentence and sort by that index at the end, before writing the result back.

Some operations, such as map, filter, flatMap and coalesce (with shuffle=false), usually preserve the order. However, operations such as sortBy, reduceByKey, partitionBy and join do not.

I read a text file using sparkContext.textFile(filename), assign it to an RDD, process the RDD (replacing some words), and finally write it to a text file using rdd.saveAsTextFile(output).
Is there any way to be sure the order of the sentences will not be changed? I need to have the same text with some corrected words.



