spark-user mailing list archives

From mj
Subject Re: Appending an incremental value to each RDD record
Date Tue, 16 Dec 2014 16:01:45 GMT
You could try using zipWithIndex (see the RDD API docs). For example, in Python:

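# sc is the SparkContext (predefined in the pyspark shell)
items = ['a', 'b', 'c']
items2 = sc.parallelize(items)
print(items2.first())

# Pair each item with a modified copy of itself
items3 = items2.map(lambda x: (x, x + "!"))

# zipWithIndex appends the index: (record, index)
items4 = items3.zipWithIndex()
print(items4.first())

# Swap each pair so the index comes first
items5 = items4.map(lambda x: (x[1], x[0]))
print(items5.first())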

The last print will give you an output of (0, ('a', 'a!')), where the 0 is the
index. You could also use a map to shift the indices by a fixed value (e.g. if
you wanted to count from 1).
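
A minimal sketch of that last step, assuming the input is a PySpark RDD. The helper names shift_index and number_records are made up for illustration and are not part of the Spark API; only zipWithIndex and map are.

```python
def shift_index(pair, start=1):
    """Turn a zero-based (index, value) pair into one counting from `start`."""
    index, value = pair
    return (index + start, value)


def number_records(rdd, start=1):
    """Return (index, value) pairs numbered from `start`.

    `rdd` is assumed to be a PySpark RDD (hypothetical helper, not a Spark API).
    """
    indexed = rdd.zipWithIndex()                      # (value, index)
    swapped = indexed.map(lambda vi: (vi[1], vi[0]))  # (index, value)
    return swapped.map(lambda p: shift_index(p, start))
```

In the pyspark shell, number_records(sc.parallelize(['a', 'b', 'c'])).collect() should return [(1, 'a'), (2, 'b'), (3, 'c')].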

