spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sung Hwan Chung <coded...@cs.stanford.edu>
Subject Is there a way to look at RDD's lineage? Or debug a fault-tolerance error?
Date Wed, 08 Oct 2014 19:01:57 GMT
My job is not being fault-tolerant (e.g., when there's a fetch failure or
something).

The lineage of RDDs are constantly updated every iteration. However, I
think that when there's a failure, the lineage information is not being
correctly reapplied.

It goes something like this:

val rawRDD = read(...)
val repartRDD = rawRDD.repartition(X)

val tx1 = repartRDD.map(...)
var tx2 = tx1.map(...)

while (...) {
  tx2 = tx1.zip(tx2).map(...)
}


Is there any way to monitor RDD's lineage, maybe even including? I want to
make sure that there's no unexpected things happening.

Mime
View raw message