spark-dev mailing list archives

From Bernard Jesop <bernard.je...@gmail.com>
Subject Dataset API Question
Date Wed, 25 Oct 2017 13:51:17 GMT
Hello everyone,

I have a question about checkpointing Datasets.

In 2.1.0 there is a Dataset.checkpoint(), but unlike RDD, Dataset has no
isCheckpointed() method.

I wonder whether Dataset.checkpoint is just syntactic sugar for
Dataset.rdd.checkpoint.
When I do:

Dataset.checkpoint; Dataset.count
Dataset.rdd.isCheckpointed // result: false

However, when I explicitly do:
Dataset.rdd.checkpoint; Dataset.rdd.count
Dataset.rdd.isCheckpointed // result: true

Could someone explain this behavior to me, or provide some references?

Best regards,
Bernard
