spark-dev mailing list archives

From Chang Chen <baibaic...@gmail.com>
Subject Lineage between Datasets
Date Wed, 12 Apr 2017 10:03:41 GMT
Hi All

I believe that there is no lineage between datasets. Consider this case:

val people = spark.read.parquet("...").as[Person]

val ageGreatThan30 = people.filter("age > 30")

Since the filter condition on the second Dataset can be pushed down to
the Parquet scan, the two Datasets obviously have different logical
plans and hence different physical plans.
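
A minimal sketch of one way to compare the two plans, under stated assumptions: the fields of Person are not shown in this message, so the case class below is hypothetical, and a throwaway Parquet file written to a temporary directory stands in for the elided path. The object and method names (PlanComparison, sameOptimizedPlan) are illustrative only.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Hypothetical schema: the original message does not show the fields of Person.
case class Person(name: String, age: Int)

object PlanComparison {
  // Returns true if the optimized logical plans of the two Datasets are equivalent.
  def sameOptimizedPlan(spark: SparkSession): Boolean = {
    import spark.implicits._

    // Assumption: a throwaway Parquet file stands in for the elided path.
    val path = Files.createTempDirectory("people").resolve("people.parquet").toString
    Seq(Person("a", 25), Person("b", 35)).toDS().write.parquet(path)

    val people = spark.read.parquet(path).as[Person]
    val ageGreatThan30 = people.filter("age > 30")

    // Print the parsed, analyzed, optimized and physical plans of each Dataset.
    people.explain(true)
    ageGreatThan30.explain(true)

    people.queryExecution.optimizedPlan
      .sameResult(ageGreatThan30.queryExecution.optimizedPlan)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, used only for illustration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("plan-comparison")
      .getOrCreate()
    try println(s"same optimized plan: ${sameOptimizedPlan(spark)}")
    finally spark.stop()
  }
}
```

Reading the two explain outputs side by side shows whether the condition appears as a Filter node in the logical plan and as a pushed filter on the Parquet scan in the physical plan.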

Is my understanding right?

Thanks
Chang
