spark-user mailing list archives

From Narayanan K <knarayana...@gmail.com>
Subject Spark DAG Visualization for HiveQL
Date Mon, 21 Sep 2015 02:33:06 GMT
Hi,

While running a HiveQL query that joins two tables through the Spark SQL
interface, I see the following in the DAG visualization:

For a Hive table scan, it shows HadoopRDD, MapPartitionsRDD, and UnionRDD.
For the filter and project steps, it shows MapPartitionsRDD.
I also see TungstenAggregate and TungstenExchange steps, which also contain RDDs.

Can someone shed some light on the meaning of these steps?
How are RDDs created for a HiveQL query?
What do TungstenAggregate, TungstenProject, TungstenSort, and
TungstenExchange mean?

Thanks in advance
Narayanan
