I am Kai. GSoC announced the selected projects last week. During the community bonding period, I want to share some basics about this year's project with Apache Beam.
This project will be mentored by Kenneth Knowles. Many thanks to Kenn for his mentorship over the next three months. Ideas and comments from you are also welcome!
The project will mainly focus on implementing a TPC-DS benchmark on Beam SQL. Many such benchmarks have already been run on Spark, Hive, Pig, etc., so it will be interesting to see what happens when one is built on Beam SQL. The benchmark will presumably be run against different runners (such as Spark or Flink). Based on the benchmark results, a performance report will eventually be generated.
The proposal doc is here: (more details will be added)
Once the coding period starts on May 14, I will keep posting updates on the status and progress of this project.