spark-dev mailing list archives

From Feng Tian <>
Subject A TPCH benchmark for Spark
Date Thu, 27 Aug 2015 04:40:50 GMT

We have released a package called LLQL, which is a serialization of the
operators of relational algebra.  The Spark SQL plan is the first one supported.

Probably more interesting to the Spark community is our test that
implements TPC-H.  We manually rewrote some of the SQL, mainly pulling
subqueries out and converting them into joins.  From the executor's point of
view, Spark seems to work quite well.  However, there are several expression
parsing or algebraization issues, notably in Q22, Q6, Q7, and Q9.
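
To illustrate the kind of rewrite mentioned above (pulling a subquery out and
turning it into a join), here is a minimal sketch.  It is not one of our actual
TPC-H rewrites: the query is a simplified example loosely in the style of Q2,
the table holds only a few made-up rows, and it runs on SQLite from the Python
standard library rather than on Spark, purely to show that the two forms
return the same rows.

```python
import sqlite3

# Toy stand-in for the TPC-H partsupp table (hypothetical rows, costs in cents).
ROWS = [
    (1, 10, 500),
    (1, 11, 300),
    (2, 10, 700),
    (2, 12, 700),
    (3, 11, 150),
]

# Original shape: a correlated scalar subquery in the WHERE clause.
CORRELATED = """
SELECT ps.ps_partkey, ps.ps_suppkey
FROM partsupp ps
WHERE ps.ps_supplycost = (
    SELECT MIN(i.ps_supplycost)
    FROM partsupp i
    WHERE i.ps_partkey = ps.ps_partkey
)
ORDER BY ps.ps_partkey, ps.ps_suppkey
"""

# Rewritten shape: the subquery becomes a grouped derived table
# that is joined back on the correlation key and the minimum cost.
JOINED = """
SELECT ps.ps_partkey, ps.ps_suppkey
FROM partsupp ps
JOIN (
    SELECT ps_partkey, MIN(ps_supplycost) AS min_cost
    FROM partsupp
    GROUP BY ps_partkey
) m
  ON ps.ps_partkey = m.ps_partkey
 AND ps.ps_supplycost = m.min_cost
ORDER BY ps.ps_partkey, ps.ps_suppkey
"""


def run(query):
    """Load the toy rows into a fresh in-memory database and run one query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE partsupp ("
            "ps_partkey INTEGER, ps_suppkey INTEGER, ps_supplycost INTEGER)"
        )
        conn.executemany("INSERT INTO partsupp VALUES (?, ?, ?)", ROWS)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Both forms should select the cheapest supplier(s) for each part.
    print(run(CORRELATED) == run(JOINED))
```

Ties are kept in both forms (part 2 has two suppliers at the same minimum
cost), which is worth checking whenever a scalar subquery is replaced by a
join on an aggregate.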

Q2 runs out of memory (OOM), and sometimes Q9 does as well.  We are excited
about the Tungsten project and are looking forward to the 1.5 release.

We are running on Spark 1.4.0, prebuilt with Hadoop 2.6.

Links to the github and the tests,

Have fun with the tests and timings :-)

