spark-dev mailing list archives

From Prabhu Joseph <prabhujose.ga...@gmail.com>
Subject Spark Thrift Server Concurrency
Date Thu, 23 Jun 2016 12:21:52 GMT
Hi All,

   When I submit the same SQL query 20 times in parallel to Spark Thrift
Server, the execution time for some of the queries is less than a second,
while for others it is more than 2 seconds. The Spark Thrift Server logs
show that all 20 queries were submitted at the same time, 16/06/23 12:12:01,
but the result schemas are logged at different times.

16/06/23 12:12:01 INFO SparkExecuteStatementOperation: Running query
'select distinct val2 from philips1 where key>=1000 and key<=1500

16/06/23 12:12:*02* INFO SparkExecuteStatementOperation: Result Schema:
ArrayBuffer(val2#2110)
16/06/23 12:12:*03* INFO SparkExecuteStatementOperation: Result Schema:
ArrayBuffer(val2#2182)
16/06/23 12:12:*04* INFO SparkExecuteStatementOperation: Result Schema:
ArrayBuffer(val2#2344)
16/06/23 12:12:*05* INFO SparkExecuteStatementOperation: Result Schema:
ArrayBuffer(val2#2362)
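
To put numbers on the spread, here is a minimal Python sketch that reads log lines like the ones above and reports how many seconds after the first "Running query" line each "Result Schema" line was logged. The function name `schema_delays` and the regular expression are my own and not part of Spark; the sketch assumes the `yy/MM/dd HH:mm:ss` timestamp prefix seen in these logs and strips the asterisks that were added here only for emphasis.

```python
import re
from datetime import datetime

# Matches the timestamp prefix and the two message kinds seen in the excerpt.
LOG_LINE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) INFO "
    r"SparkExecuteStatementOperation: (Running query|Result Schema)"
)


def schema_delays(log_text):
    """Seconds from the first 'Running query' line to each 'Result Schema' line."""
    start = None
    delays = []
    for line in log_text.splitlines():
        # Drop the emphasis asterisks around the seconds field, if present.
        match = LOG_LINE.match(line.replace("*", ""))
        if match is None:
            continue  # continuation lines (query text, ArrayBuffer(...)) are skipped
        stamp = datetime.strptime(match.group(1), "%y/%m/%d %H:%M:%S")
        if match.group(2) == "Running query":
            if start is None:
                start = stamp
        elif start is not None:
            delays.append(int((stamp - start).total_seconds()))
    return delays


SAMPLE = """\
16/06/23 12:12:01 INFO SparkExecuteStatementOperation: Running query
'select distinct val2 from philips1 where key>=1000 and key<=1500
16/06/23 12:12:*02* INFO SparkExecuteStatementOperation: Result Schema:
ArrayBuffer(val2#2110)
16/06/23 12:12:*03* INFO SparkExecuteStatementOperation: Result Schema:
ArrayBuffer(val2#2182)
16/06/23 12:12:*04* INFO SparkExecuteStatementOperation: Result Schema:
ArrayBuffer(val2#2344)
16/06/23 12:12:*05* INFO SparkExecuteStatementOperation: Result Schema:
ArrayBuffer(val2#2362)
"""

if __name__ == "__main__":
    print(schema_delays(SAMPLE))
```

For the four result lines in this excerpt the delays are 1, 2, 3 and 4 seconds, which is the staggering described above. The log timestamps only have one-second resolution, so this is a coarse measure.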

There are sufficient executors running on YARN. The concurrency is affected
by the single driver. How can the concurrency be improved, and what are the
best practices?

Thanks,
Prabhu Joseph
