impala-dev mailing list archives

From 黄权隆 <huangquanl...@gmail.com>
Subject Question about the multi-thread scan node model
Date Wed, 30 Aug 2017 22:50:44 GMT
Hi all,


I’m working on porting our ORC-support patch to the latest code base (
IMPALA-5717 <https://issues.apache.org/jira/browse/IMPALA-5717>). Since our
patch is based on cdh-5.7.3-release, which was released a year ago,
there is a lot of work to do to merge it.


One of the biggest changes since cdh-5.7.3-release that I noticed is the new
scan node & scanner model introduced in IMPALA-3902
<https://issues.apache.org/jira/browse/IMPALA-3902>. I think it was inspired
by the investigation task in IMPALA-2849
<https://issues.apache.org/jira/browse/IMPALA-2849>, but I cannot find any
performance report in that issue. Could you share any performance reports on
this multi-threading refactor?


I’m wondering how much this can improve performance, since the old model
(a single-threaded scan node with multi-threaded scanners) already provided
concurrent IO for reads, and most OLAP queries are IO-bound.


Thanks,

Quanlong
