impala-dev mailing list archives

From Tim Armstrong
Subject Re: Query compilation approach
Date Tue, 30 Jan 2018 18:47:56 GMT
Thanks for sharing.

I think we have some of the infrastructure required to do something similar
- we have an interpreted path already and we could swap in compiled
versions of functions by updating function pointers.
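To make the function-pointer idea concrete, here is a minimal sketch. It is not Impala's actual code: `EvalFn`, `ExprSlot`, `InterpretedEval` and `CompiledEval` are hypothetical names invented for illustration, and the "compiled" function is an ordinary C++ function standing in for JIT output. The only point is that callers go through an atomic pointer, so a faster version can be installed while execution continues.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical signature for evaluating an expression on one value.
using EvalFn = int64_t (*)(int64_t);

// Counters so the example can show which path actually ran.
std::atomic<int> interpreted_calls{0};
std::atomic<int> compiled_calls{0};

// Stand-in for the slow interpreted path.
int64_t InterpretedEval(int64_t x) {
  ++interpreted_calls;
  return x * 2 + 1;
}

// Stand-in for a codegen'd function with the same semantics.
int64_t CompiledEval(int64_t x) {
  ++compiled_calls;
  return x * 2 + 1;
}

// Holds the current implementation; starts out interpreted.
class ExprSlot {
 public:
  // Callers always load the pointer, so they pick up a swap on the next call.
  int64_t Eval(int64_t x) const {
    return fn_.load(std::memory_order_acquire)(x);
  }

  // Called once compilation finishes (possibly from another thread).
  void InstallCompiled(EvalFn compiled) {
    fn_.store(compiled, std::memory_order_release);
  }

 private:
  std::atomic<EvalFn> fn_{&InterpretedEval};
};
```

A caller would keep invoking `slot.Eval(v)` per value; after `slot.InstallCompiled(&CompiledEval)` later calls take the compiled path without the caller changing.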

Michael Ho and I have talked about doing async codegen before or even
having a codegen service that could do codegen for all instances of the
fragment (LLVM's MCJit has the infrastructure to compile the bytecode to an
unlinked binary blob, then ship it somewhere else and do the final linking).

There are definitely logistical and resource management issues with a naive
approach - if you execute the interpreted path for a bit and then do
codegen synchronously or asynchronously, then for some short-running
queries you end up with the worst of both worlds as far as CPU consumption
goes - you waste CPU cycles on the slow interpreted path and on doing the
codegen. LLVM optimisation isn't generally cancellable so if you kick that
off for a large module, it could be stuck running long past the end of
the query.
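A rough back-of-the-envelope model of that trade-off, as a sketch only: the struct, the function names and every number in the usage note are made up for illustration, not measurements. It assumes codegen starts at query start on another thread, cannot be cancelled, and that rows are interpreted until the compiled code is ready.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical per-query cost parameters, all in microseconds of CPU time.
struct CostModel {
  double interp_us_per_row;    // interpreted path cost per row
  double compiled_us_per_row;  // compiled path cost per row
  double codegen_us;           // one-off cost of codegen + optimisation
};

// CPU used if the query never compiles anything.
double InterpretedOnlyCpuUs(const CostModel& m, int64_t rows) {
  return rows * m.interp_us_per_row;
}

// CPU used with async, non-cancellable codegen: the codegen cost is always
// paid in full, and rows are interpreted until the compiled code is ready.
double AsyncCodegenCpuUs(const CostModel& m, int64_t rows) {
  int64_t rows_during_codegen =
      static_cast<int64_t>(m.codegen_us / m.interp_us_per_row);
  int64_t interp_rows = std::min(rows, rows_during_codegen);
  int64_t compiled_rows = rows - interp_rows;
  return m.codegen_us + interp_rows * m.interp_us_per_row +
         compiled_rows * m.compiled_us_per_row;
}
```

With invented numbers `CostModel m{1.0, 0.25, 100000.0}`, a 100-row query costs 100 us interpreted but 100100 us with async codegen, since the whole codegen cost is wasted. A 10,000,000-row query costs 10,000,000 us interpreted versus 2,675,000 us with async codegen.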

I think the paper deals with some of those issues - estimating time taken
and doing compilation in units of small granularity, which seem like good

On Tue, Jan 30, 2018 at 5:51 AM, Jim Apple wrote:

> This paper will apparently be in ICDE 2018. It demonstrates an
> approach to compiling queries more quickly, which it shows can be
> useful for queries that run very quickly or are very complex:
