arrow-dev mailing list archives

From Ravindra Pindikura <>
Subject Re: questions about Gandiva
Date Fri, 01 Nov 2019 13:32:56 GMT
On Thu, Oct 31, 2019 at 10:56 PM Wes McKinney <> wrote:

> hi
> On Thu, Oct 31, 2019 at 12:11 AM Yibo Cai <> wrote:
> >
> > Hi,
> >
> > Arrow cpp integrates Gandiva to provide low level operations on arrow
> buffers. [1][2]
> > I have some questions, any help is appreciated:
> > - Arrow cpp already has a compute kernel[3], does it duplicate what
> Gandiva provides? I see a Jira talk about it.[4]
> No. There are some cases of functional overlap but we are servicing a
> spectrum of use cases beyond the scope of Gandiva. Additionally, it is
> unclear to me that an LLVM JIT compilation step should be required to
> evaluate simple expressions such as "a > 5" -- in addition to
> introducing latency (due to the compilation step) it is also a heavy
> dependency to require the LLVM runtime in all applications.

Like other JIT-based systems, Gandiva takes a hit at "build" time with the
hope that it can be amortized by faster "expression evaluate" times. This
works really well when we build an expression once and evaluate thousands
or millions of record batches against the same built expression.
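The amortization argument can be made concrete with a back-of-the-envelope calculation. This is a hypothetical sketch — the function name and the figures are illustrative, not measurements from Gandiva:

```python
def batches_to_amortize(build_us, per_batch_speedup_us):
    """Number of record batches needed for the one-time JIT build cost
    to pay for itself, given the per-batch time saved over an
    interpreted evaluation path. All figures are hypothetical,
    in microseconds."""
    return build_us / per_batch_speedup_us

# e.g. a 2 ms (2000 us) build that saves 10 us per batch
# breaks even after 200 batches
print(batches_to_amortize(2000, 10))  # 200.0
```

Past the break-even point, every additional batch evaluated against the same built expression is pure gain, which is why the approach shines for long-running queries over many batches.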

The build time is negligible (~1 or 2 ms) for simple expressions like the
one Wes gave here, but it can be much higher for complex expressions
involving lots of if/else/case/in statements. We have seen very large
expressions (including thousands of case statements) for which the build
time starts to show up as a significant factor, especially if the
expression is used to evaluate only a few hundred batches or so.

Our approach to this is twofold:

1. Cache built expressions: this is already done
- build the expression once, cache it, and reuse it
- helps a lot for query reattempts
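The caching idea above can be sketched as follows. This is an illustrative model only — the names (`ExpressionCache`, `build_fn`) are hypothetical, not Gandiva's actual API; the point is that the expensive JIT build runs once per distinct (schema, expression) pair and is reused across record batches and query reattempts:

```python
class ExpressionCache:
    """Caches compiled modules so the JIT build cost is paid once
    per distinct (schema, expression) pair. Hypothetical sketch."""

    def __init__(self, build_fn):
        self._build = build_fn   # the expensive JIT build step
        self._cache = {}         # (schema, expr) -> compiled module

    def get(self, schema, expr):
        key = (schema, expr)
        if key not in self._cache:
            # Cache miss: pay the build cost once and remember the result.
            self._cache[key] = self._build(schema, expr)
        return self._cache[key]

builds = []  # record every build so we can see how often it happens
cache = ExpressionCache(lambda s, e: builds.append((s, e)) or f"module<{e}>")
cache.get("schema_v1", "a > 5")   # first use: triggers a build
cache.get("schema_v1", "a > 5")   # reattempt: cache hit, no rebuild
print(len(builds))  # 1
```

A real cache would also need an eviction policy and a key that captures everything affecting codegen (schema, expression tree, configuration), but the amortization benefit comes from exactly this reuse.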

2. Tiered compilation: not done yet
- the compilation time increases as we do more optimisation passes or try
to inline functions more aggressively
- we could do this in a tiered fashion, e.g.:

   - tier 1: for the first M batches, use LLVM's interpreter evaluation
   (build the tier 2 module in parallel)
   - tier 2: for the next N batches, use the Gandiva-compiled module with
   minimal optimisation passes (build tier 3 in parallel)
   - tier 3: for the rest, use the Gandiva-compiled, fully optimised module

That way, if a complex expression is used to evaluate just a few batches,
it doesn't pay the cost of the fully optimised build.
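The tier-selection logic above can be sketched as a simple counter over batches. This is a hypothetical model of the proposal (the class and tier names are illustrative; real tiers would be compiled modules built in parallel, not strings):

```python
class TieredEvaluator:
    """Picks an evaluation tier per batch: the first m batches run on the
    interpreter, the next n on a minimally-optimised module, and the rest
    on the fully optimised one. Hypothetical sketch of the proposal."""

    def __init__(self, m=100, n=1000):
        self.count = 0
        self.m, self.n = m, n

    def tier_for_next_batch(self):
        self.count += 1
        if self.count <= self.m:
            return "tier1-interpreter"   # no build latency up front
        if self.count <= self.m + self.n:
            return "tier2-minimal-opt"   # cheap compile, moderate speed
        return "tier3-full-opt"          # full optimisation passes

ev = TieredEvaluator(m=2, n=3)
tiers = [ev.tier_for_next_batch() for _ in range(7)]
print(tiers)
# ['tier1-interpreter', 'tier1-interpreter',
#  'tier2-minimal-opt', 'tier2-minimal-opt', 'tier2-minimal-opt',
#  'tier3-full-opt', 'tier3-full-opt']
```

Short-lived queries never leave tier 1 or 2, so they never pay for the expensive build; long-running queries graduate to the fully optimised module and amortize its cost.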

> Personally I'm interested in supporting a wide gamut of analytics
> workloads, from data frame / data science type libraries to SQL-like
> systems. Gandiva is designed for the needs of a SQL-based execution
> engine where chunks of data are fed into Projection or Filter nodes in
> a computation graph -- Gandiva generates a specialized kernel to
> perform a unit of work inside those nodes. Realistically, I expect
> many real world applications will contain a mixture of pre-compiled
> analytic kernels and JIT-compiled kernels.
> Rome wasn't built in a day, so I'm expecting several years of work
> ahead of us at the present rate. We need more help in this domain.
> > - Is Gandiva only for arrow cpp? What about other languages(go, rust,
> ...)?
> It's being used in Java via JNI. The same approach could be applied
> for the other languages as they have their own C FFI mechanisms.
> > - Gandiva leverages SIMD for vectorized operations[1], but I didn't see
> any related code. Am I missing something?
> My understanding is that LLVM inserts many SIMD instructions
> automatically based on the host CPU architecture version. Gandiva
> developers may have some comments / pointers about this

Wes is correct - we depend on the LLVM optimisation passes to do this.

> >
> > [1]
> > [2]
> > [3]
> > [4]
> >
> > Thanks,
> > Yibo

Thanks and regards,
