arrow-user mailing list archives

From Wes McKinney <wesmck...@gmail.com>
Subject Re: [C++] Apply Gandiva Filter to a RecordBatch
Date Sat, 04 Apr 2020 23:26:56 GMT
You can see an example of filtering via the Python bindings here:

https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_gandiva.py#L89

This creates a gandiva::Filter using gandiva::Filter::Make, which can
be used to filter a RecordBatch.

Is this what you need?

On Fri, Apr 3, 2020 at 7:12 PM Yue Ni <niyue.com@gmail.com> wrote:
>
> Hi there,
>
> I am using the Gandiva C++ library to process RecordBatches. I would like
> to know how I can apply gandiva::Filter to a RecordBatch so that I can do
> some filtering without using the projector.
>
> Since I couldn't find any documentation for it, I read some of the source
> code, and these are the test cases I found that show its usage:
> 1) https://github.com/apache/arrow/blob/967728fe4654e5d53bc0789e64e5a9ba7f27f263/cpp/src/gandiva/tests/filter_test.cc
> 2) https://github.com/apache/arrow/blob/967728fe4654e5d53bc0789e64e5a9ba7f27f263/cpp/src/gandiva/tests/filter_project_test.cc
>
> From my reading, it is possible to get a SelectionVector by using
> gandiva::Filter, and that SelectionVector can then be used with
> gandiva::Projector to filter a RecordBatch while doing projection.
> My questions are:
> 1) If I don't want to do any projection but simply filter, what is the
> recommended way to do it?
> 2) I am trying to handle a case like "SELECT * FROM table WHERE blah". Is it
> recommended to apply filtering without projection in this case, or is there
> an alternative approach?
>
> Thanks.
>
> Regards,
> Yue
>
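
The two-step pattern described in the quoted message (a filter produces a
selection vector of matching row indices, and those indices are then applied
to the data) can be sketched in plain C++ without any Arrow or Gandiva
dependency. This is only an illustration of the idea under simplifying
assumptions, not the Gandiva API: Batch, BuildSelection, and TakeRows are
hypothetical names, and every column is assumed to be a vector of int64_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a record batch: named, equally sized int64 columns.
struct Batch {
  std::vector<std::string> names;
  std::vector<std::vector<int64_t>> columns;
  std::size_t num_rows() const { return columns.empty() ? 0 : columns[0].size(); }
};

// Row predicate: receives the batch and a row index, returns true to keep the row.
using RowPredicate = std::function<bool(const Batch&, std::size_t)>;

// Step 1 (the role a filter plays): evaluate the predicate on each row and
// record the indices of the rows that match. The result plays the role of a
// selection vector.
std::vector<uint32_t> BuildSelection(const Batch& batch, const RowPredicate& pred) {
  std::vector<uint32_t> selection;
  for (std::size_t row = 0; row < batch.num_rows(); ++row) {
    if (pred(batch, row)) selection.push_back(static_cast<uint32_t>(row));
  }
  return selection;
}

// Step 2 ("SELECT *"): gather the selected rows from every column, leaving
// the set of columns unchanged, so no projection expressions are needed.
Batch TakeRows(const Batch& batch, const std::vector<uint32_t>& selection) {
  Batch out;
  out.names = batch.names;
  out.columns.reserve(batch.columns.size());
  for (const auto& column : batch.columns) {
    std::vector<int64_t> taken;
    taken.reserve(selection.size());
    for (uint32_t idx : selection) {
      assert(idx < column.size());  // indices must refer to rows of this batch
      taken.push_back(column[idx]);
    }
    out.columns.push_back(std::move(taken));
  }
  return out;
}
```

For a query such as "SELECT * FROM table WHERE a < 10", BuildSelection would be
called with a predicate that tests column a, and TakeRows would then copy the
matching rows of all columns. In this sketch the selection is computed once and
reused for every column, which is why filtering and projection can be treated
as separate steps.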
