From: Wes McKinney
Date: Sat, 4 Apr 2020 18:26:56 -0500
Subject: Re: [C++] Apply Gandiva Filter to a RecordBatch
To: user@arrow.apache.org

You
can see an example of filtering via the Python bindings:

https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_gandiva.py#L89

This creates a gandiva::Filter using gandiva::Filter::Make, which can be used to filter a RecordBatch.

Is this what you need?

On Fri, Apr 3, 2020 at 7:12 PM Yue Ni wrote:
>
> Hi there,
>
> I am using the gandiva C++ library for processing RecordBatch. I would like to know how I can apply gandiva::Filter to a RecordBatch so that I can do some filtering without using the projector.
>
> Since I couldn't find any documentation for it, I read some of the source code, and here are the test cases I found that show its usage:
> 1) https://github.com/apache/arrow/blob/967728fe4654e5d53bc0789e64e5a9ba7f27f263/cpp/src/gandiva/tests/filter_test.cc
> 2) https://github.com/apache/arrow/blob/967728fe4654e5d53bc0789e64e5a9ba7f27f263/cpp/src/gandiva/tests/filter_project_test.cc
>
> From my reading, it is possible to get a SelectionVector by using gandiva::Filter, and you can then pass that SelectionVector to gandiva::Projector to filter a RecordBatch while doing projection. My questions are:
> 1) If I don't want to do any projection but simply filtering, what is the recommended way to do it?
> 2) I am trying to handle a case like "SELECT * FROM table WHERE blah". Is it recommended to apply filtering without projection in this case, or is there an alternative approach?
>
> Thanks.
>
> Regards,
> Yue
>
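
For reference, the two-step pattern discussed in this thread (a filter that yields a selection vector of row indices, followed by gathering those rows from every column without any projection) can be sketched in plain C++. This is only a library-free illustration of the idea and does not use the Arrow or Gandiva API: Batch, MakeSelection and TakeRows are hypothetical names invented for this sketch. In Arrow C++ the gather step would presumably be done by a take kernel such as arrow::compute::Take, whose exact signature depends on the Arrow version.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a record batch with two columns.
// This is NOT an Arrow or Gandiva type; it exists only for illustration.
struct Batch {
  std::vector<int32_t> id;        // column "id"
  std::vector<std::string> name;  // column "name"
};

// Step 1 (the filter's job): collect the indices of the rows whose "id"
// value satisfies the predicate. The result plays the role of a
// selection vector.
std::vector<uint32_t> MakeSelection(const Batch& batch,
                                    const std::function<bool(int32_t)>& pred) {
  std::vector<uint32_t> selection;
  for (uint32_t row = 0; row < batch.id.size(); ++row) {
    if (pred(batch.id[row])) selection.push_back(row);
  }
  return selection;
}

// Step 2 (no projection): gather the selected rows from every column,
// leaving the column values themselves unchanged.
Batch TakeRows(const Batch& batch, const std::vector<uint32_t>& selection) {
  Batch out;
  out.id.reserve(selection.size());
  out.name.reserve(selection.size());
  for (uint32_t row : selection) {
    assert(row < batch.id.size() && row < batch.name.size());
    out.id.push_back(batch.id[row]);
    out.name.push_back(batch.name[row]);
  }
  return out;
}
```

With id = {5, 12, 7, 30} and the predicate id < 10, MakeSelection yields the indices 0 and 2, and TakeRows returns those two rows with both columns intact, which is the "SELECT * FROM table WHERE blah" shape from the question.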