drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5697) Improve performance of filter operator for pattern matching
Date Wed, 16 Aug 2017 17:06:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16129087#comment-16129087 ]

ASF GitHub Bot commented on DRILL-5697:
---------------------------------------

GitHub user ppadma opened a pull request:

    https://github.com/apache/drill/pull/907

    DRILL-5697: Improve performance of filter operator for pattern matching

    Implement character-by-character matching for simple patterns and fall back to regex for
    more complicated patterns. See DRILL-5697 for more details.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ppadma/drill DRILL-5697

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/907.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #907
    
----
commit f384475baea0ab05e84b3ac7b45118bb0b0ea5ce
Author: Padma Penumarthy <ppenumar97@yahoo.com>
Date:   2017-08-07T22:11:00Z

    DRILL-5697: Improve performance of filter operator for pattern matching

----


> Improve performance of filter operator for pattern matching
> -----------------------------------------------------------
>
>                 Key: DRILL-5697
>                 URL: https://issues.apache.org/jira/browse/DRILL-5697
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Flow
>    Affects Versions: 1.11.0
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>
> Queries that filter with the SQL LIKE operator use the Java regex library for pattern matching.
> However, for cases like %abc (ends with abc), abc% (starts with abc), and %abc% (contains abc),
> implementing the match with simple code instead of the regex library was observed to give a
> good performance boost (4-6x). The idea is to use special-case code for simple, common
> patterns and fall back to the Java regex library for complicated ones. That should provide a
> good performance benefit for the most common cases.
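
The idea described in the issue can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the class name LikeMatcherSketch, its Kind enum, and its methods are hypothetical, it works on java.lang.String rather than on Drill's internal buffers, it uses the standard String methods in place of a hand-written character loop, and it assumes no ESCAPE clause (a backslash is treated as a literal character).

```java
import java.util.regex.Pattern;

// Hypothetical helper, not Drill's actual class: picks a cheap matcher for
// simple LIKE patterns and falls back to java.util.regex for everything else.
class LikeMatcherSketch {
    enum Kind { EQUALS, STARTS_WITH, ENDS_WITH, CONTAINS, REGEX }

    final Kind kind;
    private final String literal; // text compared against in the simple cases
    private final Pattern regex;  // compiled only for the fallback case

    LikeMatcherSketch(String likePattern) {
        int len = likePattern.length();
        boolean leading = len > 0 && likePattern.charAt(0) == '%';
        boolean trailing = len > 1 && likePattern.charAt(len - 1) == '%';
        // Pattern text with the optional leading/trailing '%' stripped off.
        String core = likePattern.substring(leading ? 1 : 0, trailing ? len - 1 : len);

        if (core.indexOf('%') >= 0 || core.indexOf('_') >= 0) {
            // Wildcards inside the pattern: too complex for the fast path.
            kind = Kind.REGEX;
            literal = null;
            regex = Pattern.compile(toRegex(likePattern), Pattern.DOTALL);
        } else {
            kind = leading ? (trailing ? Kind.CONTAINS : Kind.ENDS_WITH)
                           : (trailing ? Kind.STARTS_WITH : Kind.EQUALS);
            literal = core;
            regex = null;
        }
    }

    // Translate LIKE wildcards to regex, quoting every literal run of text.
    private static String toRegex(String likePattern) {
        StringBuilder sb = new StringBuilder();
        StringBuilder run = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%' || c == '_') {
                if (run.length() > 0) {
                    sb.append(Pattern.quote(run.toString()));
                    run.setLength(0);
                }
                sb.append(c == '%' ? ".*" : ".");
            } else {
                run.append(c);
            }
        }
        if (run.length() > 0) {
            sb.append(Pattern.quote(run.toString()));
        }
        return sb.toString();
    }

    boolean matches(String input) {
        switch (kind) {
            case EQUALS:      return input.equals(literal);
            case STARTS_WITH: return input.startsWith(literal);
            case ENDS_WITH:   return input.endsWith(literal);
            case CONTAINS:    return input.contains(literal);
            default:          return regex.matcher(input).matches();
        }
    }
}
```

In this sketch the pattern is classified once, when the matcher is constructed, so the per-row work for abc%, %abc, and %abc% is a plain string comparison. Only patterns with a wildcard in the middle (for example a%c or a_c) pay for a regex match.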



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
