hive-dev mailing list archives

From "Teddy Choi (JIRA)" <>
Subject [jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
Date Wed, 24 Jul 2013 15:15:51 GMT


Teddy Choi updated HIVE-4642:

    Attachment: Hive-Vectorized-Query-Execution-Design-rev10.docx

I wrote the "LIKE and REGEXP expressions:" section under "Filter operator". The added text follows.
Filter condition expressions

LIKE and REGEXP expressions:

LIKE and REGEXP expressions find strings that match a pattern. They compile the pattern once on
creation and match strings against it on evaluation.
Both kinds of expression use the Java regular expression package. REGEXP expressions use the
package as is. LIKE expressions have a different grammar, so their patterns need conversion: “%”
is converted to “.*” and “_” is converted to “.”. The AbstractFilterStringColLikeStringScalar
class defines the common behavior. The FilterStringColLikeStringScalar and FilterStringColRegExpStringScalar
classes implement the differences.
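
The conversion described above could be sketched roughly as follows. This is an illustration only, not the code in the attached patches: the class LikeToRegex and its methods are hypothetical names, the sketch ignores any LIKE escape character, and it assumes every character other than “%” and “_” should match literally (so it is quoted with Pattern.quote).

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not a class from the Hive patches.
final class LikeToRegex {

    // Convert a LIKE pattern to a Java regex: '%' -> ".*", '_' -> ".".
    // Runs of ordinary characters are quoted so that regex metacharacters
    // in the LIKE pattern (such as '.') are matched literally.
    static String likeToRegex(String like) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (int i = 0; i < like.length(); i++) {
            char c = like.charAt(i);
            if (c == '%' || c == '_') {
                flushLiteral(literal, regex);
                regex.append(c == '%' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        flushLiteral(literal, regex);
        return regex.toString();
    }

    // Append the pending literal run, quoted, and reset the buffer.
    private static void flushLiteral(StringBuilder literal, StringBuilder regex) {
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
            literal.setLength(0);
        }
    }

    // Compile once on creation; DOTALL is an assumption so that '%' and '_'
    // also match line terminators.
    static Pattern compileLike(String like) {
        return Pattern.compile(likeToRegex(like), Pattern.DOTALL);
    }

    public static void main(String[] args) {
        Pattern p = compileLike("%.txt");
        System.out.println(p.matcher("notes.txt").matches());
        System.out.println(p.matcher("notes_txt").matches());
    }
}
```

Compiling in a constructor and reusing the Pattern for every row matches the "compile on creation, find on evaluation" split described above.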
Some simple patterns are used frequently, such as prefix match, suffix match, middle
match, exact match, and phone numbers. There are optimized implementations for them. They
evaluate on the byte arrays directly to avoid the cost of UTF-8 decoding.
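
As a rough sketch of the byte-level idea, prefix, suffix, and middle matches can be checked without building a String. The class BytePatternChecks and its methods are hypothetical, and the (bytes, start, len) layout for a string value is an assumption made for the example. Comparing raw bytes is safe for these cases because UTF-8 is self-synchronizing: a validly encoded pattern can only match at character boundaries of a validly encoded value.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not a class from the Hive patches.
final class BytePatternChecks {

    // True if pattern occurs in bytes starting exactly at offset.
    // The caller guarantees offset + pattern.length is within bounds.
    private static boolean regionEquals(byte[] bytes, int offset, byte[] pattern) {
        for (int i = 0; i < pattern.length; i++) {
            if (bytes[offset + i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }

    // Prefix match, e.g. LIKE 'abc%'.
    static boolean startsWith(byte[] bytes, int start, int len, byte[] prefix) {
        return len >= prefix.length && regionEquals(bytes, start, prefix);
    }

    // Suffix match, e.g. LIKE '%abc'.
    static boolean endsWith(byte[] bytes, int start, int len, byte[] suffix) {
        return len >= suffix.length
                && regionEquals(bytes, start + len - suffix.length, suffix);
    }

    // Middle match, e.g. LIKE '%abc%' (naive scan, adequate for a sketch).
    static boolean contains(byte[] bytes, int start, int len, byte[] needle) {
        int last = start + len - needle.length;
        for (int i = start; i <= last; i++) {
            if (regionEquals(bytes, i, needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] row = "hello world".getBytes(StandardCharsets.UTF_8);
        byte[] pat = "world".getBytes(StandardCharsets.UTF_8);
        System.out.println(endsWith(row, 0, row.length, pat));
    }
}
```

The pattern bytes would be encoded once when the expression is created, so evaluation touches only the column's byte arrays.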

This file was edited in Word for Mac 2011, so it may have compatibility problems.
> Implement vectorized RLIKE and REGEXP filter expressions
> --------------------------------------------------------
>                 Key: HIVE-4642
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Eric Hanson
>            Assignee: Teddy Choi
>         Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, HIVE-4642.3.patch.txt, Hive-Vectorized-Query-Execution-Design-rev10.docx
> See title. I will add more details next week. The goal is (a) make this work correctly
> and (b) optimize it as well as possible, at least for the common cases.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
