hive-dev mailing list archives

From "Teddy Choi (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
Date Mon, 01 Jul 2013 04:16:20 GMT

     [ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Teddy Choi updated HIVE-4642:
-----------------------------

    Attachment: HIVE-4642-1.patch

I wrote a draft patch. It still needs more comments, tests, and refactoring.

I agree that finite automaton (FA) generation will be a heavy job, so I didn't implement it.
Common phone number patterns are covered with a simple fixed automaton. I will add more simple
automata.
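
To illustrate what a "simple fixed automaton" could look like, here is a minimal sketch. It is
not the code from HIVE-4642-1.patch. The class name PhoneNumberAutomaton, the pattern
\d{3}-\d{3}-\d{4}, and the whole-value match semantics are all assumptions made for the example.
It reads a byte range (array, start, length) because that is how string columns are usually
handed to vectorized code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: a hand-written automaton for the assumed pattern
// \d{3}-\d{3}-\d{4}, matched against the whole value.
// This is an illustration, not the implementation in the attached patch.
final class PhoneNumberAutomaton {
    private static final int PATTERN_LENGTH = 12; // ddd-ddd-dddd

    private PhoneNumberAutomaton() {}

    /** Returns true if bytes[start, start + length) is exactly ddd-ddd-dddd. */
    static boolean matches(byte[] bytes, int start, int length) {
        if (length != PATTERN_LENGTH) {
            return false;
        }
        for (int i = 0; i < PATTERN_LENGTH; i++) {
            byte b = bytes[start + i];
            if (i == 3 || i == 7) {
                // Positions 3 and 7 must hold the separator.
                if (b != '-') {
                    return false;
                }
            } else if (b < '0' || b > '9') {
                // Every other position must be an ASCII digit.
                return false;
            }
        }
        return true;
    }

    /** Convenience overload for experimenting with plain strings. */
    static boolean matches(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return matches(utf8, 0, utf8.length);
    }
}
```

Because the pattern is fixed, the loop needs no regex compilation, no backtracking, and no
string decoding, which is the reason to special-case it.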

There are already hard-coded decisions, and more will come, so I introduced an interface that
generalizes these decisions. It may reduce performance a little.
----
Class hierarchy:

AbstractFilterStringColLikeStringScalar
+ FilterStringColLikeStringScalar
+ FilterStringColRegExpStringScalar

AbstractFilter...#Checker
+ AbstractFilter...#BeginChecker
+ AbstractFilter...#EndChecker
+ AbstractFilter...#MiddleChecker
+ AbstractFilter...#NoneChecker
+ AbstractFilter...#AnyCharChecker
+ AbstractFilter...#ComplexChecker
+ FilterStringColRegExpStringScalar#PhoneNumberChecker

AbstractFilter...#CheckerFactory
+ Filter...Like...#LikeBeginCheckerFactory
+ Filter...Like...#LikeEndCheckerFactory
+ Filter...Like...#LikeMiddleCheckerFactory
+ Filter...Like...#LikeNoneCheckerFactory
+ Filter...Like...#LikeAnyCharCheckerFactory
+ Filter...Like...#LikeComplexCheckerFactory
+ Filter...RegExp...#RegExpBeginCheckerFactory
+ Filter...RegExp...#RegExpEndCheckerFactory
+ Filter...RegExp...#RegExpMiddleCheckerFactory
+ Filter...RegExp...#RegExpNoneCheckerFactory
+ Filter...RegExp...#RegExpAnyCharCheckerFactory
+ Filter...RegExp...#RegExpComplexCheckerFactory
+ Filter...RegExp...#RegExpPhoneNumberCheckerFactory
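
The Checker / CheckerFactory split above could be sketched roughly as follows for LIKE patterns.
This is a hedged illustration, not the patch itself. The method names (tryCreate, create), the
use of String instead of byte ranges, and the rule that a factory returns null when it does not
apply are all assumptions. The AnyChar and PhoneNumber checkers are left out for brevity.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the Checker / CheckerFactory idea for LIKE patterns.
// Names loosely follow the hierarchy in this message; the bodies are assumptions.
final class LikeCheckers {
    private LikeCheckers() {}

    interface Checker {
        boolean check(String value);
    }

    interface CheckerFactory {
        /** Returns a checker if this factory can handle the pattern, else null. */
        Checker tryCreate(String likePattern);
    }

    // Most specific factories first; the complex fallback always succeeds.
    private static final CheckerFactory[] FACTORIES = {
        LikeCheckers::none, LikeCheckers::begin, LikeCheckers::end,
        LikeCheckers::middle, LikeCheckers::complex
    };

    /** Picks the first factory that accepts the pattern. */
    static Checker create(String likePattern) {
        for (CheckerFactory factory : FACTORIES) {
            Checker checker = factory.tryCreate(likePattern);
            if (checker != null) {
                return checker;
            }
        }
        throw new IllegalStateException("unreachable: complex factory accepts everything");
    }

    private static boolean isLiteral(String s) {
        return s.indexOf('%') < 0 && s.indexOf('_') < 0 && s.indexOf('\\') < 0;
    }

    // "abc": no wildcard, plain equality.
    private static Checker none(String p) {
        return isLiteral(p) ? v -> v.equals(p) : null;
    }

    // "abc%": prefix test.
    private static Checker begin(String p) {
        if (!p.endsWith("%")) return null;
        String prefix = p.substring(0, p.length() - 1);
        return isLiteral(prefix) ? v -> v.startsWith(prefix) : null;
    }

    // "%abc": suffix test.
    private static Checker end(String p) {
        if (!p.startsWith("%")) return null;
        String suffix = p.substring(1);
        return isLiteral(suffix) ? v -> v.endsWith(suffix) : null;
    }

    // "%abc%": substring test.
    private static Checker middle(String p) {
        if (p.length() < 2 || !p.startsWith("%") || !p.endsWith("%")) return null;
        String inner = p.substring(1, p.length() - 1);
        return isLiteral(inner) ? v -> v.contains(inner) : null;
    }

    // Anything else: translate LIKE to a regex and fall back to java.util.regex.
    private static Checker complex(String p) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < p.length(); i++) {
            char c = p.charAt(i);
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                // A backslash escapes the next character, if there is one.
                if (c == '\\' && i + 1 < p.length()) {
                    c = p.charAt(++i);
                }
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        Pattern compiled = Pattern.compile(regex.toString(), Pattern.DOTALL);
        return v -> compiled.matcher(v).matches();
    }
}
```

A filter expression would call LikeCheckers.create(pattern) once per query and then invoke
check for each row. The extra interface call per row is the likely source of the small
performance cost mentioned above.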
                
> Implement vectorized RLIKE and REGEXP filter expressions
> --------------------------------------------------------
>
>                 Key: HIVE-4642
>                 URL: https://issues.apache.org/jira/browse/HIVE-4642
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Eric Hanson
>            Assignee: Teddy Choi
>         Attachments: HIVE-4642-1.patch
>
>
> See title. I will add more details next week. The goal is to (a) make this work correctly
> and (b) optimize it as well as possible, at least for the common cases.

