flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-9990) Add regex_extract supported in TableAPI and SQL
Date Sat, 04 Aug 2018 04:00:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-9990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569039#comment-16569039 ]

ASF GitHub Bot commented on FLINK-9990:
---------------------------------------

yanghua commented on a change in pull request #6448: [FLINK-9990] [table] Add regex_extract
supported in TableAPI and SQL
URL: https://github.com/apache/flink/pull/6448#discussion_r207699091
 
 

 ##########
 File path: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/expressions/ScalarFunctionsTest.scala
 ##########
 @@ -450,6 +450,40 @@ class ScalarFunctionsTest extends ScalarTypesTestBase {
       "1111111111111111111111111111111111111111111111111111111111111111")
   }
 
+  @Test
+  def testRegexExtract(): Unit = {
 
 Review comment:
   Good point. There is a problem here; I wrote this case to test it:
   
   ```scala
   testAllApis(
     // OK: the method receives 'foo([\w]+)'
     "foothebar".regexExtract("foo([\\w]+)", 1),
     // Fails: the method receives 'foo([\\w]+)' and returns "";
     // passing 'foo([\\w]+)' instead causes a compile error
     "'foothebar'.regexExtract('foo([\\\\w]+)', 1)",
     // OK: the method receives 'foo([\w]+)', but four '\' must be passed
     "REGEX_EXTRACT('foothebar', 'foo([\\\\w]+)', 1)",
     "thebar"
   )
   ```
   
   It seems Flink pre-processes regex strings that contain `\xxx` escapes. A few days ago, we
   hit the same issue when testing `similar to` against a regex that contains `\d`.
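
To make the difference between the two patterns concrete, here is a minimal sketch using plain `java.util.regex`. It only shows what the regex engine does with one versus two backslashes; `RegexEscapingSketch` and its `regexExtract` helper are hypothetical stand-ins, not Flink's implementation, and the sketch does not reproduce Flink's expression or SQL parsing.

```scala
import java.util.regex.Pattern

object RegexEscapingSketch {
  // Hypothetical helper (an assumption, not Flink's code): returns the given
  // capture group of the first match, or "" when there is no match.
  def regexExtract(input: String, regex: String, group: Int): String = {
    val m = Pattern.compile(regex).matcher(input)
    if (m.find() && group <= m.groupCount()) Option(m.group(group)).getOrElse("")
    else ""
  }

  def main(args: Array[String]): Unit = {
    // Scala literal "foo([\\w]+)" is the string foo([\w]+):
    // the character class is \w, i.e. any word character.
    val oneBackslash = "foo([\\w]+)"

    // Scala literal "foo([\\\\w]+)" is the string foo([\\w]+):
    // the character class now holds a literal backslash and the letter 'w'.
    val twoBackslashes = "foo([\\\\w]+)"

    println(regexExtract("foothebar", oneBackslash, 1))   // prints "thebar"
    println(regexExtract("foothebar", twoBackslashes, 1)) // prints an empty line
  }
}
```

If the method really receives `foo([\\w]+)`, an empty result is what the regex engine alone would produce, so the extra backslash is likely introduced before the pattern reaches the function.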
   
   cc @twalthr 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Add regex_extract supported in TableAPI and SQL
> -----------------------------------------------
>
>                 Key: FLINK-9990
>                 URL: https://issues.apache.org/jira/browse/FLINK-9990
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>            Reporter: vinoyang
>            Assignee: vinoyang
>            Priority: Minor
>              Labels: pull-request-available
>
> regex_extract is a very useful function: it returns a string based on a regex pattern and an index.
> For example:
> {code:java}
> regexp_extract('foothebar', 'foo(.*?)(bar)', 2) // returns 'bar'
> {code}
> It is provided as a UDF in Hive; for more details, see [1].
> [1]: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)