hive-dev mailing list archives

From "Mitja Trampus (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-2620) LIKE incorrectly transforms expression to regex (does not escape "+" and possibly other special chars)
Date Thu, 01 Dec 2011 18:34:40 GMT

     [ https://issues.apache.org/jira/browse/HIVE-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mitja Trampus updated HIVE-2620:
--------------------------------

    Description: 
Whenever a LIKE expression contains "|+" (the culprit) together with "%" (so that it gets
converted to a regex), Hive throws an exception that crashes the whole job.

Possibly related: https://issues.apache.org/jira/browse/HIVE-2594

{noformat}
hive> select 'foo |+18| bar' like 'foo |+18% bar' from akramer_one_row;
FAILED: Error in semantic analysis: Line 1:7 Wrong arguments ''foo |+18% bar'': org.apache.hadoop.hive.ql.metadata.HiveException:
Unable to execute method public org.apache.hadoop.io.BooleanWritable org.apache.hadoop.hive.ql.udf.UDFLike.evaluate(org.apache.hadoop.io.Text,org.apache.hadoop.io.Text)
 on object org.apache.hadoop.hive.ql.udf.UDFLike@292e2fba of class org.apache.hadoop.hive.ql.udf.UDFLike
with arguments {foo |+18| bar:org.apache.hadoop.io.Text, foo |+% bar:org.apache.hadoop.io.Text}
of size 2
{noformat}
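
The failure can probably be reproduced outside Hive. The following is a minimal sketch, not Hive's actual UDFLike code: the class DanglingPlusDemo and its methods are hypothetical names, and the translation is assumed to rewrite only "%" to ".*" while passing every other character to the regex engine unchanged, which is what the regex in the stack trace suggests.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class DanglingPlusDemo {
    // Assumed naive translation: '%' becomes ".*", nothing else is escaped.
    public static String naiveLikeToRegex(String like) {
        return like.replace("%", ".*");
    }

    // Returns "ok" if the regex compiles, otherwise the parser's error description.
    public static String tryCompile(String regex) {
        try {
            Pattern.compile(regex);
            return "ok";
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        String regex = naiveLikeToRegex("foo |+18% bar");
        // '|' starts a new alternative, so the '+' that follows has nothing to quantify.
        System.out.println(regex + " -> " + tryCompile(regex));
    }
}
```

With this translation the "+" directly after "|" has no preceding element to repeat, so compilation is expected to fail with the same "Dangling meta character" error that appears in the trace.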

Stack trace from the real-world example in which I found this:
{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public
org.apache.hadoop.io.BooleanWritable org.apache.hadoop.hive.ql.udf.UDFLike.evaluate(org.apache.hadoop.io.Text,org.apache.hadoop.io.Text)
 on object org.apache.hadoop.hive.ql.udf.UDFLike@4a7baf7d of class org.apache.hadoop.hive.ql.udf.UDFLike
with arguments {ewt.arkadaslar pazartesinden sonra ozel escortlar sayfamızı zıyaret etcek
lutfn kaba dawranmıyalım escortlarımız resmlı olcak sız begenıceksınız escortunuzu
escortlarımı ıl ıl olacktır bılgnıze:org.apache.hadoop.io.Text, %çıtıR%kızLar%escort%kızLarı%burda%|+%18%|%:org.apache.hadoop.io.Text}
of size 2
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:836)
	at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:180)
	at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.evaluate(ExprNodeGenericFuncEvaluator.java:163)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:575)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:767)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:722)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:765)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:765)
	at org.apache.hadoop.hive.ql.exec.UnionOperator.processOp(UnionOperator.java:129)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:765)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:531)
	... 5 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:812)
	... 19 more
Caused by: java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 42
.*çıtıR.*kızLar.*escort.*kızLarı.*burda.*|+.*18.*|.*
                                          ^
	at java.util.regex.Pattern.error(Pattern.java:1713)
	at java.util.regex.Pattern.sequence(Pattern.java:1878)
	at java.util.regex.Pattern.expr(Pattern.java:1752)
	at java.util.regex.Pattern.compile(Pattern.java:1460)
	at java.util.regex.Pattern.<init>(Pattern.java:1133)
	at java.util.regex.Pattern.compile(Pattern.java:823)
	at org.apache.hadoop.hive.ql.udf.UDFLike.evaluate(UDFLike.java:186)
	... 23 more

{noformat}
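
One possible way to avoid the exception is to quote every literal run of the LIKE pattern before handing it to the regex engine. The sketch below is only an illustration under stated assumptions: LikeToRegex, likeToRegex and like are hypothetical names, not Hive's API, and backslash escapes inside the LIKE pattern are deliberately ignored to keep it short.

```java
import java.util.regex.Pattern;

public class LikeToRegex {
    // Translate a SQL LIKE pattern to a Java regex:
    // '%' -> ".*", '_' -> ".", every other run of characters is quoted literally.
    public static String likeToRegex(String like) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (int i = 0; i < like.length(); i++) {
            char c = like.charAt(i);
            if (c == '%' || c == '_') {
                flushLiteral(literal, regex);
                regex.append(c == '%' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        flushLiteral(literal, regex);
        return regex.toString();
    }

    // Append the pending literal run, wrapped by Pattern.quote, and reset it.
    private static void flushLiteral(StringBuilder literal, StringBuilder regex) {
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
            literal.setLength(0);
        }
    }

    // LIKE must match the whole value, hence matches() rather than find().
    public static boolean like(String value, String pattern) {
        return Pattern.compile(likeToRegex(pattern), Pattern.DOTALL)
                      .matcher(value)
                      .matches();
    }

    public static void main(String[] args) {
        System.out.println(like("foo |+18| bar", "foo |+18% bar"));
    }
}
```

Pattern.quote is used here instead of escaping individual characters so that "+", "|" and any other metacharacter are covered without maintaining a list of special characters.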

  was:
Whenever you have a LIKE expression that contains "|+" (the culprit) and "%" (so it gets converted
to regex), hive throws an exception that crashes the whole job.

hive> select 'foo |+18| bar' like 'foo |+18% bar' from akramer_one_row;
FAILED: Error in semantic analysis: Line 1:7 Wrong arguments ''foo |+18% bar'': org.apache.hadoop.hive.ql.metadata.HiveException:
Unable to execute method public org.apache.hadoop.io.BooleanWritable org.apache.hadoop.hive.ql.udf.UDFLike.evaluate(org.apache.hadoop.io.Text,org.apache.hadoop.io.Text)
 on object org.apache.hadoop.hive.ql.udf.UDFLike@292e2fba of class org.apache.hadoop.hive.ql.udf.UDFLike
with arguments {foo |+18| bar:org.apache.hadoop.io.Text, foo |+% bar:org.apache.hadoop.io.Text}
of size 2

Stack trace from the real world example with which I found this:
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public
org.apache.hadoop.io.BooleanWritable org.apache.hadoop.hive.ql.udf.UDFLike.evaluate(org.apache.hadoop.io.Text,org.apache.hadoop.io.Text)
 on object org.apache.hadoop.hive.ql.udf.UDFLike@4a7baf7d of class org.apache.hadoop.hive.ql.udf.UDFLike
with arguments {ewt.arkadaslar pazartesinden sonra ozel escortlar sayfamızı zıyaret etcek
lutfn kaba dawranmıyalım escortlarımız resmlı olcak sız begenıceksınız escortunuzu
escortlarımı ıl ıl olacktır bılgnıze:org.apache.hadoop.io.Text, %çıtıR%kızLar%escort%kızLarı%burda%|+%18%|%:org.apache.hadoop.io.Text}
of size 2
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:836)
	at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:180)
	at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.evaluate(ExprNodeGenericFuncEvaluator.java:163)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:575)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:767)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:722)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:765)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:765)
	at org.apache.hadoop.hive.ql.exec.UnionOperator.processOp(UnionOperator.java:129)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:765)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:531)
	... 5 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:812)
	... 19 more
Caused by: java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 42
.*çıtıR.*kızLar.*escort.*kızLarı.*burda.*|+.*18.*|.*
                                          ^
	at java.util.regex.Pattern.error(Pattern.java:1713)
	at java.util.regex.Pattern.sequence(Pattern.java:1878)
	at java.util.regex.Pattern.expr(Pattern.java:1752)
	at java.util.regex.Pattern.compile(Pattern.java:1460)
	at java.util.regex.Pattern.<init>(Pattern.java:1133)
	at java.util.regex.Pattern.compile(Pattern.java:823)
	at org.apache.hadoop.hive.ql.udf.UDFLike.evaluate(UDFLike.java:186)
	... 23 more

    
> LIKE incorrectly transforms expression to regex (does not escape "+" and possibly other
special chars)
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-2620
>                 URL: https://issues.apache.org/jira/browse/HIVE-2620
>             Project: Hive
>          Issue Type: Bug
>          Components: UDF
>            Reporter: Mitja Trampus

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
