hadoop-hive-dev mailing list archives

From "Wojciech Langiewicz (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HIVE-1553) NPE when using complex string UDF
Date Thu, 19 Aug 2010 09:30:16 GMT

     [ https://issues.apache.org/jira/browse/HIVE-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wojciech Langiewicz updated HIVE-1553:
--------------------------------------

    Description: 
When executing this query: {code}select explode(split(city, "")) as char from users;{code}
I get an NPE: {code}java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFExplode.process(GenericUDTFExplode.java:70)
	at org.apache.hadoop.hive.ql.exec.UDTFOperator.processOp(UDTFOperator.java:98)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:386)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:81)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:386)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:43)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:386)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:347)
	at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:171)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
	at org.apache.hadoop.mapred.Child.main(Child.java:170){code}
The NPE does not occur for this query: {code}select explode(split(city, "")) as char from users where id = 234234;{code}
But for this query: {code}select explode(split(city, "")) as char from users where id > 0;{code}
some mappers succeed, but most of them fail, so the whole task fails.
city is a string column, and the maximum users.id is about 30M.
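
One possible reading of the data-dependent failures, offered as an assumption rather than a confirmed diagnosis: if split() yields a null array for rows where city is NULL, then an explode step that iterates the array without a null check would throw only in mappers whose input contains such rows. The sketch below is not Hive's source code. The class ExplodeSketch and its methods splitChars, explodeUnsafe and explodeSafe are hypothetical names that only illustrate the difference between an unguarded and a guarded iteration.

```java
import java.util.ArrayList;
import java.util.List;

public class ExplodeSketch {

    // Hypothetical stand-in for split(city, ""): one element per character.
    // Assumption: a NULL input produces a null list instead of an empty one.
    public static List<String> splitChars(String city) {
        if (city == null) {
            return null;
        }
        List<String> chars = new ArrayList<>();
        for (int i = 0; i < city.length(); i++) {
            chars.add(String.valueOf(city.charAt(i)));
        }
        return chars;
    }

    // Unguarded explode: iterating a null list throws NullPointerException.
    public static List<String> explodeUnsafe(List<String> values) {
        List<String> rows = new ArrayList<>();
        for (String value : values) {
            rows.add(value);
        }
        return rows;
    }

    // Guarded explode: a null list produces no output rows.
    public static List<String> explodeSafe(List<String> values) {
        List<String> rows = new ArrayList<>();
        if (values == null) {
            return rows;
        }
        rows.addAll(values);
        return rows;
    }

    public static void main(String[] args) {
        // A non-null city explodes into one row per character.
        System.out.println(explodeSafe(splitChars("Oslo"))); // prints [O, s, l, o]

        // A NULL city yields no rows with the guard in place.
        System.out.println(explodeSafe(splitChars(null)).size()); // prints 0

        // Without the guard, the same input fails.
        try {
            explodeUnsafe(splitChars(null));
        } catch (NullPointerException e) {
            System.out.println("NPE on null input");
        }
    }
}
```

Under this assumption, a filter that selects a single row with a non-NULL city would succeed, while a filter that matches nearly every row would fail in any mapper that encounters a NULL city. It does not explain the "UDTF's should not output rows on close" error reported for the query that filters on city is not null.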

I have run another query: {code}select explode(split(city, "")) as char from users where city is not null;{code}
and now the error I get is: {code}org.apache.hadoop.hive.ql.metadata.HiveException: UDTF's should not output rows on close
	at org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:111)
	at org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:40)
	at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:81)
	at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFExplode.process(GenericUDTFExplode.java:72)
	at org.apache.hadoop.hive.ql.exec.UDTFOperator.processOp(UDTFOperator.java:98)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:386)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:81)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:386)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:73)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:386)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:73)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:386)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:43)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:386)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:347)
	at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:171)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
	at org.apache.hadoop.mapred.Child.main(Child.java:170){code}

  was:
When executing this query: {code}select explode(split(city, "")) as char from users;{code}
I get an NPE: {code}java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFExplode.process(GenericUDTFExplode.java:70)
	at org.apache.hadoop.hive.ql.exec.UDTFOperator.processOp(UDTFOperator.java:98)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:386)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:81)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:386)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:43)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:386)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:347)
	at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:171)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
	at org.apache.hadoop.mapred.Child.main(Child.java:170){code}
The NPE does not occur for this query: {code}select explode(split(city, "")) as char from users where id = 234234;{code}
But for this query: {code}select explode(split(city, "")) as char from users where id > 0;{code}
some mappers succeed, but most of them fail, so the whole task fails.
city is a string column, and the maximum users.id is about 30M.


added another test query with error

> NPE when using complex string UDF
> ---------------------------------
>
>                 Key: HIVE-1553
>                 URL: https://issues.apache.org/jira/browse/HIVE-1553
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: UDF
>    Affects Versions: 0.5.0
>         Environment: CDH3B2 version on debian
>            Reporter: Wojciech Langiewicz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

