hive-dev mailing list archives

From "Ryan Shih (JIRA)" <>
Subject [jira] Commented: (HIVE-320) Issuing queries with COUNT(DISTINCT) on a column that may contain null values hits a NPE
Date Fri, 13 Mar 2009 17:19:54 GMT


Ryan Shih commented on HIVE-320:

Fixes the problem on my test case as well.

Spent the better part of yesterday trying to distill a test case for you guys. It seems to
happen only if there are at least two rows in which the count(distinct) column is null. Possibly
it only occurs with custom serdes: while I was adapting our proprietary format to something
I could post, the problem seemed to go away once I used TextInputFormat. I can't confirm this
100%, though. Since I saw that a patch had been submitted, I stopped trying to narrow it down
further.
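
For reference, the result the query should produce on such input can be sketched outside
Hive. This is only a minimal Java illustration, not Hive's GroupByOperator code and not the
submitted patch; the class and method names are hypothetical. It assumes the standard SQL
rule that COUNT(DISTINCT col) ignores NULLs, so rows with a null column are skipped rather
than dereferenced.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NullSafeCountDistinct {
    // Hypothetical helper (not part of Hive): counts distinct non-null values,
    // following the SQL rule that COUNT(DISTINCT col) ignores NULLs.
    static long countDistinct(List<String> values) {
        Set<String> seen = new HashSet<>();
        for (String v : values) {
            if (v == null) {
                continue; // skip nulls instead of calling methods on them
            }
            seen.add(v);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Two rows with a null middle_name, the case described above.
        List<String> middleNames = Arrays.asList("Lee", null, null, "Ann", "Lee");
        System.out.println(countDistinct(middleNames)); // prints 2
    }
}
```

With two null rows in the input, the expected count is the number of distinct non-null
values, which is 2 here.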

> Issuing queries with COUNT(DISTINCT) on a column that may contain null values hits a NPE
> ----------------------------------------------------------------------------------------
>                 Key: HIVE-320
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Ryan Shih
>            Assignee: Joydeep Sen Sarma
>            Priority: Blocker
>         Attachments: hive.320.1.patch
> When issuing a COUNT(DISTINCT) query on a column that may contain null values, I get an NPE.
> E.g. if 'middle_name' potentially holds null values,
> select count(distinct middle_name) from people; will fail with the below exception.
> Other queries that work with the same input set:
> select distinct middle_name from people;
> select count(1), middle_name from people group by middle_name;
> org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
> 	at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(
> 	at
> 	at org.apache.hadoop.mapred.TaskTracker$Child.main(
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
> 	at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(
> 	at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(
> 	... 2 more
> Caused by: java.lang.NullPointerException
> 	at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(
> 	at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(
> 	at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(
> 	... 3 more

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
