hadoop-pig-dev mailing list archives

From "Dmitriy V. Ryaboy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1014) Pig should convert COUNT(relation) to COUNT_STAR(relation) so that all records are counted without considering nullness of the fields in the records
Date Thu, 15 Oct 2009 14:12:31 GMT

    [ https://issues.apache.org/jira/browse/PIG-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766069#action_12766069 ]

Dmitriy V. Ryaboy commented on PIG-1014:
----------------------------------------

A link that talks about some of the more "interesting" behaviors of NULL in SQL: http://thoughts.j-davis.com/2009/08/02/what-is-the-deal-with-nulls/

The difference between COUNT and COUNT_STAR is that COUNT_STAR counts nulls. I think this
ticket boils down to the question, "What do we consider a null tuple?" At the moment, we
look only at A.$0 to decide whether the tuple is null; that doesn't seem right, and it
surprises users. We have two options that both make sense: a null tuple is a tuple in which
all fields are null, or a null tuple is a tuple that is completely null (i.e., doesn't even
have any fields). I am in favor of the first definition, which is a superset of the second.
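
To make the two candidate definitions concrete, here is a minimal sketch in Python (not Pig
code, and not how Pig implements COUNT). It assumes a record is either None (a completely
null tuple) or a Python tuple whose fields may be None; the function names are made up for
illustration. It compares the current first-field check with both proposed definitions.

```python
# Illustrative model only: a record is None (completely null tuple) or a
# Python tuple whose fields may be None. All names here are hypothetical.

def first_field_is_null(t):
    """Current behavior described above: only the first field ($0) is checked."""
    return t is None or len(t) == 0 or t[0] is None

def all_fields_null(t):
    """First definition: every field is null (also true for a completely null tuple)."""
    return t is None or all(f is None for f in t)

def completely_null(t):
    """Second definition: the tuple itself is null or has no fields at all."""
    return t is None or len(t) == 0

def count(records, is_null):
    """COUNT-like: skip every record that is_null treats as a null tuple."""
    return sum(1 for t in records if not is_null(t))

def count_star(records):
    """COUNT_STAR-like: count every record, null or not."""
    return len(records)

records = [(1, "a"), (None, "b"), (None, None), None]

counts = {
    "first_field": count(records, first_field_is_null),
    "all_fields": count(records, all_fields_null),
    "completely_null": count(records, completely_null),
    "count_star": count_star(records),
}
print(counts)
```

In this model the record (None, "b") is the surprising case: the first-field check drops it,
while both proposed definitions still count it.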

> Pig should convert COUNT(relation) to COUNT_STAR(relation) so that all records are counted without considering nullness of the fields in the records
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1014
>                 URL: https://issues.apache.org/jira/browse/PIG-1014
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Pradeep Kamath
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

