pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael Howard (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-5328) expressionOperator Divide.equalsZero(DataType.BIGDECIMAL) is invalid
Date Wed, 17 Jan 2018 21:14:00 GMT

    [ https://issues.apache.org/jira/browse/PIG-5328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329483#comment-16329483
] 

Michael Howard commented on PIG-5328:
-------------------------------------

Context/background:

BigDecimal is difficult to work with.

I did not see references in the pig documentation to how BigDecimal issues of precision, scale,
and RoundingMode are handled for the pig 'bigdecimal' data type. So I specifically went looking
at the Divide code to see how the pig implementation addressed these issues. I observed that
the divide operation is performed with a min of 6 digits of scale to the right of the decimal
point and RoundingMode.HALF_UP. (Arguably, RoundingMode.HALF_EVEN would have been the preferred
choice.)

I stumbled on the equalsZero() code and got somewhat nervous when I saw that it used BigDecimal.equals().
The behavior of BigDecimal.equals() is pretty fundamental to the understanding and proper
usage of the BigDecimal API. The fact that the equalsZero() code was written this way is a
strong indication that the implementor had very little exposure/experience with the BigDecimal
package. 

I am not familiar with the pig code, but I will try to help. 

> expressionOperator Divide.equalsZero(DataType.BIGDECIMAL) is invalid
> --------------------------------------------------------------------
>
>                 Key: PIG-5328
>                 URL: https://issues.apache.org/jira/browse/PIG-5328
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.16.0, 0.17.0
>         Environment: pig source HEAD as of Jan 2018 ... probably goes all the way back
to initial implementation of BigDecimal support
>            Reporter: Michael Howard
>            Priority: Major
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> Divide.equalsZero(DataType.BIGDECIMAL) is flawed in that it uses an invalid test for ==
ZERO in the case of BigDecimal. 
>  
> ./physicalLayer/expressionOperators/Divide.java tests the divisor for zero in order to
avoid DivideByZero.
> The test is performed using a method equalsZero(...)
> Divide.equalsZero() is given 'protected' access, but I could not find other references
... should be 'private'
> equalsZero() implementation dispatches on dataType to type-specific predicates ... the
BigDecimal implementation is incorrect 
> The method BigDecimal.equals(other) is intended to be used for object equality, not numerical
equality. (Their justification is that equals() is used in hash-table lookups in java Collections.)
BigDecimal numbers are not normalized and scale is an important attribute. Scale is included
in BigDecimal.equals(). The values "0" and "0.00" have different scales and are not considered
"equals()"
> Comparisons for numeric equality need to be done using compareTo()
> In the special case of comparing to zero, BigDecimal.signum() is the best. 
> The current code is
> {{     case DataType.BIGDECIMAL:}}
> {{            return BigDecimal.ZERO.equals((BigDecimal) a);}}
> needs to be changed to
> {{     case DataType.BIGDECIMAL:}}
> {{            return ((BigDecimal) a).signum() == 0;}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message