hadoop-pig-dev mailing list archives

From "Gianmarco De Francisci Morales (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-566) Dump and store outputs do not match for PigStorage
Date Mon, 10 May 2010 23:06:30 GMT

    [ https://issues.apache.org/jira/browse/PIG-566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865971#action_12865971 ]

Gianmarco De Francisci Morales commented on PIG-566:

It also still fails when run manually, because there is a bug in UTF8StorageConverter.
This patch merely exposed it.

The problem is that bytesToInteger relied on the final L to recognize longs and avoid converting

More specifically, after parsing with Double d = Double.valueOf(s);
the method checks by hand whether the value is convertible to a double and whether that
double actually fits in an int.
The check is performed only on the upper bound, so it would fail to catch a double that
is less than Integer.MIN_VALUE.

I actually also do not understand why the check was done in this way:
d.doubleValue() > mMaxInt.doubleValue() + 1.0

I refactored it into:
Double.compare(d.doubleValue(), mMaxInt.doubleValue()) > 0
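
To make the edge cases concrete, here is a minimal sketch of a conversion that checks both
bounds. It is not the code from the patch: the class name IntRangeCheckSketch and the helper
toIntOrNull are hypothetical, and Integer.MAX_VALUE / Integer.MIN_VALUE stand in for the
mMaxInt field and its assumed lower-bound counterpart.

```java
final class IntRangeCheckSketch {
    // Hypothetical helper for illustration; not the actual Pig method.
    // Returns the int value of s, or null if s is not numeric or out of int range.
    static Integer toIntOrNull(String s) {
        try {
            // Fast path: the string is a plain integer literal.
            return Integer.valueOf(s);
        } catch (NumberFormatException nfe) {
            // Fall through and try to read it as a double.
        }
        try {
            Double d = Double.valueOf(s);
            // Check BOTH bounds; the original check only covered the upper one.
            if (Double.compare(d.doubleValue(), (double) Integer.MAX_VALUE) > 0
                    || Double.compare(d.doubleValue(), (double) Integer.MIN_VALUE) < 0) {
                return null;
            }
            // In range: truncate toward zero.
            return Integer.valueOf(d.intValue());
        } catch (NumberFormatException nfe) {
            return null;
        }
    }
}
```

With this sketch, "2147483648" and "-2147483649" are both rejected, while "3.7" is truncated
to 3 because Double.intValue() rounds toward zero.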

I added some unit tests for these edge cases.

I also adapted bytesToLong for consistency, even though the bug is not as evident there because
of rounding (Long.MAX_VALUE cannot be represented exactly as a double; it rounds to 2^63).
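
The rounding effect for longs can be seen in a small sketch. The class name LongRoundingSketch
and its helpers are hypothetical and only illustrate the comparison; they are not taken from
the patch.

```java
final class LongRoundingSketch {
    // Long.MAX_VALUE (2^63 - 1) has no exact double representation;
    // the conversion rounds to the nearest double, which is 2^63.
    static double maxLongAsDouble() {
        return (double) Long.MAX_VALUE;
    }

    // Same style of upper-bound check as for ints, applied to longs.
    static boolean exceedsMaxLong(double d) {
        return Double.compare(d, (double) Long.MAX_VALUE) > 0;
    }
}
```

Here exceedsMaxLong(0x1p63) returns false even though 2^63 is one more than Long.MAX_VALUE,
and the cast (long) 0x1p63 saturates to Long.MAX_VALUE, so the out-of-range input goes
unnoticed. Only the next larger double, Math.nextUp(0x1p63), is flagged.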

The new patch passes all tests locally.

> Dump and store outputs do not match for PigStorage
> --------------------------------------------------
>                 Key: PIG-566
>                 URL: https://issues.apache.org/jira/browse/PIG-566
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.7.0, 0.8.0
>            Reporter: Santhosh Srinivasan
>            Assignee: Gianmarco De Francisci Morales
>            Priority: Minor
>             Fix For: 0.7.0, 0.8.0
>         Attachments: PIG-566.patch, PIG-566.patch, PIG-566.patch
> The dump and store formats for PigStorage do not match for longs and floats.
> {code}
> grunt> y = foreach x generate {(2985671202194220139L)};
> grunt> describe y;
> y: {{(long)}}
> grunt> dump y;
> ({(2985671202194220139L)})
> grunt> store y into 'y';
> grunt> cat y
> {(2985671202194220139)}
> {code}

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
