hadoop-pig-dev mailing list archives

From "Thejas M Nair (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1016) Reading in map data seems broken
Date Tue, 24 Nov 2009 01:16:40 GMT

    [ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781727#action_12781727 ]

Thejas M Nair commented on PIG-1016:
------------------------------------

I agree with hc busy that PigStorage in its current state is broken.
It does not support storing complex data types as map values. But the approach used
before PIG-880 has issues of its own, like the one Santhosh mentioned:
bq. Suppose, the records is 'key'#1234567890124567. PIG-880 would treat the value as a string
and there would be no problem. Now, with the changes reverted, the type is inferred as integer
and the parsing will fail as the value is too big to fit into an integer
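
To make the failure mode concrete, here is a rough sketch in Python (not Pig's actual parser). The function names and the rule "all-digit text is a 32-bit integer" are assumptions made only for illustration.

```python
import re

# Range of a 32-bit signed integer (assumed here to be what "integer" means).
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def read_as_string(text):
    """PIG-880-style handling as described above: the value stays a string."""
    return text

def infer_unquoted(text):
    """Hypothetical inference: all-digit text is taken to be a 32-bit integer."""
    if re.fullmatch(r"-?[0-9]+", text):
        value = int(text)
        if not INT_MIN <= value <= INT_MAX:
            # The text looks like an integer but cannot be stored as one.
            raise OverflowError(f"{text} does not fit into a 32-bit integer")
        return value
    return text
```

With the value from Santhosh's example, `read_as_string("1234567890124567")` returns the text unchanged, while `infer_unquoted("1234567890124567")` raises an error.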

This problem arises because strings can have arbitrary values and can resemble other types.
 This ambiguity in identifying types can be fixed if we require strings to be quoted in the
file.  
I propose creating a new load/store function, PigStorage2, requiring strings to be quoted in
it, and applying the changes that hc busy proposed in this patch. This could be done in PIG-1083.
I am not sure whether we should revert PigStorage to its pre-PIG-880 behavior.
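
As a rough illustration of the quoting rule (a sketch in Python, not a proposed PigStorage2 implementation): the single-quote delimiter, the `parse_scalar` name, and the choice to reject unquoted non-numeric text are all assumptions, and only integers are handled.

```python
import re

def parse_scalar(text):
    """Hypothetical rule: quoted text is always a string; unquoted text must be numeric."""
    # Assumption: strings are delimited by single quotes, as the key is in 'key'#123.
    if len(text) >= 2 and text[0] == "'" and text[-1] == "'":
        return text[1:-1]
    if re.fullmatch(r"-?[0-9]+", text):
        # Python integers are unbounded; a real loader would choose int or long here.
        return int(text)
    # Assumption: with mandatory quoting, anything else is a malformed record.
    raise ValueError(f"unquoted non-numeric value: {text!r}")
```

Under this rule `'1234567890124567'` (quoted) is read as a string and `1234567890124567` (unquoted) as a number, so the type no longer has to be guessed from the value's appearance.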

Comments?


> Reading in map data seems broken
> --------------------------------
>
>                 Key: PIG-1016
>                 URL: https://issues.apache.org/jira/browse/PIG-1016
>             Project: Pig
>          Issue Type: Improvement
>          Components: data
>    Affects Versions: 0.4.0
>            Reporter: hc busy
>             Fix For: 0.5.0
>
>         Attachments: PIG-1016.patch
>
>
> Hi, I'm trying to load a map that has a tuple for a value. The read fails in 0.4.0 because
> of a misconfiguration in the parser, whereas almost all the documentation states that the
> value of a map can be any type.
> I've attached a patch that allows us to read in complex objects as values, as documented.
> I've done simple verification by loading maps with tuple/map values and writing them back
> out using LOAD and STORE. All seems to work fine.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

