pig-dev mailing list archives

From "Daniel Dai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-4326) AvroStorageSchemaConversionUtilities does not properly convert schema for maps of arrays of records
Date Thu, 13 Nov 2014 23:16:34 GMT

    [ https://issues.apache.org/jira/browse/PIG-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14211441#comment-14211441 ]

Daniel Dai commented on PIG-4326:
---------------------------------

The attached patch adds one level too many to the array:
parameters: map[array: (array: {innerRecord: (k: chararray,v: int)})]

Should be:
parameters: map[array: {innerRecord: (k: chararray,v: int)}]

The fix should look like:
{code}
      // These map value types carry a schema of their own, so convert it recursively.
      case RECORD:
      case MAP:
      case ARRAY:
        ResourceSchema innerResourceSchema =
            avroSchemaToResourceSchema(fieldSchema.getValueType(), schemasInStack,
                alreadyDefinedSchemas, allowRecursiveSchema);
        rf.setSchema(innerResourceSchema);
        break;
{code}
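
For illustration only, here is a minimal self-contained sketch of why the map value type needs the recursive call. It does not use the Avro or Pig classes: `MapValueSchemaSketch`, `Node`, `Kind`, `describe` and `describeMapValue` are hypothetical names, and the output notation is only loosely modeled on Pig schema strings. The `recurseIntoCollections` flag is an assumption added to contrast the reported behavior (empty map schema) with the intended one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MapValueSchemaSketch {

    enum Kind { INT, STRING, RECORD, ARRAY, MAP }

    /** Hypothetical, simplified stand-in for a schema node (not the Avro API). */
    static final class Node {
        final Kind kind;
        final String name;          // field or record name; may be null
        final List<Node> children;  // record fields, or the single element/value type

        Node(Kind kind, String name, Node... children) {
            this.kind = kind;
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** Renders a node in a Pig-like notation, recursing into nested types. */
    static String describe(Node n, boolean recurseIntoCollections) {
        switch (n.kind) {
            case INT:
                return "int";
            case STRING:
                return "chararray";
            case RECORD: {
                List<String> parts = new ArrayList<>();
                for (Node field : n.children) {
                    parts.add(field.name + ": " + describe(field, recurseIntoCollections));
                }
                return "(" + String.join(",", parts) + ")";
            }
            case ARRAY:
                return "{" + describe(n.children.get(0), recurseIntoCollections) + "}";
            case MAP:
                return "map[" + describeMapValue(n.children.get(0), recurseIntoCollections) + "]";
            default:
                throw new IllegalArgumentException("unknown kind: " + n.kind);
        }
    }

    /** Handles the value type of a map; this is the step the issue is about. */
    static String describeMapValue(Node value, boolean recurseIntoCollections) {
        switch (value.kind) {
            case RECORD:
                return describe(value, recurseIntoCollections);
            case MAP:
            case ARRAY:
                // Without the recursive call the nested schema is lost
                // and the map appears to have an empty schema.
                return recurseIntoCollections ? describe(value, recurseIntoCollections) : "";
            default:
                return describe(value, recurseIntoCollections); // primitive value type
        }
    }

    public static void main(String[] args) {
        Node inner = new Node(Kind.RECORD, "innerRecord",
                new Node(Kind.STRING, "k"), new Node(Kind.INT, "v"));
        Node parameters = new Node(Kind.MAP, "parameters",
                new Node(Kind.ARRAY, null, inner));

        System.out.println(describe(parameters, false)); // map[]
        System.out.println(describe(parameters, true));  // map[{(k: chararray,v: int)}]
    }
}
```

With recursion enabled for all three value kinds, the array appears exactly once inside the map, with no additional wrapping level.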

> AvroStorageSchemaConversionUtilities does not properly convert schema for maps of arrays of records
> ---------------------------------------------------------------------------------------------------
>
>                 Key: PIG-4326
>                 URL: https://issues.apache.org/jira/browse/PIG-4326
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.12.0, 0.13.0
>            Reporter: Michael Prim
>         Attachments: mapsOfArraysOfRecords.patch
>
>
> I tried to convert the Avro schema of a map of arrays of records into the corresponding Pig schema and always got empty map schemas in Pig.
> The reason is that AvroStorageSchemaConversionUtilities assumes that a map contains only records or primitive types. However, a map of arrays, or a map of maps, can have a schema of its own and requires a recursive call to derive the full schema.
> I wrote a unit test for maps of arrays of records; it fails with every Pig release since AvroStorage was rewritten (I think this was in 0.12), and there have been no changes since then in trunk.
> The attached patch also contains the (rather simple) fix that makes the schema conversion utilities succeed.
> I would appreciate further comments and would like to know whether this can be included upstream.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
