pig-dev mailing list archives

From "Jie Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2691) Duplicate TOKENIZE schema
Date Thu, 24 May 2012 02:11:41 GMT

    [ https://issues.apache.org/jira/browse/PIG-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282128#comment-13282128 ]

Jie Li commented on PIG-2691:
-----------------------------

As there was no documentation on the field schema of TOKENIZE, can we assume that a user
who wants to refer to the field by name would name it explicitly with AS? If so, this change
wouldn't break existing scripts.
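
A minimal sketch of what that assumption looks like in a script, reusing the relation and
field names from the report quoted below. The FLATTEN step and the s_token alias are
illustrative additions, not taken from the issue:

{code}
-- Name each TOKENIZE result explicitly with AS ...
q = LOAD 'file' AS (source:chararray, target:chararray);
e = FOREACH q GENERATE TOKENIZE(source) AS s_entities, TOKENIZE(target) AS t_entities;
-- ... and refer only to the user-chosen alias downstream.
s = FOREACH e GENERATE FLATTEN(s_entities) AS s_token;
{code}

A script written this way refers only to s_entities and t_entities, so it should not depend
on whatever default name TOKENIZE gives its output bag.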
                
> Duplicate TOKENIZE schema
> -------------------------
>
>                 Key: PIG-2691
>                 URL: https://issues.apache.org/jira/browse/PIG-2691
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Jie Li
>              Labels: simple
>         Attachments: PIG-2691.patch, PIG-2691.patch.2
>
>
> TOKENIZE produces a fixed named schema that results in duplicates if used more than once in the same generate statement.
> We could parameterize the schema on the name of the field being tokenized.
> {code}
> grunt> q = LOAD 'file' AS (source:chararray, target:chararray);
> grunt> e = FOREACH q GENERATE TOKENIZE(source), TOKENIZE(target);
> 2012-05-09 20:18:37,235 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1108: 
> <line 2, column 14> Duplicate schema alias: bag_of_tokenTuples
> grunt> e = FOREACH q GENERATE TOKENIZE(source) as s_entities, TOKENIZE(target) as t_entities;
> grunt> describe e
> e: {s_entities: {tuple_of_tokens: (token: chararray)},t_entities: {tuple_of_tokens: (token: chararray)}}
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
