pig-dev mailing list archives

From "Prashant Kommireddi (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2375) Incorrect outputSchema is invoked when overloading UDF in 0.9.1
Date Tue, 20 Dec 2011 06:35:30 GMT

    [ https://issues.apache.org/jira/browse/PIG-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172981#comment-13172981 ]

Prashant Kommireddi commented on PIG-2375:

Having thought about this some more: when getArgToFuncMapping() is used, the input schema is
already checked against that mapping to select the overriding EvalFunc. It does not make much
sense to use outputSchema(Schema inputSchema) to verify the input schema once again.

The role of outputSchema (as the name suggests) should be to specify the output schema for
the UDF, and NOT necessarily to verify the input schema. This might not be obvious to someone
new to writing Pig UDFs when overriding them.

To summarize:

1. When a UDF is not overridden, it is OK to use outputSchema to verify the input schema.
2. When a UDF is overridden, it does not make sense to use outputSchema to verify the input schema,
because getArgToFuncMapping has already found a matching spec based on the input schema.
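
The division of labour summarized above can be sketched with a small, self-contained model. This is not Pig code: Udf, Spec, resolve and outputSchemaFor are hypothetical stand-ins, the output types returned by the two UDFs are assumed for illustration, and matching is by exact signature only, which is a simplification. It only shows the input check happening once, at mapping time, with the output schema then taken from whichever UDF was selected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for illustration only; these are NOT Pig's classes.
final class OverloadDispatchSketch {

    // Hypothetical miniature of an EvalFunc: only the schema-related part.
    interface Udf {
        // Describes the output type; deliberately does no input validation.
        String outputSchema(List<String> inputTypes);
    }

    // Hypothetical miniature of a FuncSpec: an argument signature plus the UDF it selects.
    static final class Spec {
        final List<String> argTypes;
        final Udf udf;

        Spec(List<String> argTypes, Udf udf) {
            this.argTypes = argTypes;
            this.udf = udf;
        }
    }

    // Root UDF, registered for (tuple, chararray). Output type is an assumption.
    static final class LogFieldValue implements Udf {
        @Override
        public String outputSchema(List<String> inputTypes) {
            return "chararray";
        }
    }

    // Overload, registered for (tuple, tuple). Output type is an assumption.
    static final class LogFieldValues implements Udf {
        @Override
        public String outputSchema(List<String> inputTypes) {
            return "tuple";
        }
    }

    // Mirrors the shape of the mapping in the issue: one spec per supported signature.
    static List<Spec> argToFuncMapping() {
        List<Spec> specs = new ArrayList<Spec>();
        specs.add(new Spec(Arrays.asList("tuple", "chararray"), new LogFieldValue()));
        specs.add(new Spec(Arrays.asList("tuple", "tuple"), new LogFieldValues()));
        return specs;
    }

    // The input check happens here, once: pick the spec whose signature matches exactly.
    static Udf resolve(List<String> inputTypes) {
        for (Spec spec : argToFuncMapping()) {
            if (spec.argTypes.equals(inputTypes)) {
                return spec.udf;
            }
        }
        throw new IllegalArgumentException("No overload for " + inputTypes);
    }

    // The output schema is asked of the resolved UDF, not of the root UDF.
    static String outputSchemaFor(List<String> inputTypes) {
        return resolve(inputTypes).outputSchema(inputTypes);
    }

    public static void main(String[] args) {
        System.out.println(outputSchemaFor(Arrays.asList("tuple", "chararray")));
        System.out.println(outputSchemaFor(Arrays.asList("tuple", "tuple")));
    }
}
```

In this model neither outputSchema implementation inspects its input, because resolve has already rejected any signature that has no matching spec.
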
> Incorrect outputSchema is invoked when overloading UDF in 0.9.1
> ---------------------------------------------------------------
>                 Key: PIG-2375
>                 URL: https://issues.apache.org/jira/browse/PIG-2375
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.1
>            Reporter: Prashant Kommireddi
>            Assignee: Prashant Kommireddi
>             Fix For: 0.9.1
>         Attachments: LogFieldValue.java, LogFieldValues.java
> When overloading a UDF with getArgToFuncMapping() the parent/root UDF outputSchema()
> is being called.
> {code}
>     @Override
>     public List<FuncSpec> getArgToFuncMapping() throws FrontendException {
>         List<FuncSpec> funcList = new ArrayList<FuncSpec>();
>         // (tuple, chararray) input: handled by this (root) UDF
>         Schema s = new Schema();
>         s.add(new Schema.FieldSchema(null, DataType.TUPLE));
>         s.add(new Schema.FieldSchema(null, DataType.CHARARRAY));
>         funcList.add(new FuncSpec(this.getClass().getName(), s));
>         // (tuple, tuple) input: handled by LogFieldValues
>         Schema s1 = new Schema();
>         s1.add(new Schema.FieldSchema(null, DataType.TUPLE));
>         s1.add(new Schema.FieldSchema(null, DataType.TUPLE));
>         funcList.add(new FuncSpec(LogFieldValues.class.getName(), s1));
>         return funcList;
>     }
> {code}
> In the above function, "LogFieldValues" is used when the input is (tuple, tuple), but
> outputSchema() is invoked on the root UDF.
