hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1085) Pass JobConf and UDF specific configuration information to UDFs
Date Mon, 16 Nov 2009 22:17:39 GMT

    [ https://issues.apache.org/jira/browse/PIG-1085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778599#action_12778599 ]

Alan Gates commented on PIG-1085:

Documentation for this patch:

The singleton UDFContext class provides two features to UDF writers.  First, it gives UDFs
access to the JobConf object through a call to getJobConf.  This is only available on the
back end (at run time), because the JobConf has not yet been constructed on the front end
(during planning time).

Second, it allows UDFs to pass configuration information between instantiations of the UDF
on the front and backends.  UDFs can store information in a configuration object when they
are constructed on the front end, or during other front end calls such as describeSchema.
They can then read that information on the back end when exec (for EvalFunc) or getNext (for
LoadFunc) is called.  Note that information will not be passed between instantiations of the
function on the backend.  The communication channel only works from front end to back end.
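
To make the one-way nature of this channel concrete, here is a minimal sketch in plain Java.
It does not use Pig at all: the class OneWayChannelSketch, its methods, and the property
names are hypothetical, and the serialization through a byte array is only an assumption
standing in for however the patch actually ships the properties with the job.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical stand-in, NOT Pig code: models a front end -> back end channel.
class OneWayChannelSketch {
    // "Front end": the UDF records what it learned at planning time.
    // The snapshot returned here stands in for data shipped with the job.
    static byte[] frontEnd() {
        Properties props = new Properties();
        props.setProperty("input.fields", "name,age");
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            props.store(out, null);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // "Back end": each task rebuilds its own private copy from the snapshot.
    static Properties backEndTask(byte[] shipped) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(shipped));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        byte[] shipped = frontEnd();
        Properties taskA = backEndTask(shipped);
        Properties taskB = backEndTask(shipped);

        // A value written on the back end stays local to that task.
        taskA.setProperty("rows.seen", "42");

        System.out.println(taskA.getProperty("input.fields")); // set on the front end
        System.out.println(taskB.getProperty("rows.seen"));    // never reaches task B
    }
}
```

In this model both tasks see the value written on the front end, but the value task A writes
is invisible to task B, which is the behavior described above.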

To store information, the UDF calls getUDFProperties.  This returns a Properties object that
the UDF can write information to or read it from.  To avoid name space conflicts, UDFs are
required to provide a signature when obtaining a Properties object.  This can be done in
two ways.  The UDF can provide its Class object (via this.getClass()), in which case every
instantiation of the UDF will be given the same Properties object.  Alternatively, the UDF
can provide its Class plus an array of Strings, such as its constructor arguments or some
other identifying strings.  This allows each instantiation of the UDF to have a different
Properties object, thus avoiding name space collisions between instantiations of the UDF.
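
The two signature styles can be illustrated with a small model in plain Java.  This is a
sketch under assumptions, not the code in the patch: the class SignatureKeyedProps and its
internal map are hypothetical, and only the method name getUDFProperties and the two
argument shapes (a Class, or a Class plus a String array) are taken from the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in, NOT Pig's UDFContext: shows how a signature can
// select which Properties object a UDF instance receives.
class SignatureKeyedProps {
    // Key = class name followed by any identifying strings.
    private final Map<List<String>, Properties> store = new HashMap<>();

    // Class-only signature: every instance of the class shares one object.
    Properties getUDFProperties(Class<?> c) {
        return lookup(c, new String[0]);
    }

    // Class plus strings: instances with different strings get different objects.
    Properties getUDFProperties(Class<?> c, String[] args) {
        return lookup(c, args);
    }

    private Properties lookup(Class<?> c, String[] args) {
        List<String> key = new ArrayList<>();
        key.add(c.getName());
        key.addAll(Arrays.asList(args));
        return store.computeIfAbsent(key, k -> new Properties());
    }

    // Hypothetical UDF class used only as a signature in the demo.
    static class MyLoader { }

    public static void main(String[] args) {
        SignatureKeyedProps ctx = new SignatureKeyedProps();

        // Same class, no strings: the same Properties object comes back.
        System.out.println(ctx.getUDFProperties(MyLoader.class)
                == ctx.getUDFProperties(MyLoader.class));

        // Same class, different constructor arguments: separate objects.
        Properties a = ctx.getUDFProperties(MyLoader.class, new String[] {"fileA"});
        Properties b = ctx.getUDFProperties(MyLoader.class, new String[] {"fileB"});
        System.out.println(a == b);
    }
}
```

A UDF that appears twice in one script with different constructor arguments would want the
second form, so that what one instance stores does not overwrite what the other stored.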

> Pass JobConf and UDF specific configuration information to UDFs
> ---------------------------------------------------------------
>                 Key: PIG-1085
>                 URL: https://issues.apache.org/jira/browse/PIG-1085
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>         Attachments: udfconf-2.patch, udfconf.patch
> Users have long asked for a way to get the JobConf structure in their UDFs.  It would
> also be nice to have a way to pass properties between the front end and back end so that UDFs
> can store state during parse time and use it at runtime.
> This patch does part of what is proposed in PIG-602, but not all of it.  It does not
> provide a way to give user specified configuration files to UDFs.  So I will mark 602 as depending
> on this bug, but it isn't a duplicate.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
