pig-dev mailing list archives

From "Woody Anderson (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (PIG-1942) script UDF (jython) should utilize the intended output schema to more directly convert Py objects to Pig objects
Date Wed, 04 May 2011 00:22:03 GMT

     [ https://issues.apache.org/jira/browse/PIG-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Woody Anderson reassigned PIG-1942:
-----------------------------------

    Assignee: Woody Anderson

> script UDF (jython) should utilize the intended output schema to more directly convert Py objects to Pig objects
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1942
>                 URL: https://issues.apache.org/jira/browse/PIG-1942
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: Woody Anderson
>            Assignee: Woody Anderson
>            Priority: Minor
>              Labels: python, schema, udf
>             Fix For: 0.10
>
>         Attachments: 1942.patch, 1942_with_junit.patch
>
>
> From https://issues.apache.org/jira/browse/PIG-1824:
> {code}
> import re
> @outputSchema("y:bag{t:tuple(word:chararray)}")
> def strsplittobag(content,regex):
>         return re.compile(regex).split(content)
> {code}
> This does not work because split returns a list of strings, while the schema declares a bag of tuples. However, the output schema is known, so it would be quite simple to implicitly promote each string element to a single-field tuple.
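
For comparison, here is a minimal sketch of the manual workaround the paragraph above implies: wrapping each string in a one-element tuple by hand so the returned list matches the declared bag-of-tuples schema. The fallback `outputSchema` stub is an assumption added only so the snippet also runs under plain Python; inside Pig the script engine supplies the decorator.

```python
import re

try:
    outputSchema  # supplied by Pig's Jython script engine when run inside Pig
except NameError:
    def outputSchema(schema):
        # Stand-in for running outside Pig (an assumption, not Pig's code):
        # it only records the schema string on the function.
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("y:bag{t:tuple(word:chararray)}")
def strsplittobag(content, regex):
    # Wrap every string in a 1-tuple so the list has the shape bag{tuple(chararray)}.
    return [(word,) for word in re.compile(regex).split(content)]
```

This builds an extra tuple per element in user code, which is the boilerplate the proposed schema-driven conversion would make unnecessary.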
> Also, a list, array, tuple, set, etc. are all equally convertible to a bag, and a list, array, or tuple is equally convertible to a Tuple; with the schema available, this conversion can be done in a much less rigid way.
> This allows much easier reuse of existing Python code and avoids the memory overhead of creating intermediate objects just to re-convert between types.
> I wrote the code to do this a while back as part of my version of the Jython script framework; I'll isolate that and attach it.
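
The schema-directed conversion described above could look roughly like the following sketch. It is not the attached patch: the nested-tuple schema model and the `convert` helper are hypothetical names chosen for illustration, and a real implementation would work from Pig's own schema objects rather than this simplified form.

```python
# Hypothetical schema model (an assumption for this sketch):
#   ("bag", element_schema)    element_schema describes each tuple in the bag
#   ("tuple", [field_schemas]) one schema per field
#   ("scalar",)                any non-container value, passed through as-is


def convert(value, schema):
    kind = schema[0]
    if kind == "bag":
        # Any iterable container (list, tuple, set, ...) may become a bag.
        return [convert(item, schema[1]) for item in value]
    if kind == "tuple":
        fields = schema[1]
        if not isinstance(value, (list, tuple)):
            # Promote a bare element to a single-field tuple when the schema allows it.
            if len(fields) != 1:
                raise TypeError("cannot promote a scalar to a multi-field tuple")
            value = (value,)
        if len(value) != len(fields):
            raise ValueError("value does not match the tuple arity in the schema")
        return tuple(convert(v, f) for v, f in zip(value, fields))
    return value


# Simplified model of bag{tuple(chararray)} from the example above.
WORD_BAG = ("bag", ("tuple", [("scalar",)]))
```

With this model, `convert(["a", "b"], WORD_BAG)` turns a plain list of strings into a list of one-field tuples, so a UDF could return the result of `split` directly.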

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
