datafu-dev mailing list archives

From "Matthew Hayes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DATAFU-34) Add some UDFS to handle map type
Date Tue, 29 Apr 2014 23:48:16 GMT

    [ https://issues.apache.org/jira/browse/DATAFU-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984970#comment-13984970 ]

Matthew Hayes commented on DATAFU-34:
-------------------------------------

Regarding #1, I think this UDF will be hard to use without a well-defined schema, because
you won't be able to reference the fields by name. The problem is that we don't know what's
going to be in the map, so the size of the output tuple can vary from map to map. We could fix
this by having the user pass in some information about what's in the map so that we can generate
the schema. But this implies that they already know something about what's in the map, for
example what all the expected keys are. If that is the case, then they don't really need this UDF,
because they could construct the tuple on the fly like so: ('a', my_map#'a', 'b', my_map#'b').
So I'm not sure how we can make MapToTuple really work.
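
As a rough illustration of the known-keys case, the following Pig Latin sketch projects
map values by key without any UDF. The relation names, the field name my_map, and the
input path are assumptions made for the example, not taken from the patch.

```pig
-- Sketch only: relation names, the my_map field, and the input path are hypothetical.
data = LOAD 'input' AS (my_map:map[]);

-- With known keys, each value can be projected and named directly,
-- so later statements can refer to a and b by name.
named = FOREACH data GENERATE my_map#'a' AS a, my_map#'b' AS b;

-- The key/value tuple from the comment, built here with the TOTUPLE builtin.
pairs = FOREACH data GENERATE TOTUPLE('a', my_map#'a', 'b', my_map#'b') AS kv;
```

Both projections work only because the keys are written out in the script, which is the
same knowledge a schema-generating MapToTuple would need.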

Regarding #2, I'm not sure how to test the bytearray case.  Maybe Pig's DataType has a helper
method to convert to a bytearray.  Another option is to test the UDF through a Pig script
and declare the type of the input as bytearray when you define the input schema.
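
A minimal sketch of the script-based option might look like the following. The jar path,
the class name com.example.MapUdf, and the input path are hypothetical placeholders, since
the UDF names from the patch are not given here.

```pig
-- Sketch only: the jar path, UDF class, and input path are hypothetical placeholders.
REGISTER 'path/to/udfs.jar';
DEFINE MapUdf com.example.MapUdf();

-- Declaring val as bytearray in the load schema means the UDF receives
-- the field as a bytearray rather than as a typed value.
data = LOAD 'input' AS (val:bytearray);
result = FOREACH data GENERATE MapUdf(val);
```

If the bytearray case refers to map values instead, declaring the field as an untyped
map (m:map[]) should have a similar effect, since values of an untyped map default to
bytearray.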



> Add some UDFS to handle map type
> --------------------------------
>
>                 Key: DATAFU-34
>                 URL: https://issues.apache.org/jira/browse/DATAFU-34
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: jian wang
>            Assignee: jian wang
>         Attachments: 0001-add-some-UDFs-to-manipulate-map.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
