hadoop-pig-dev mailing list archives

From "Pradeep Kamath (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-653) Make fieldsToRead work in loader
Date Mon, 09 Feb 2009 19:44:59 GMT

     [ https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-653:

    Attachment: PIG-653-2.comment

A new proposal has been attached as a revision of the proposal in comment 1.

The two main changes are:
1. A new class, RequiredFieldList, will be used to convey the list of required fields. A separate
class was chosen here (rather than passing the List<RequiredField> and a boolean separately)
since it gives us the flexibility to extend it easily in the future.
2. The new type BAG_OF_MAP is no longer needed. Suppose a field is a bag (named "bg") that
contains a single column which is a map, and we need the data for only one key (say k1) from
it. We can represent that with a RequiredField object of type BAG with alias "bg". This object
will have one RequiredField object in its subFields list, of type MAP and with index 0 to
indicate that it is the first subfield in the bag. That object in turn will have one
RequiredField object in its subFields list, of type BYTEARRAY and with alias "k1". This
illustrates how subcolumns of interest can be represented by the RequiredField class.
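
The nesting described in point 2 can be sketched roughly as follows. This is only an
illustration under stated assumptions: the message names RequiredField, RequiredFieldList,
subFields, alias, index and the types BAG, MAP and BYTEARRAY, but the enum, the factory
methods, the describe helper and the meaning of the boolean flag are hypothetical and may
not match the attached proposal.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: names and shapes are assumptions, not Pig's actual API.
class RequiredFieldSketch {

    // Hypothetical type tags standing in for the BAG, MAP and BYTEARRAY types.
    enum FieldType { BAG, MAP, BYTEARRAY }

    // One required field or subfield, identified either by alias or by position.
    static final class RequiredField {
        final String alias;   // null when the field is identified by index
        final int index;      // -1 when the field is identified by alias
        final FieldType type;
        final List<RequiredField> subFields = new ArrayList<>();

        private RequiredField(String alias, int index, FieldType type) {
            this.alias = alias;
            this.index = index;
            this.type = type;
        }

        static RequiredField byAlias(String alias, FieldType type) {
            return new RequiredField(alias, -1, type);
        }

        static RequiredField byIndex(int index, FieldType type) {
            return new RequiredField(null, index, type);
        }

        // Adds a subfield and returns this field so calls can be chained.
        RequiredField with(RequiredField sub) {
            subFields.add(sub);
            return this;
        }
    }

    // Wrapper class, so more information can be added later without changing signatures.
    static final class RequiredFieldList {
        final List<RequiredField> fields = new ArrayList<>();
        // Assumed meaning of the boolean the message mentions: true = read every field.
        boolean allFieldsRequired = false;
    }

    // Builds the example from the message: key "k1" of the map inside bag "bg".
    static RequiredFieldList buildExample() {
        RequiredField key = RequiredField.byAlias("k1", FieldType.BYTEARRAY);
        RequiredField map = RequiredField.byIndex(0, FieldType.MAP).with(key);
        RequiredField bag = RequiredField.byAlias("bg", FieldType.BAG).with(map);
        RequiredFieldList list = new RequiredFieldList();
        list.fields.add(bag);
        return list;
    }

    // Renders a field and its subfields on one line, outermost first.
    static String describe(RequiredField f) {
        String id = (f.alias != null) ? "alias=" + f.alias : "index=" + f.index;
        StringBuilder sb = new StringBuilder(f.type + "(" + id + ")");
        for (RequiredField sub : f.subFields) {
            sb.append(" -> ").append(describe(sub));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (RequiredField f : buildExample().fields) {
            System.out.println(describe(f));
        }
    }
}
```

Because each level is an ordinary RequiredField, deeper nesting needs no new type such as
BAG_OF_MAP; a loader would walk subFields to find the subcolumns it has to read.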

> Make fieldsToRead work in loader
> --------------------------------
>                 Key: PIG-653
>                 URL: https://issues.apache.org/jira/browse/PIG-653
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Alan Gates
>            Assignee: Pradeep Kamath
>         Attachments: PIG-653-2.comment
> Currently Pig does not call the fieldsToRead function in LoadFunc, so it does not provide
> information to load functions on what fields are needed.  We need to implement a visitor that
> determines (where possible) which fields in a file will be used and relays that information
> to the load function.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
