pig-dev mailing list archives

From "Dmitriy V. Ryaboy (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-2143) Improvements for PigStorage
Date Thu, 14 Jul 2011 16:43:00 GMT

     [ https://issues.apache.org/jira/browse/PIG-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-2143:
-----------------------------------

    Attachment: PIG-2143.2.diff

Thanks for the reviews.

Uploading a patch that fixes the repeated deserialization (nice catch!), adjusts whitespace,
and makes the piggybank stuff shallow deprecated proxies for the builtins.

I am not sure whether loading the schema when it was created but isn't being requested is
a good idea; I can see arguments both ways.

I do think we should allow loading with a different delimiter than that set in the schema.

> Improvements for PigStorage
> ---------------------------
>
>                 Key: PIG-2143
>                 URL: https://issues.apache.org/jira/browse/PIG-2143
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Dmitriy V. Ryaboy
>            Assignee: Dmitriy V. Ryaboy
>             Fix For: 0.10
>
>         Attachments: PIG-2143.2.diff, PIG-2143.diff
>
>
> I'd like to propose that we allow for a greater degree of customization in PigStorage.
> An incomplete list of features that we might want to add:
> - flag to tell it to overwrite existing output if it exists
> - flag to tell it to compress output using gzip|bzip|lzo (currently this can be achieved
> by setting the directory name to end in .gz or .bz2, which is a bit awkward)
> - flag to tell it to store the schema and header (perhaps by merging in PigStorageSchema
> work?)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
