hadoop-pig-dev mailing list archives

From "Jeff Zhang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1110) Handle compressed file formats -- Gz, BZip with the new proposal
Date Tue, 15 Dec 2009 02:29:18 GMT

    [ https://issues.apache.org/jira/browse/PIG-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790529#action_12790529 ]

Jeff Zhang commented on PIG-1110:
---------------------------------

Response to Richard,

1. If the concern is API compatibility, given that PigStorage() is the default LoadFunc
of Pig, another option is to leave PigStorage() untouched and provide a separate LoadFunc
that handles compression, for example a new LoadFunc such as Bz2PigStorage().

2. The name given in the Store statement is actually a folder name, not a file name; the
output is written as part-00000.bz2 under that folder. part-00000.bz2 is the real file
that Hadoop consumes, and Hadoop checks the file name rather than the folder name to
determine the compression codec.
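
To illustrate point 2, here is a minimal sketch of a suffix-based codec lookup. It is a
simplified stand-in, not Hadoop's code: in Hadoop the lookup is done by
CompressionCodecFactory.getCodec(Path), while the class CodecBySuffix, the method codecFor,
and the two-entry suffix table below are hypothetical names made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class CodecBySuffix {
    // Hypothetical suffix table for illustration only; Hadoop builds its
    // own table from the codecs that are configured on the cluster.
    private static final Map<String, String> CODECS = new LinkedHashMap<>();
    static {
        CODECS.put(".bz2", "bzip2");
        CODECS.put(".gz", "gzip");
    }

    /**
     * Returns the codec name for a path, looking only at the last path
     * component (the file name) and ignoring any parent folder names.
     */
    static Optional<String> codecFor(String path) {
        String fileName = path.substring(path.lastIndexOf('/') + 1);
        for (Map.Entry<String, String> entry : CODECS.entrySet()) {
            if (fileName.endsWith(entry.getKey())) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // The part file carries the suffix, so a codec is found.
        System.out.println(codecFor("results/part-00000.bz2"));
        // Only the folder carries the suffix, so no codec is found.
        System.out.println(codecFor("results.bz2/part-00000"));
    }
}
```

Under this model a suffix on the folder alone has no effect; what matters is the name of
the part file written inside it.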



> Handle compressed file formats -- Gz, BZip with the new proposal
> ----------------------------------------------------------------
>
>                 Key: PIG-1110
>                 URL: https://issues.apache.org/jira/browse/PIG-1110
>             Project: Pig
>          Issue Type: Sub-task
>            Reporter: Richard Ding
>            Assignee: Richard Ding
>         Attachments: PIG-1110.patch, PIG_1110_Jeff.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

