hadoop-pig-dev mailing list archives

From "Pradeep Kamath (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-1316) TextLoader should use Bzip2TextInputFormat for bzip files so that bzip files can be efficiently processed by splitting the files
Date Wed, 24 Mar 2010 21:00:27 GMT

     [ https://issues.apache.org/jira/browse/PIG-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1316:
--------------------------------

    Attachment: PIG-1316.patch

The attached patch changes TextLoader to use Bzip2TextInputFormat when the load location
ends with the extension ".bz" or ".bz2", as PigStorage already does. For non-bzip data,
TextLoader now uses PigTextInputFormat rather than TextInputFormat so that input directories
can be traversed recursively. For the same reason, I have also changed Bzip2TextInputFormat
to extend PigFileInputFormat instead of FileInputFormat.
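
As a rough illustration of the selection rule described above, here is a minimal sketch. It is not the code in PIG-1316.patch: the class `InputFormatChooser` and the method `chooseInputFormat` are hypothetical names, and the method returns the simple class name as a string so that the example runs without Hadoop or Pig on the classpath.

```java
// Hypothetical helper, for illustration only; not taken from the patch.
final class InputFormatChooser {

    // Returns the simple name of the input format a loader would pick
    // for the given load location, following the rule in this message:
    // ".bz" / ".bz2" locations get the splittable bzip format,
    // everything else gets the format that recurses into directories.
    static String chooseInputFormat(String loadLocation) {
        if (loadLocation.endsWith(".bz") || loadLocation.endsWith(".bz2")) {
            return "Bzip2TextInputFormat";
        }
        return "PigTextInputFormat";
    }

    public static void main(String[] args) {
        // Example locations are made up.
        System.out.println(chooseInputFormat("/data/logs/part-00000.bz2"));
        System.out.println(chooseInputFormat("/data/logs"));
    }
}
```

In a real loader the method would presumably return an InputFormat instance rather than a name; the string form is used here only to keep the sketch self-contained.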

> TextLoader should use Bzip2TextInputFormat for bzip files so that bzip files can be efficiently
> processed by splitting the files
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1316
>                 URL: https://issues.apache.org/jira/browse/PIG-1316
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.7.0
>            Reporter: Pradeep Kamath
>            Assignee: Pradeep Kamath
>             Fix For: 0.7.0
>
>         Attachments: PIG-1316.patch
>
>
> Currently TextLoader uses TextInputFormat, which does not split bzip files. This can
> be fixed by using Bzip2TextInputFormat.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

