hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (PIG-836) Allow setting of end-of-record delimiter in PigStorage
Date Tue, 21 Sep 2010 21:17:35 GMT

     [ https://issues.apache.org/jira/browse/PIG-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates resolved PIG-836.
----------------------------

    Resolution: Won't Fix

PigStorage now depends on TextInputFormat to parse lines.  TextInputFormat does not allow the
user to specify the end-of-line indicator.  If it does at some point in the future, then Pig can
make use of that.  We are not going to rewrite TextInputFormat ourselves just to get this feature.

> Allow setting of end-of-record delimiter in PigStorage
> ------------------------------------------------------
>
>                 Key: PIG-836
>                 URL: https://issues.apache.org/jira/browse/PIG-836
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: George Mavromatis
>            Assignee: Benjamin Reed
>
> PigStorage allows overriding the default field delimiter ('\t'), but does not allow overriding
> the record delimiter ('\n').
> It is a valid use case for fields to contain newlines, e.g. because they are the contents
> of a document or web page. The user can create a custom load/store UDF to achieve this,
> but that is extra work, many users would have to do it, and the UDF would be an exact
> duplicate of the PigStorage code except for the delimiter.
> Thus, PigStorage() should allow both the field and record separators to be configured.
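
To illustrate the parsing that the issue description above asks for, here is a minimal sketch in plain Python. It is not Pig or Hadoop code and uses no Pig API; the function names `read_records` and `parse_fields`, and the choice of the ASCII record separator (`\x1e`) as the record delimiter, are assumptions made for the example only.

```python
import io
from typing import Iterator, List, TextIO


def read_records(stream: TextIO, record_delim: str, chunk_size: int = 4096) -> Iterator[str]:
    """Yield records from stream, split on record_delim rather than on newlines."""
    if not record_delim:
        raise ValueError("record_delim must be non-empty")
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Emit every complete record; the last piece may be incomplete
        # (or hold part of a delimiter), so keep it in the buffer.
        *complete, buffer = buffer.split(record_delim)
        yield from complete
    # Whatever remains is a final record with no trailing delimiter.
    if buffer:
        yield buffer


def parse_fields(record: str, field_delim: str = "\t") -> List[str]:
    """Split one record into fields; embedded newlines stay inside the field."""
    return record.split(field_delim)


# Hypothetical input: two records separated by \x1e, where the first
# record's second field contains a newline.
sample = io.StringIO("doc1\tline one\nline two\x1edoc2\tsingle line\x1e")
rows = [parse_fields(r) for r in read_records(sample, record_delim="\x1e")]
```

Because the records are split on `\x1e` instead of `\n`, the newline inside the first document remains part of its field. A real loader would also have to deal with input splits that begin or end mid-record, which this sketch ignores.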

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

