hadoop-pig-dev mailing list archives

From "George Mavromatis (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-836) Allow setting of end-of-record delimiter in PigStorage
Date Fri, 05 Jun 2009 05:09:07 GMT
Allow setting of end-of-record delimiter in PigStorage
------------------------------------------------------

                 Key: PIG-836
                 URL: https://issues.apache.org/jira/browse/PIG-836
             Project: Pig
          Issue Type: Improvement
          Components: impl
            Reporter: George Mavromatis
             Fix For: 0.2.0


PigStorage allows overriding the default field delimiter ('\t'), but does not allow overriding
the record delimiter ('\n').

It is a valid use case for fields to contain newlines, e.g. because they hold the contents of a
document or web page. A user can write a custom load/store UDF to handle this, but that is extra
work, many users would have to repeat it, and the UDF would duplicate the PigStorage code exactly
except for the delimiter.

Thus, PigStorage() should allow both the field and record separators to be configured.
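
To illustrate the requested behavior, here is a minimal sketch in Python. It is not PigStorage code: the function name parse_records and its parameters are hypothetical, and the choice of '\x1e' (the ASCII record separator) as record delimiter is only an assumption for the example. It shows that once the record delimiter is configurable, a field can carry embedded newlines without being split into separate records.

```python
from typing import Iterator, List


def parse_records(
    data: str, field_delim: str = "\t", record_delim: str = "\n"
) -> Iterator[List[str]]:
    """Split text into records, then split each record into fields.

    Hypothetical helper for illustration only; not part of Pig.
    """
    if not field_delim or not record_delim:
        raise ValueError("delimiters must be non-empty")
    for record in data.split(record_delim):
        if record == "":
            continue  # skip empty records, e.g. after a trailing delimiter
        yield record.split(field_delim)


# Two records; the second field of the first record contains a newline.
sample = "doc1\tline one\nline two\x1edoc2\tsingle line\x1e"

# With a custom record delimiter the embedded newline stays inside the field.
records = list(parse_records(sample, field_delim="\t", record_delim="\x1e"))

# With the default '\n' record delimiter the first record is cut in the middle.
broken = list(parse_records(sample))
```

In this sketch, records holds two rows of two fields each, while broken splits the first document at its embedded newline, which is the problem described above.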

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

