hadoop-pig-dev mailing list archives

From "George Mavromatis (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-836) Allow setting of end-of-record delimiter in PigStorage
Date Fri, 05 Jun 2009 05:09:07 GMT
Allow setting of end-of-record delimiter in PigStorage

                 Key: PIG-836
                 URL: https://issues.apache.org/jira/browse/PIG-836
             Project: Pig
          Issue Type: Improvement
          Components: impl
            Reporter: George Mavromatis
             Fix For: 0.2.0

PigStorage allows overriding the default field delimiter ('\t'), but does not allow overriding
the record delimiter ('\n').

Fields containing newlines are a valid use case, e.g. when a field holds the contents of a
document or web page. A user can write a custom load/store UDF to handle this, but that is
extra work, many users would have to repeat it, and the UDF would duplicate the PigStorage
code exactly except for the delimiter.

Thus, PigStorage() should allow configuring both the field and record separators.
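
To illustrate the requested behavior, here is a minimal sketch in plain Java of splitting
input on a configurable record delimiter before splitting each record on the field delimiter.
It is not PigStorage code and does not use any Pig API: the class DelimitedRecordSplitter and
its parse method are hypothetical names, and the sketch assumes the whole input fits in a
single String and that both delimiters are literal strings rather than regular expressions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Pig.
final class DelimitedRecordSplitter {

    // Splits data into records on recordDelim, then each record into fields on fieldDelim.
    // Both delimiters are treated as literal text, so fields may contain '\n' freely
    // as long as the record delimiter is something else.
    static List<List<String>> parse(String data, String fieldDelim, String recordDelim) {
        List<List<String>> records = new ArrayList<>();
        if (data.isEmpty()) {
            return records;
        }
        // A limit of -1 keeps trailing empty strings instead of silently dropping them.
        String[] rawRecords = data.split(Pattern.quote(recordDelim), -1);
        int count = rawRecords.length;
        // A trailing record delimiter produces one empty final element; skip it.
        if (data.endsWith(recordDelim)) {
            count--;
        }
        for (int i = 0; i < count; i++) {
            String[] fields = rawRecords[i].split(Pattern.quote(fieldDelim), -1);
            records.add(Arrays.asList(fields));
        }
        return records;
    }
}
```

For example, calling parse with a field delimiter of "\u0001" and a record delimiter of
"\u0002" keeps a multi-line document body together as one field, whereas splitting records on
'\n' would break it into several records.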

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
