pig-dev mailing list archives

From "Jeff Markham (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-842) PigStorage should support multi-byte delimiters
Date Wed, 02 Jan 2013 04:02:13 GMT

     [ https://issues.apache.org/jira/browse/PIG-842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Markham updated PIG-842:
-----------------------------

    Attachment: PigMultiByteTextOutputFormat.java
                PigMultiByteStorage.java
                PigMultiByteJsonMetadata.java

PigStorage has drifted enough since this JIRA was opened that the change now takes more work
than simply extending PigStorage.  Attached are the three classes needed to make it work.

Feedback is welcome on whether these could become the commits for this issue, or whether
PigStorage should instead be refactored so that it can be extended with a getNext() override.
                
> PigStorage should support multi-byte delimiters
> -----------------------------------------------
>
>                 Key: PIG-842
>                 URL: https://issues.apache.org/jira/browse/PIG-842
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.3.0
>            Reporter: Santhosh Srinivasan
>         Attachments: PigMultiByteJsonMetadata.java, PigMultiByteStorage.java, PigMultiByteTextOutputFormat.java
>
>
> Currently, PigStorage supports only single-byte delimiters. Users have requested multi-byte
> delimiters. There are performance implications with multi-byte delimiters: instead of
> looking for a single byte, PigStorage would have to look for a pattern, à la BinStorage.
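
To make the performance point above concrete, here is a minimal Java sketch of splitting one
record on a multi-byte delimiter. It is not taken from the attached classes or from PigStorage:
the class MultiByteSplit and its methods matchesAt and split are hypothetical names, and the
naive byte-by-byte comparison is only one possible matching strategy.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Pig or of the attachments.
final class MultiByteSplit {

    // Returns true if the full delimiter occurs in line starting at offset pos.
    static boolean matchesAt(byte[] line, int pos, byte[] delim) {
        if (pos + delim.length > line.length) {
            return false;
        }
        for (int j = 0; j < delim.length; j++) {
            if (line[pos + j] != delim[j]) {
                return false;
            }
        }
        return true;
    }

    // Splits line into fields. With a single-byte delimiter each position costs one
    // comparison; with a longer delimiter it can cost up to delim.length comparisons.
    static List<String> split(byte[] line, byte[] delim) {
        if (delim.length == 0) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> fields = new ArrayList<>();
        int start = 0; // offset where the current field begins
        int i = 0;     // current scan position
        while (i < line.length) {
            if (matchesAt(line, i, delim)) {
                fields.add(new String(line, start, i - start, StandardCharsets.UTF_8));
                i += delim.length; // skip the whole delimiter
                start = i;
            } else {
                i++;
            }
        }
        // The last field runs to the end of the record (possibly empty).
        fields.add(new String(line, start, line.length - start, StandardCharsets.UTF_8));
        return fields;
    }

    public static void main(String[] args) {
        byte[] line = "a||b||c".getBytes(StandardCharsets.UTF_8);
        byte[] delim = "||".getBytes(StandardCharsets.UTF_8);
        System.out.println(split(line, delim)); // prints [a, b, c]
    }
}
```

The sketch assumes UTF-8 field data and a non-empty delimiter. A real loader would also have
to deal with record boundaries and field typing, which are left out here.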

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
