pig-dev mailing list archives

From "Ted Malaska (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2494) Improvement to SequenceFileLoader (NullWritable and Delimiter)
Date Tue, 04 Sep 2012 21:38:08 GMT

    [ https://issues.apache.org/jira/browse/PIG-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448084#comment-13448084 ]

Ted Malaska commented on PIG-2494:
----------------------------------

I see four options for addressing this issue:

1. Update SequenceFileLoader so that it handles NullWritable keys and also handles
delimiters the way PigStorage does.
2. Everything in option (1), plus extend SequenceFileLoader into a storage function so we
can use it to write data out as sequence files.
3. Bring the elephant-bird implementation over to piggybank and add support for delimiters.
4. Drop the delimiter feature entirely, because we can use TOKENIZE instead.

Let me know.
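
To make option (1) concrete, here is a minimal sketch of the record-building logic,
written in plain Java so it runs without Hadoop or Pig on the classpath. It is not the
attached patch. The class and method names (SequenceRecordSketch, toFields, split) are
hypothetical, a null key stands in for NullWritable, and a String value stands in for
Text. Keeping empty fields as empty strings is also an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch only: plain Java stand-ins for the Hadoop types. A null key plays
 * the role of NullWritable and a String value plays the role of Text.
 * The class and method names are hypothetical, not taken from the patch.
 */
public class SequenceRecordSketch {

    /** Builds the fields of one output record from a key/value pair. */
    public static List<Object> toFields(Object key, Object value, char delimiter) {
        List<Object> fields = new ArrayList<>();
        // A null key (NullWritable stand-in) contributes no field.
        if (key != null) {
            fields.add(key);
        }
        if (value instanceof String) {
            // Text-like values are split into one field per delimited part.
            fields.addAll(split((String) value, delimiter));
        } else if (value != null) {
            // Any other value is passed through as a single field.
            fields.add(value);
        }
        return fields;
    }

    /** Splits on a single character, keeping empty fields as empty strings. */
    static List<String> split(String text, char delimiter) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delimiter) {
                parts.add(text.substring(start, i));
                start = i + 1;
            }
        }
        parts.add(text.substring(start));
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(toFields(null, "a\tb\tc", '\t')); // prints [a, b, c]
        System.out.println(toFields(42L, "x,y", ','));       // prints [42, x, y]
    }
}
```

In a real loader the same decision would sit where the key and value Writables are
converted into tuple fields. Whether the key is dropped or emitted as a null field is a
design choice this sketch does not settle.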



                
> Improvement to SequenceFileLoader (NullWritable and Delimiter)
> --------------------------------------------------------------
>
>                 Key: PIG-2494
>                 URL: https://issues.apache.org/jira/browse/PIG-2494
>             Project: Pig
>          Issue Type: Improvement
>          Components: piggybank
>    Affects Versions: 0.9.1
>         Environment: All
>            Reporter: Ted Malaska
>            Priority: Minor
>              Labels: newbie, simple
>         Attachments: SequenceFileLoader.java
>
>
> I wanted to add two features to SequenceFileLoader.
> 1. I added a delimiter so that it acts more like PigStorage: it splits the value
> if the value is of type Text (chararray).
> 2. I added the option of the key being a NullWritable. I wanted to be able to process
> my Hive files in both Hive and Pig, but because my Hive sequence files have a NullWritable
> key, I could not make this work with the current implementation of SequenceFileLoader.
> My change is attached to this issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
