pig-dev mailing list archives

From "Ted Malaska (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2494) Improvement to SequenceFileLoader (NullWritable and Delimiter)
Date Tue, 04 Sep 2012 18:05:07 GMT

    [ https://issues.apache.org/jira/browse/PIG-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447870#comment-13447870 ]

Ted Malaska commented on PIG-2494:
----------------------------------

Hey Dmitriy,

I know it's been a long time, but I'm going to try to finish this issue now.

I just reviewed the SequenceFileLoader code in elephant-bird, and it looks like the major piece
to bring over is the idea of the converter and its ability to transform the raw data and
provide a schema for the output format.

This would add a lot of power to the existing implementation.

I'll start on this tonight.
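
A minimal sketch of the converter idea, assuming a converter is an object that both transforms a raw record value and describes the fields it produces. The names RawConverter, IdNameConverter, and ConverterSketch are hypothetical and are not the elephant-bird or Pig API; plain Java types stand in for Writables and Pig schemas so the sketch runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical converter contract: transform one raw value into output
// fields, and describe those fields. Not the elephant-bird or Pig API.
interface RawConverter<T> {
    List<Object> convert(T raw);   // transform the raw value into fields
    List<String> schema();         // one "name:type" entry per output field
}

// Example converter for a text value of the form "id,name" (an assumed format).
class IdNameConverter implements RawConverter<String> {
    @Override
    public List<Object> convert(String raw) {
        // Limit of 2 keeps any further commas inside the name field.
        String[] parts = raw.split(",", 2);
        List<Object> fields = new ArrayList<>();
        fields.add(Integer.parseInt(parts[0].trim()));
        fields.add(parts.length > 1 ? parts[1] : null);
        return fields;
    }

    @Override
    public List<String> schema() {
        return Arrays.asList("id:int", "name:chararray");
    }
}

public class ConverterSketch {
    public static void main(String[] args) {
        RawConverter<String> converter = new IdNameConverter();
        System.out.println(converter.schema() + " -> " + converter.convert("7,alice"));
    }
}
```

Keeping the schema next to the conversion logic means a loader could report field names and types without reading any data first.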
                
> Improvement to SequenceFileLoader (NullWritable and Delimiter)
> --------------------------------------------------------------
>
>                 Key: PIG-2494
>                 URL: https://issues.apache.org/jira/browse/PIG-2494
>             Project: Pig
>          Issue Type: Improvement
>          Components: piggybank
>    Affects Versions: 0.9.1
>         Environment: All
>            Reporter: Ted Malaska
>            Priority: Minor
>              Labels: newbie, simple
>         Attachments: SequenceFileLoader.java
>
>
> I wanted to add two features to SequenceFileLoader.
> 1.	I added a delimiter so that it acts more like PigStorage: it splits the value if the value is of type Text (chararray).
> 2.	I added the option of the key being a NullWritable. I wanted to be able to process my Hive files in both Hive and Pig, but because my Hive sequence files have a NullWritable key, I could not make this work with the current implementation of SequenceFileLoader.
> My change is attached to this issue.
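
The two features described above can be sketched roughly as follows. This is not the attached SequenceFileLoader.java: the class and method names are hypothetical, a null key stands in for NullWritable, a String value stands in for Text, and dropping the key when it is null is an assumption about the desired behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class DelimitedRecordSketch {
    // Builds the output fields for one (key, value) record.
    // Assumption: key == null models a NullWritable key and is skipped.
    // Assumption: a String value models Text and is split on the delimiter;
    // any other value type is passed through as a single field.
    static List<Object> toFields(Object key, Object value, String delimiter) {
        List<Object> fields = new ArrayList<>();
        if (key != null) {
            fields.add(key);
        }
        if (value instanceof String) {
            // Pattern.quote treats the delimiter literally (e.g. "|");
            // the -1 limit keeps trailing empty fields.
            for (String part : ((String) value).split(Pattern.quote(delimiter), -1)) {
                fields.add(part);
            }
        } else {
            fields.add(value);
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(toFields(null, "a|b|", "|"));
        System.out.println(toFields(42L, 3.5, "|"));
    }
}
```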

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
