hadoop-pig-dev mailing list archives

From "Charlie Groves (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-55) Allow user control over split creation
Date Tue, 01 Apr 2008 21:04:24 GMT

     [ https://issues.apache.org/jira/browse/PIG-55?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Charlie Groves updated PIG-55:

    Attachment: pig_chunker_split_v6.patch

Alright, I'm uploading pig_chunker_split_v6.patch, which converts TestParser to a JUnit 3 style test.

There are several reasons to remove the JUnit 4 annotations from the existing tests:
# Since they're not being used to run the tests, it can't be assumed that they're actually
on all the tests and we can just switch JUnit 4 on and use them.  They're definitely only
on a mishmash of the existing tests, so to make that switch, you're going to have to go through
and check the annotations on all of the tests anyway.
# If you're going to use the annotations, it's better to rename the methods and add the annotations
than to stick with the stilted test-prefix method names.
# It confuses people outside the project, like me, into writing JUnit 4 style tests.  In PIG-145
I've got a fair number of JUnit 4 tests that I need to convert now, some of which use features
like test parameterization that are going to take more work to retrofit.
# It's just pointless noise in the code until they're actually doing something.
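
To illustrate the points above, here is a minimal, self-contained sketch of why annotations do nothing under name-based discovery. It does not use JUnit itself: the `Test` annotation, `MixedStyleTest`, and `Discovery` are hypothetical stand-ins written for this example, and the prefix rule (public, void, no-argument methods whose names start with `test`) is an approximation of how a JUnit 3 style runner picks methods.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a JUnit 4 style marker annotation (not the real org.junit.Test).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Test {}

// A hypothetical test class carrying the "mishmash" of conventions described above.
class MixedStyleTest {
    @Test public void testParse() {}     // annotated and prefixed
    public void testSplit() {}           // prefixed only
    @Test public void loadsColumns() {}  // annotated only
    public void helper() {}              // neither
}

class Discovery {
    // Name-based discovery: annotations are never consulted.
    static List<String> byPrefix(Class<?> c) {
        List<String> names = new ArrayList<>();
        for (Method m : c.getDeclaredMethods()) {
            boolean candidate = Modifier.isPublic(m.getModifiers())
                    && m.getReturnType() == void.class
                    && m.getParameterCount() == 0
                    && m.getName().startsWith("test");
            if (candidate) {
                names.add(m.getName());
            }
        }
        Collections.sort(names); // reflection order is unspecified, so sort for stable output
        return names;
    }

    // Annotation-based discovery: method names are never consulted.
    static List<String> byAnnotation(Class<?> c) {
        List<String> names = new ArrayList<>();
        for (Method m : c.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Test.class)) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

With this class, `Discovery.byPrefix(MixedStyleTest.class)` selects `testParse` and `testSplit`, while `Discovery.byAnnotation(MixedStyleTest.class)` selects `loadsColumns` and `testParse`. The two rules disagree on any method that follows only one convention, which is why each test would have to be checked by hand before switching runners.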

> Allow user control over split creation
> --------------------------------------
>                 Key: PIG-55
>                 URL: https://issues.apache.org/jira/browse/PIG-55
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.0.0
>            Reporter: Charlie Groves
>             Fix For: 0.1.0
>         Attachments: pig_chunker_split.patch, pig_chunker_split_v2.patch, pig_chunker_split_v3.patch,
pig_chunker_split_v4.patch, pig_chunker_split_v5.patch, pig_chunker_split_v6.patch, replaceable_PigSplit.diff,
> I have a dataset in HDFS that's stored in a file per column that I'd like to access from
pig.  This means I can't use LoadFunc to get at the data as it only allows the loader access
to a single input stream at a time.  To handle this usage, I've broken the existing split
creation code out into a few classes and interfaces, and allowed user-specified load functions
to be used in place of the existing code.
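
As a rough illustration of the file-per-column problem described above, the sketch below assembles rows by reading one value from each of several column streams in lockstep. It is not Pig code and does not reflect the patch's actual interfaces: `ColumnZipper` and `readRows` are hypothetical names, and the one-value-per-line column layout is an assumption made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds rows from several per-column streams read side by side.
class ColumnZipper {
    // Assumes each column stores one value per line; stops when any column runs out.
    static List<List<String>> readRows(List<? extends Reader> columns) {
        List<BufferedReader> readers = new ArrayList<>();
        for (Reader r : columns) {
            readers.add(new BufferedReader(r));
        }
        List<List<String>> rows = new ArrayList<>();
        if (readers.isEmpty()) {
            return rows;
        }
        try {
            while (true) {
                List<String> row = new ArrayList<>();
                for (BufferedReader br : readers) {
                    String value = br.readLine();
                    if (value == null) {
                        return rows; // a column is exhausted, so no further complete rows exist
                    }
                    row.add(value);
                }
                rows.add(row);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Every row needs all column streams open at once, which a loader that is handed a single input stream at a time cannot provide.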

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
