hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Reopened: (PIG-472) load files based on user provided regular expressions
Date Tue, 07 Oct 2008 16:32:44 GMT

     [ https://issues.apache.org/jira/browse/PIG-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates reopened PIG-472:
----------------------------


In general the patch looks good.  A couple of comments and a question:

1) You need to add the Apache License comment to the header of every file.  It is
present in some files but missing from others.

2) When you submit a patch, mark the JIRA as patch available.  The committer will mark it as
resolved when it's checked in.  I'm reopening all three and setting them to patch available.

The question: in RegExLoader.getNext(), you construct a new Matcher for every line.  Would
it be faster to construct one Matcher and call reset() on it for each line?
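
To illustrate the question, here is a minimal sketch of the reuse pattern.  It is not the
code from the patch: the class MatcherReuseSketch, the method parseLines(), and the sample
pattern are all hypothetical, and the use of matches() (rather than find()) is an assumption
about how the loader tests each line.  Only Pattern.matcher() and Matcher.reset(CharSequence)
are taken from java.util.regex.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the per-line work of a regex-based loader.
public class MatcherReuseSketch {

    // Returns the capture groups of every line that matches the whole pattern.
    public static List<List<String>> parseLines(Pattern pattern, List<String> lines) {
        List<List<String>> rows = new ArrayList<>();

        // Build the Matcher once, against an empty input.
        Matcher matcher = pattern.matcher("");

        for (String line : lines) {
            // Point the existing Matcher at the next line instead of
            // calling pattern.matcher(line) again.
            matcher.reset(line);

            // Assumption: a line counts only if the whole line matches.
            if (matcher.matches()) {
                List<String> groups = new ArrayList<>();
                // Group 0 is the whole match, so start at 1.
                for (int i = 1; i <= matcher.groupCount(); i++) {
                    groups.add(matcher.group(i));
                }
                rows.add(groups);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("(\\w+)=(\\d+)");
        List<String> lines = Arrays.asList("a=1", "skip me", "b=22");
        System.out.println(parseLines(pattern, lines));
    }
}
```

Whether this is measurably faster would need a benchmark on real input.  The sketch only
shows that a single Matcher can serve every line.  Note that a Matcher is not safe for
concurrent use, so one instance per loader (not a shared static) is the natural scope.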

> load files based on user provided regular expressions
> -----------------------------------------------------
>
>                 Key: PIG-472
>                 URL: https://issues.apache.org/jira/browse/PIG-472
>             Project: Pig
>          Issue Type: New Feature
>          Components: data, grunt
>    Affects Versions: 0.1.0
>            Reporter: Earl Cahill
>             Fix For: 0.1.0
>
>         Attachments: RegExLoader-PIG-472
>
>
> Want to be able to load files based on regular expressions.  Each group specified in
> parentheses should end up as a DataAtom, and the list of DataAtoms should end up in a Tuple.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

