hadoop-pig-dev mailing list archives

From "Christopher Hackman (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-1581) Parser fails to recognize semicolons in quoted strings
Date Mon, 30 Aug 2010 19:06:54 GMT
Parser fails to recognize semicolons in quoted strings
------------------------------------------------------

                 Key: PIG-1581
                 URL: https://issues.apache.org/jira/browse/PIG-1581
             Project: Pig
          Issue Type: Bug
          Components: grunt
    Affects Versions: 0.7.0
         Environment: CentOS 5.5
            Reporter: Christopher Hackman
            Priority: Minor


In some contexts, the parser mishandles a semicolon inside a quoted string and treats it
as the end of the statement.


Given an input file:

/test1.txt (in HDFS)
1;a
2;b
3;c
4;d
5;e


And the following Pig script:

REGISTER /tmp/piggybank.jar ;
DEFINE REGEXEXTRACTALL org.apache.pig.piggybank.evaluation.string.RegexExtractAll();
lines = LOAD '/test1.txt' AS (line:chararray);
delimited = FOREACH lines GENERATE FLATTEN (
        REGEXEXTRACTALL(line, '^(\\d+);(\\w+)$')
) AS (
        digit:int,
        word:chararray
);
DUMP delimited;


I receive the following error:

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Lexical error at line 5, column 40.  Encountered: <EOF> after : "\'^(\\\\d+);"
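
The error text stops right after the semicolon inside the quoted regex, which suggests the statement was cut there before it reached the lexer. The Python sketch below only illustrates that suspected failure mode; it is not Pig or Grunt code, and the function names and the single-line STATEMENT are hypothetical, chosen for the example.

```python
# Illustration only: NOT Pig's parser. All names here are hypothetical.

# A one-line version of the failing FOREACH statement (raw string, so the
# backslashes appear exactly as they do in the Pig script).
STATEMENT = r"delimited = FOREACH lines GENERATE FLATTEN(REGEXEXTRACTALL(line, '^(\\d+);(\\w+)$'));"


def split_naive(text):
    """Cut at every ';' without looking at quotes (the suspected behaviour)."""
    return [part.strip() for part in text.split(";") if part.strip()]


def split_quote_aware(text):
    """Cut at ';' only when outside a single-quoted string."""
    statements, current = [], []
    in_quote = False
    escaped = False
    for ch in text:
        if escaped:
            # Character following a backslash inside a quoted string.
            current.append(ch)
            escaped = False
            continue
        if ch == "\\" and in_quote:
            current.append(ch)
            escaped = True
            continue
        if ch == "'":
            in_quote = not in_quote
        if ch == ";" and not in_quote:
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


if __name__ == "__main__":
    for fragment in split_naive(STATEMENT):
        print("naive:", fragment)
    for statement in split_quote_aware(STATEMENT):
        print("aware:", statement)
```

With the naive split, the statement breaks into two fragments and the first one ends in an unterminated quote, which would be consistent with the <EOF> reported above. The quote-aware split keeps the statement whole.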

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

