hadoop-pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Xu Zhang (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (PIG-152) Solution for PIG-123 might have introduced parsing errors on Windows with the Pig unit tests
Date Tue, 18 Mar 2008 22:46:24 GMT

    [ https://issues.apache.org/jira/browse/PIG-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580153#action_12580153
] 

xuzh edited comment on PIG-152 at 3/18/08 3:44 PM:
-------------------------------------------------------

Attach the patch.  Changes include adding the static method encodeEscape() to Util.java and
calling this method in the test classes to replace single "\" with double "\".

      was (Author: xuzh):
    Attach the patch.  Changes include adding the static method encodeEscape() to Util.java
and calling this method in the test classes to replace "\" with "\\".
  
> Solution for PIG-123 might have introduced parsing errors on Windows with the Pig unit
tests
> --------------------------------------------------------------------------------------------
>
>                 Key: PIG-152
>                 URL: https://issues.apache.org/jira/browse/PIG-152
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Xu Zhang
>            Assignee: Pi Song
>         Attachments: PIG_152.patch
>
>
> *Since the changes for PIG-123 were checked in, we got lots of parsing errors when running
the unit tests on Windows.*
> It seems all these errors are generated when a query string that contains a single quote
for the path of the data file is passed into the PigServer::registerQuery() method.  
> Here is an example of such an error and the location where it is generated in the unit
test code (on the line "pig.registerQuery(query);" in the following code):
> {noformat}
> org.apache.pig.impl.logicalLayer.parser.TokenMgrError: Lexical error at line 1, column
39.  Encountered: "W" (87), after : "\'file:c:\\"
> 	at org.apache.pig.impl.logicalLayer.parser.QueryParserTokenManager.getNextToken(QueryParserTokenManager.java:1599)
> 	at org.apache.pig.impl.logicalLayer.parser.QueryParser.jj_consume_token(QueryParser.java:4069)
> 	at org.apache.pig.impl.logicalLayer.parser.QueryParser.FileName(QueryParser.java:717)
> 	at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:615)
> 	at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:497)
> 	at org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedExpr(QueryParser.java:425)
> 	at org.apache.pig.impl.logicalLayer.parser.QueryParser.GroupItem(QueryParser.java:1043)
> 	at org.apache.pig.impl.logicalLayer.parser.QueryParser.CogroupClause(QueryParser.java:996)
> 	at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:523)
> 	at org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedExpr(QueryParser.java:425)
> 	at org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:1364)
> 	at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:552)
> 	at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:373)
> 	at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:227)
> 	at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:47)
> 	at org.apache.pig.PigServer.registerQuery(PigServer.java:237)
> 	at org.apache.pig.test.TestAlgebraicEval.testSimpleCount(TestAlgebraicEval.java:50)
> Standard Output
> Starting DataNode 0 with dfs.data.dir: dfs\data\data1,dfs\data\data2
> Starting DataNode 1 with dfs.data.dir: dfs\data\data3,dfs\data\data4
> Starting DataNode 2 with dfs.data.dir: dfs\data\data5,dfs\data\data6
> Starting DataNode 3 with dfs.data.dir: dfs\data\data7,dfs\data\data8
> myid =  foreach (group (load 'file:c:\WINDOWS\TEMP\test39837txt') all) generate COUNT($1);
> myid = foreach (group (load 'file:c:\WINDOWS\TEMP\test39838txt') all) generate group,
COUNT($1) ;
> myid = foreach (group (load 'file:c:\WINDOWS\TEMP\test39839txt') all) generate COUNT($1),
group ;
> myid = foreach (group (load 'file:c:\WINDOWS\TEMP\test39840txt' using org.apache.pig.builtin.PigStorage(':'))
by $0) generate group, COUNT($1.$1) ;
> myid = foreach (group (load 'file:c:\WINDOWS\TEMP\test39841txt' using org.apache.pig.builtin.PigStorage(':'))
by $0) generate group, COUNT($1.$1), COUNT($1.$0) ;
> {noformat}
> Here is the corresponding code:
> {code}
> public class TestAlgebraicEval extends TestCase {
>     
> 	MiniCluster cluster = MiniCluster.buildCluster();
>     @Test
>     public void testSimpleCount() throws Throwable {
>         int LOOP_COUNT = 1024;
>         PigServer pig = new PigServer(MAPREDUCE);
>         File tmpFile = File.createTempFile("test", "txt");
>         PrintStream ps = new PrintStream(new FileOutputStream(tmpFile));
>         for(int i = 0; i < LOOP_COUNT; i++) {
>             ps.println(i);
>         }
>         ps.close();
>         String query = "myid =  foreach (group (load 'file:" + tmpFile + "') all) generate
COUNT($1);";
>         System.out.println(query);
>         pig.registerQuery(query);
>         Iterator it = pig.openIterator("myid");
>         tmpFile.delete();
>         Tuple t = (Tuple)it.next();
>         Double count = t.getAtomField(0).numval();
>         assertEquals(count, (double)LOOP_COUNT);
>     }
>     .
>     .
>     .
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message