pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Koji Noguchi (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (PIG-2752) Infinite (?) parser loop with complex FOREACH expression
Date Mon, 04 Nov 2013 20:04:18 GMT

     [ https://issues.apache.org/jira/browse/PIG-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Koji Noguchi resolved PIG-2752.
-------------------------------

    Resolution: Duplicate

I believe (and hope that) this is resolved in PIG-2769.  Closing as duplicate.

> Infinite (?) parser loop with complex FOREACH expression
> --------------------------------------------------------
>
>                 Key: PIG-2752
>                 URL: https://issues.apache.org/jira/browse/PIG-2752
>             Project: Pig
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.10.0
>         Environment: Linux
>            Reporter: Tushar Pradhan
>
> The following Pig script seems to hang in the parser for Pig 0.10.0. It works fine for
Pig 0.8.1.
> ----
> X = LOAD 'X' USING PigStorage(',') AS (
> term: chararray,dcount: long,dcount_0: long,dcount_1: long,dcount_2: long,dcount_4: long,dcount_5:
long,dcount_6: long,dcount_7: long,dcount_8: long,dcount_9: long,dcount_10: long,dcount_11:
long,dcount_12: long,dcount_13: long,dcount_U: long,dcount_L: long,dcount_C: long,dcount_M:
long,dcount_P: long,dcount_T: long,dcount_S: long,dcount_R: long,dcount_Z: long,dcount_K:
long);
> Y =
>     FOREACH X
>     GENERATE
>         term,
>         (
>             (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_1 >
1 OR dcount_1 == 1 AND dcount == 1) ? 1 : (
>             (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_2 >
1 OR dcount_2 == 1 AND dcount == 1) ? 2 : (
>             (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_7 >
1 OR dcount_7 == 1 AND dcount == 1) ? 7 : (
>             (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_9 >
1 OR dcount_9 == 1 AND dcount == 1) ? 9 : (
>             (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_11 >
1 OR dcount_11 == 1 AND dcount == 1) ? 11 : (
>             dcount_5 > 1 OR dcount_5 == 1 AND dcount == 1 ? 5 : (
>             dcount_6 > 1 OR dcount_6 == 1 AND dcount == 1 ? 6 : (
>             dcount_8 > 1 OR dcount_8 == 1 AND dcount == 1 ? 8 : (
>             dcount_10 > 1 OR dcount_10 == 1 AND dcount == 1 ? 10 : (
>             dcount_12 > 1 OR dcount_12 == 1 AND dcount == 1 ? 12 : (
>             (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_13 >
0 OR dcount_13 == 1 AND dcount == 1) ? 13 : (
>             dcount_4 > 0 ? 4 : 0)))))))))))
>         ) AS besttype;
> STORE Y INTO 'Y';
> ----
> 2012-06-12 08:04:46,435 [main] INFO  org.apache.pig.Main - Apache Pig version 0.10.0-SNAPSHOT
(rexported) compiled May 08 2012, 08:26:29
> 2012-06-12 08:04:46,435 [main] INFO  org.apache.pig.Main - Logging error messages to:
/tmp/pig_1339513486431.log
> 2012-06-12 08:04:46,950 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
- Connecting to hadoop file system at: file:///
> The hang occurs in both local and Hadoop modes
> If I simplify the 'besttype' expression in the FOREACH a bit, the script works fine.
> The input 'X' directory isn't necessary as the processing gets stuck in the parser, but
if needed, can contain a sample 'part-r-00000' file with the line:
> #1,49,1,0,0,0,0,0,0,0,0,0,0,0,48,0,0,0,0,49,1,2,0,0,43



--
This message was sent by Atlassian JIRA
(v6.1#6144)

Mime
View raw message