pig-dev mailing list archives

From "Murshid Chalaev (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-5019) Pig generates tons of warnings for udf with enabled warnings aggregation
Date Mon, 12 Sep 2016 09:28:20 GMT

     [ https://issues.apache.org/jira/browse/PIG-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Murshid Chalaev updated PIG-5019:
---------------------------------
    Attachment: PIG-5019_2.patch

> Pig generates tons of warnings for udf with enabled warnings aggregation
> ------------------------------------------------------------------------
>
>                 Key: PIG-5019
>                 URL: https://issues.apache.org/jira/browse/PIG-5019
>             Project: Pig
>          Issue Type: Bug
>          Components: internal-udfs
>    Affects Versions: 0.14.0
>            Reporter: Murshid Chalaev
>            Assignee: Murshid Chalaev
>             Fix For: 0.16.1
>
>         Attachments: PIG-5019.patch, PIG-5019_2.patch, input_example.gz, test_pig14_udf.pig
>
>
> For a data set containing 9 lines, the aggregated warning message is displayed:
> {code}
> 2016-09-01 19:40:33,664 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning UDF_WARNING_1 6 time(s).
> {code}
> but in the container logs I see a separate log message "Cannot extract group for input" for every non-matching value:
> {code}
> 2016-09-01 19:40:28,115 INFO [main] org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map: Aliases being processed per job phase (AliasName[line,offset]): M: b[10,4],b[-1,-1],extract_fields[17,17] C:  R:
> 2016-09-01 19:40:28,122 WARN [main] org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger: org.apache.pig.builtin.REGEX_EXTRACT(UDF_WARNING_1): RegexExtract : Cannot extract group for input /v1=1&v3=9
> 2016-09-01 19:40:28,124 WARN [main] org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger: org.apache.pig.builtin.REGEX_EXTRACT(UDF_WARNING_1): RegexExtract : Cannot extract group for input /v2=3&v3=7
> 2016-09-01 19:40:28,124 WARN [main] org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger: org.apache.pig.builtin.REGEX_EXTRACT(UDF_WARNING_1): RegexExtract : Cannot extract group for input /v1=4&v3=6
> 2016-09-01 19:40:28,125 WARN [main] org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger: org.apache.pig.builtin.REGEX_EXTRACT(UDF_WARNING_1): RegexExtract : Cannot extract group for input /v2=5&v3=5
> 2016-09-01 19:40:28,125 WARN [main] org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger: org.apache.pig.builtin.REGEX_EXTRACT(UDF_WARNING_1): RegexExtract : Cannot extract group for input /v1=8&v3=2
> 2016-09-01 19:40:28,125 WARN [main] org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger: org.apache.pig.builtin.REGEX_EXTRACT(UDF_WARNING_1): RegexExtract : Cannot extract group for input /v3=9&v2=1
> {code}
> It does not log the warning messages in the task logs.
> The patch for PIG-2207 was committed to Pig 0.13+.
> In 0.12 we had a single counter for all UDF warnings, but in 0.13+ we have a separate counter and message for every unique warning log line.
> For example, the two lines below are unique:
> /v2=3&v3=7
> /v1=4&v3=6
> That's why Pig prints both of them to the console.
> Printing a separate log message for every data line also slows down overall performance.
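
To illustrate the difference described above, here is a minimal Java sketch. It is not Pig's actual aggregation code: the class WarningAggregationSketch and both of its methods are hypothetical, and only the message text and the UDF_WARNING_1 name come from the logs in this issue. It assumes that aggregation amounts to counting warnings in a map, and compares keying that map by warning kind with keying it by the full message text.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not Pig's actual warning-aggregation code.
class WarningAggregationSketch {

    // One shared counter per warning kind (hypothetical stand-in for the
    // single-counter behaviour the report attributes to 0.12).
    static Map<String, Integer> countByKind(List<String> messages) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i < messages.size(); i++) {
            counts.merge("UDF_WARNING_1", 1, Integer::sum);
        }
        return counts;
    }

    // One counter per distinct message text (hypothetical stand-in for the
    // per-message behaviour the report attributes to 0.13+).
    static Map<String, Integer> countByMessage(List<String> messages) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String message : messages) {
            counts.merge(message, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Message text and input values copied from the task log in this issue.
        String prefix = "RegexExtract : Cannot extract group for input ";
        List<String> messages = Arrays.asList(
                prefix + "/v1=1&v3=9",
                prefix + "/v2=3&v3=7",
                prefix + "/v1=4&v3=6",
                prefix + "/v2=5&v3=5",
                prefix + "/v1=8&v3=2",
                prefix + "/v3=9&v2=1");

        System.out.println(countByKind(messages).size()
                + " counter(s) when keyed by warning kind");
        System.out.println(countByMessage(messages).size()
                + " counter(s) when keyed by message text");
    }
}
```

Because each message embeds the input value, every non-matching record yields a distinct key under the second scheme, so nothing is actually aggregated. Keying by warning kind collapses the same six records into one counter.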



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
