pig-dev mailing list archives

From "Rohini Palaniswamy (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-4757) Job stats on successfully read/output records wrong with multiple inputs/outputs
Date Wed, 06 Jan 2016 18:47:40 GMT

     [ https://issues.apache.org/jira/browse/PIG-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-4757:
------------------------------------
    Attachment: PIG-4757-3.patch

  Realized that the previous patch was missing the test case file; this patch adds it.
[~daijy], could you review it?

> Job stats on successfully read/output records wrong with multiple inputs/outputs
> --------------------------------------------------------------------------------
>
>                 Key: PIG-4757
>                 URL: https://issues.apache.org/jira/browse/PIG-4757
>             Project: Pig
>          Issue Type: Bug
>          Components: tez
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.16.0
>
>         Attachments: PIG-4757-1.patch, PIG-4757-2.patch, PIG-4757-3.patch
>
>
> TezVertexStats uses TaskCounter.INPUT_RECORDS_PROCESSED to display records read from
> MRInput. But in the case of a replicate join or a scalar, that counter also includes the
> replicate join input. Need to have a Pig-specific counter (MULTI_INPUTS_RECORD_COUNTER)
> in POSimpleTezLoad.
> TezVertexStats uses TaskCounter.OUTPUT_RECORDS to display records stored to MROutput
> if there is a single store. If there are multiple stores, it uses MULTI_STORE_RECORD_COUNTER
> and there are no issues. If there is a single store together with another output, then the
> value from OUTPUT_RECORDS is wrong. Need to use MULTI_STORE_RECORD_COUNTER in all cases,
> even when there is only a single store.
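
The two problems described above can be illustrated with a minimal, self-contained sketch.
It does not use the real Pig or Tez classes: counters are modelled as a plain map, and the
class name, the method names, the key format (counter name plus "_" plus an alias) and the
sample numbers are all assumptions made for illustration only. It only shows why a
vertex-wide counter over-reports once a vertex has a second input or output, and why a
per-load or per-store counter does not.

```java
import java.util.HashMap;
import java.util.Map;

public class CounterSketch {
    // Vertex-wide counter names taken from the issue text, modelled as strings.
    static final String INPUT_RECORDS_PROCESSED = "INPUT_RECORDS_PROCESSED";
    static final String OUTPUT_RECORDS = "OUTPUT_RECORDS";
    // Per-load / per-store prefixes; the "_<alias>" suffix is a hypothetical format.
    static final String MULTI_INPUTS = "MULTI_INPUTS_RECORD_COUNTER_";
    static final String MULTI_STORE = "MULTI_STORE_RECORD_COUNTER_";

    // Behaviour described as buggy: a single store reads the vertex-wide counter.
    static long storedOld(Map<String, Long> c, String store, int numStores) {
        if (numStores == 1) {
            return c.getOrDefault(OUTPUT_RECORDS, 0L);
        }
        return c.getOrDefault(MULTI_STORE + store, 0L);
    }

    // Proposed behaviour: always read the per-store counter.
    static long storedNew(Map<String, Long> c, String store) {
        return c.getOrDefault(MULTI_STORE + store, 0L);
    }

    // Behaviour described as buggy: records read come from the vertex-wide counter.
    static long readOld(Map<String, Long> c) {
        return c.getOrDefault(INPUT_RECORDS_PROCESSED, 0L);
    }

    // Proposed behaviour: read the per-load counter.
    static long readNew(Map<String, Long> c, String load) {
        return c.getOrDefault(MULTI_INPUTS + load, 0L);
    }

    // Made-up numbers: one load plus a replicate join input, one store plus another output.
    static Map<String, Long> sampleCounters() {
        Map<String, Long> c = new HashMap<>();
        c.put(MULTI_INPUTS + "A", 1000L);       // records read by the load itself
        c.put(INPUT_RECORDS_PROCESSED, 1200L);  // 1000 loaded + 200 from the replicate join input
        c.put(MULTI_STORE + "out1", 100L);      // records written by the store itself
        c.put(OUTPUT_RECORDS, 150L);            // 100 stored + 50 sent to the other output
        return c;
    }

    public static void main(String[] args) {
        Map<String, Long> c = sampleCounters();
        System.out.println("read   old=" + readOld(c) + " new=" + readNew(c, "A"));
        System.out.println("stored old=" + storedOld(c, "out1", 1) + " new=" + storedNew(c, "out1"));
    }
}
```

With these sample numbers the old lookups report 1200 records read and 150 records stored,
while the per-load and per-store lookups report 1000 and 100.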



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
