hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-12) Please add timestamps to pig map/reduce progress messages
Date Fri, 07 Dec 2007 17:41:43 GMT

     [ https://issues.apache.org/jira/browse/PIG-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Alan Gates updated PIG-12:

    Attachment: timestamps.diff

Patch uploaded on behalf of Patrick Hunt.  Comments from his email:

Here is the patch, some things to note:

1) -b/--brief gives brief logging, with no timestamps.
2) -4/--log4jconf lets the user specify a properties file that becomes "the" log4j configuration
(it overrides anything we might do).

3) I tried to keep the semantics of -v and -d the same; see the changes to Main.java. The main difference
is that the setting now applies to the root (everyone: pig, hadoop, etc.) rather than just pig, as it
did previously. You should verify backward compatibility (if you care about such things).

4) Some of the code uses System.out.println (for example, POMapreduce.java). As a result,
those messages won't have timestamps. You may or may not want to clean this up (35 matches in src
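
The behaviour described in points 1, 3 and 4 can be illustrated with a small, self-contained sketch. It is not taken from timestamps.diff: it uses java.util.logging from the JDK as a stand-in for log4j so that it runs without extra jars, and the names PigLogSetup, LineFormatter and configureRoot are hypothetical. It shows a "brief" format without timestamps, a level set on the root logger so that it affects every logger, and a System.out.println call that bypasses the logger and therefore carries no timestamp.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;
import java.util.logging.Formatter;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical illustration only; the real patch configures log4j, not java.util.logging.
final class PigLogSetup {

    // Formats one record per line; the timestamp prefix is dropped in brief mode.
    static final class LineFormatter extends Formatter {
        private final boolean brief;

        LineFormatter(boolean brief) {
            this.brief = brief;
        }

        @Override
        public String format(LogRecord record) {
            String body = record.getLevel() + " " + record.getLoggerName()
                    + " - " + formatMessage(record) + System.lineSeparator();
            if (brief) {
                return body;
            }
            SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS", Locale.ROOT);
            fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
            return fmt.format(new Date(record.getMillis())) + " " + body;
        }
    }

    // Applies the level and format to the root logger, so every child logger inherits them.
    static void configureRoot(boolean brief, Level level) {
        Logger root = Logger.getLogger("");
        root.setLevel(level);
        for (Handler handler : root.getHandlers()) {
            handler.setLevel(level);
            handler.setFormatter(new LineFormatter(brief));
        }
    }

    public static void main(String[] args) {
        boolean brief = args.length > 0 && args[0].equals("--brief");
        configureRoot(brief, Level.INFO);

        // Goes through the logger: timestamped unless --brief was given.
        Logger.getLogger("pig").info("----- MapReduce Job -----");

        // Bypasses the logger entirely, so it never gets a timestamp.
        System.out.println("50% done");
    }
}
```

Running the sketch with and without --brief shows the difference on the logged line, while the println line looks the same in both cases.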


> Please add timestamps to pig map/reduce progress messages
> ---------------------------------------------------------
>                 Key: PIG-12
>                 URL: https://issues.apache.org/jira/browse/PIG-12
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Olga Natkovich
>         Attachments: timestamps.diff
> From one of the users: 
> ------------------------------
> I'm spending a lot of time trying to optimize my pig queries for short
> run-times.  This process would be much easier if, in the progress output
> from pig (currently on stdout, but hopefully soon moving to stderr?!), the
> initiation and completion of each map/reduce job could be timestamped.  Pig
> already spits out messages of the form "----- MapReduce Job -----", "Input:
> ...", "Combine: ...", etc; could you just add a "Timestamp: ..."
> field as well?  Or ideally, both "Starting timestamp: ..." and "Finishing
> timestamp ...".
> Additional comments from another user:
> ------------------------------------------------------
> I'm adding my vote for this as well.
> I'd like to know timestamp and "running time" in seconds or D:H:M:S:
> Thu Oct 25 10:06:01 GMT 2007 (0:00:12:56): 56% done
> Starting and stopping timestamps in the log would also be valuable.
> Unfortunately, there's no "workaround" such as putting a date command before and after
> the pig command in logging --
> queuing times can be seconds to hours and completely mess up any notion of job execution
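
The progress line requested in the quoted comment, a wall-clock timestamp followed by the running time as D:H:M:S, could be produced along the following lines. This is only a sketch under assumptions: the class ProgressStamp and its methods are hypothetical names, not part of Pig or of the attached patch, and the time zone is pinned to GMT to match the example in the comment.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper; not part of Pig or of timestamps.diff.
final class ProgressStamp {

    // Formats a duration in seconds as D:HH:MM:SS, e.g. 776 seconds becomes 0:00:12:56.
    static String formatElapsed(long elapsedSeconds) {
        long days = elapsedSeconds / 86400;
        long hours = (elapsedSeconds % 86400) / 3600;
        long minutes = (elapsedSeconds % 3600) / 60;
        long seconds = elapsedSeconds % 60;
        return String.format(Locale.ROOT, "%d:%02d:%02d:%02d", days, hours, minutes, seconds);
    }

    // Builds a line of the form "<timestamp> (<elapsed>): <n>% done".
    static String progressLine(long startMillis, long nowMillis, int percentDone) {
        SimpleDateFormat fmt = new SimpleDateFormat("EEE MMM dd HH:mm:ss zzz yyyy", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        long elapsedSeconds = (nowMillis - startMillis) / 1000;
        return fmt.format(new Date(nowMillis))
                + " (" + formatElapsed(elapsedSeconds) + "): " + percentDone + "% done";
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long start = now - 776_000L; // pretend the job started 12 minutes 56 seconds ago
        System.out.println(progressLine(start, now, 56));
    }
}
```

Because the elapsed time is measured from the moment the job actually starts, it is not distorted by queuing delays in the way that wrapping the pig command in two date calls would be.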

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
