hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-12) Please add timestamps to pig map/reduce progress messages
Date Fri, 07 Dec 2007 17:41:43 GMT

     [ https://issues.apache.org/jira/browse/PIG-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-12:
--------------------------

    Attachment: timestamps.diff

Patch uploaded on behalf of Patrick Hunt.  Comments from his email:

Here is the patch, some things to note:

1) -b/--brief gives brief logging (no timestamps).

2) -4/--log4jconf lets the user specify a properties file that becomes "the" log4j
configuration (it overrides anything we might do).

3) I tried to keep the semantics of -v and -d the same; see the changes to Main.java. The
main difference is that the level now applies to the root logger (everyone: pig, hadoop,
etc.) rather than just pig, as it did previously. You should verify backward compatibility
(if you care about such things).

4) Some of the code uses System.out.println (for example POMapreduce.java). As a result,
those messages won't have a timestamp. You may or may not want to clean this up (35 matches
in the src hierarchy).

Patrick
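
To illustrate note 3, here is a minimal sketch of why setting a level on the root logger affects every library while setting it on a single named logger does not. It uses java.util.logging from the JDK as a stand-in so that it runs without extra jars; the patch itself is described as log4j-based, and this is not code from timestamps.diff. The logger names and the enableDebug helper are assumptions made for the example.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class RootLevelDemo {
    // Illustrative logger names; the real patch may configure different ones.
    static final Logger ROOT = Logger.getLogger("");
    static final Logger PIG = Logger.getLogger("org.apache.pig");
    static final Logger HADOOP = Logger.getLogger("org.apache.hadoop");

    /** Hypothetical helper: turn on debug output either globally or for pig only. */
    static void enableDebug(boolean applyToRoot) {
        // Reset to a known state: root at INFO, children inherit from root.
        ROOT.setLevel(Level.INFO);
        PIG.setLevel(null);
        HADOOP.setLevel(null);
        // FINE is java.util.logging's rough equivalent of a debug level.
        (applyToRoot ? ROOT : PIG).setLevel(Level.FINE);
    }

    static String report() {
        return "pig=" + PIG.isLoggable(Level.FINE)
            + " hadoop=" + HADOOP.isLoggable(Level.FINE);
    }

    public static void main(String[] args) {
        enableDebug(false);   // previous behaviour: pig logger only
        System.out.println("pig only: " + report());
        enableDebug(true);    // behaviour described in note 3: root logger
        System.out.println("root:     " + report());
    }
}
```

With the level on the pig logger only, hadoop debug messages stay suppressed; with the level on the root logger, both become visible, which is the compatibility difference worth checking.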


> Please add timestamps to pig map/reduce progress messages
> ---------------------------------------------------------
>
>                 Key: PIG-12
>                 URL: https://issues.apache.org/jira/browse/PIG-12
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Olga Natkovich
>         Attachments: timestamps.diff
>
>
> From one of the users: 
> ------------------------------
> I'm spending a lot of time trying to optimize my pig queries for short
> run-times.  This process would be much easier if, in the progress output
> from pig (currently on stdout, but hopefully soon moving to stderr?!), the
> initiation and completion of each map/reduce job could be timestamped.  Pig
> already spits out messages of the form "----- MapReduce Job -----", "Input:
> ...", "Combine: ...", etc; could you just add a "Timestamp: ..."
> field as well?  Or ideally, both "Starting timestamp: ..." and "Finishing
> timestamp ...".
> Additional comments from another user:
> ------------------------------------------------------
> I'm adding my vote for this as well.
> I'd like to know the timestamp and "running time" in seconds or D:H:M:S:
> Thu Oct 25 10:06:01 GMT 2007 (0:00:12:56): 56% done
> Starting and stopping timestamps in the log would also be valuable.
> Unfortunately, there's no "workaround" such as putting a date command before and after
> the pig command in logging --
> queuing times can be seconds to hours and completely mess up any notion of job execution
> time.
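
The progress line requested above could be produced along these lines. This is only a sketch of the formatting arithmetic, not code from the attached patch; the class and method names (ProgressStamp, elapsed, progressLine) are hypothetical, and the D:HH:MM:SS layout is inferred from the single example line in the request.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class ProgressStamp {
    // Same layout as the example line: "Thu Oct 25 10:06:01 GMT 2007".
    private static final DateTimeFormatter STAMP =
        DateTimeFormatter.ofPattern("EEE MMM dd HH:mm:ss 'GMT' yyyy", Locale.ENGLISH)
                         .withZone(ZoneOffset.UTC);

    /** Formats an elapsed duration as D:HH:MM:SS. */
    static String elapsed(Duration d) {
        long s = d.getSeconds();
        return String.format(Locale.ROOT, "%d:%02d:%02d:%02d",
            s / 86400,          // whole days
            (s % 86400) / 3600, // hours within the day
            (s % 3600) / 60,    // minutes within the hour
            s % 60);            // seconds within the minute
    }

    /** Builds one progress line: wall-clock time, running time, percent done. */
    static String progressLine(Instant start, Instant now, int percent) {
        return String.format(Locale.ROOT, "%s (%s): %d%% done",
            STAMP.format(now), elapsed(Duration.between(start, now)), percent);
    }

    public static void main(String[] args) {
        // Values taken from the example in the request: 12 min 56 s into the job.
        Instant now = Instant.parse("2007-10-25T10:06:01Z");
        Instant start = now.minusSeconds(12 * 60 + 56);
        System.out.println(progressLine(start, now, 56));
    }
}
```

Measuring the running time from the moment the job actually starts, rather than from when the pig command was issued, avoids the queuing-time problem mentioned above.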

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

