chukwa-dev mailing list archives

From "Jiaqi Tan (JIRA)" <>
Subject [jira] Commented: (CHUKWA-305) Inconsistent time inputs
Date Wed, 17 Jun 2009 18:52:07 GMT


Jiaqi Tan commented on CHUKWA-305:

It's not obvious whether this is a Chukwa bug or a Hadoop issue, but if Hadoop is emitting
logs whose timestamps carry no timezone, then there's nothing Chukwa can do about it.

To be more specific, I ran into this problem when I generated data using a cluster in EDT,
and then processed the data on a system in PDT. The times in the Job History logs are correct
since they are stored in UTC, but the log processing on the machine in PDT reads the EDT time
strings and assumes they are in PDT. As a result, the data from the text-based log sources
ends up 3 hours ahead of the data in the Job History logs.
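
A minimal sketch of the skew described above, not Chukwa's actual parsing code. It assumes
fixed offsets of UTC-4 for EDT and UTC-7 for PDT (both in effect in June 2009) and a
log4j-style "yyyy-MM-dd HH:mm:ss,SSS" timestamp. The timestamp value and the helper
parse_local are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets for June 2009 (daylight saving time in effect).
EDT = timezone(timedelta(hours=-4), "EDT")
PDT = timezone(timedelta(hours=-7), "PDT")


def parse_local(stamp: str, assumed_zone: timezone) -> datetime:
    """Parse a zone-less log timestamp and attach the zone the reader assumes.

    Hypothetical helper: the string itself carries no timezone, so the
    result depends entirely on which zone the processing side picks.
    """
    naive = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
    return naive.replace(tzinfo=assumed_zone)


# Made-up timestamp as a daemon on the EDT cluster might have written it.
stamp = "2009-06-17 14:52:07,000"

actual = parse_local(stamp, EDT)    # what the cluster meant
misread = parse_local(stamp, PDT)   # what the PDT processing machine assumes

# Subtracting aware datetimes compares them as instants in UTC.
skew = misread - actual

print("actual  (UTC):", actual.astimezone(timezone.utc))
print("misread (UTC):", misread.astimezone(timezone.utc))
print("skew:", skew)
```

The same string lands on two different instants depending on the assumed zone, while a
UTC epoch value such as the one in the Job History logs is unaffected by where it is read.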

> Inconsistent time inputs
> ------------------------
>                 Key: CHUKWA-305
>                 URL:
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection, Data Processors
>            Reporter: Jiaqi Tan
> Times in Job History logs are stored as UTC seconds since the Epoch, but times in log-based
> sources, e.g. clienttrace and the daemon (DataNode, TaskTracker, JobTracker, NameNode) logs,
> are written as ISO-8601 strings in the local timezone, without recording which timezone that
> is. This leads to inconsistencies when trying to correlate data in time across log-based
> sources and Job History data, because the human-readable time strings cannot be mapped to
> UTC without knowing the timezone in which they were recorded.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
