hadoop-hdfs-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From stu24m...@yahoo.com
Subject Re: hadoop log timestamps & file timestamps not same as system time
Date Thu, 22 Apr 2010 05:39:49 GMT
Cool thx!

I already moved ntp to NN (had the same thought). All machines sync to the same apt-mirror,
but I'll check the TZ files to make sure. Not much left to try...
I even formatted NN, and it didn't go away.

Next would be to dig up hadoop source for timestamps, and run it in a harness, I guess. Could
be java versioning/TZ issue, I suppose (Is java tzdata managed separately?)

We'll see

Sent from my Verizon Wireless BlackBerry

-----Original Message-----
From: Eric Sammer <esammer@cloudera.com>
Date: Thu, 22 Apr 2010 01:17:25 
To: <hdfs-user@hadoop.apache.org>
Subject: Re: hadoop log timestamps & file timestamps not same as system time

My initial inclination is to check the timezone of the machines. An
hour difference smells like DST shift. Check that the OS are the same
version / patch level (as zoneinfo files may have changed based on
legislation recently passed). Confirm that all machines are living in
the same zone. Sync'ing to ntp is good too although an hour outage
should be fine. If you want to protect against ntp drift due to
outage, you can run a canonical ntp server inside the cluster. Given
the NN has to be available for the cluster to function, that might be
a good place to run one and just have the DN / TT nodes sync to it.


On Wed, Apr 21, 2010 at 6:46 PM, Stuart Smith <stu24mail@yahoo.com> wrote:
> Hello,
>  I've noticed an odd situation:
> The timestamps in my hadoop namenode log (and filesystem) appear to be an hour behind
the actual time on the system (as supplied by `date`).
> I checked that this wasn't user error by stopping and starting hadoop, and checking the
most recent messages (as well as the last time the file was modified).
> For example, you can see the last modified time, and the timestamp on the shutdown log
message different significantly:
> stu@ubuntu-namenode:~/hadoop/logs$ ls -l hadoop-stu-namenode-ubuntu-namenode.log
> -rw-r--r-- 1 stu stu 22148124 2010-04-21 15:33 hadoop-stu-namenode-ubuntu-namenode.log
> stu@ubuntu-namenode:~/hadoop/logs$ tail -f hadoop-stu-namenode-ubuntu-namenode.log
> ....
> 2010-04-21 14:33:19,032 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at ubuntu-namenode/
> ************************************************************/
> I also checked this while hadoop was running: the messages are consistently an hour behind.
> More troubling, this appears affect the filesystem timestamps as well.
> stu@ubuntu-namenode:~/hadoop/bin$ date
> Wed Apr 21 15:53:34 PDT 2010
> stu@ubuntu-namenode:~/hadoop/bin$ ./hadoop dfs -touchz /test
> stu@ubuntu-namenode:~/hadoop/bin$ ./hadoop dfs -ls /
> Found 2 items
> drwxr-xr-x   - stu supergroup          0 2010-04-20 16:51 /home
> -rw-r--r--   3 stu supergroup          0 2010-04-21 14:53 /test
> Interestingly, the datanodes appear to be fine:
> stu@ubuntu-hadoop-2:~/hadoop/logs$ ls -l hadoop-stu-datanode-ubuntu-hadoop-2.log
> -rw-r--r-- 1 stu stu 4918505 2010-04-21 15:47 hadoop-stu-datanode-ubuntu-hadoop-2.log
> stu@ubuntu-hadoop-2:~/hadoop/logs$ tail -n 1 hadoop-stu-datanode-ubuntu-hadoop-2.log
> 2010-04-21 15:47:21,875 INFO org.mortbay.log: jetty-6.1.14
> Is there something wrong here? Am I just doing something crazy/dumb?
> Note that I discovered this after getting some rather odd behavior from hbase - hbase
appears to work fine, except it's continually adding the regionservers onto the list so, in
my itty-bitty cluster of about 4 machines + 1 namenode, the master reports back ~ 1000s of
regionservers and growing.. There are indications timing issues involved in that as well,
but I figure I should get hdfs working first :)
> Note that I do have NTP running to keep all these boxes in sync, but the NTP server will
go down for an hour or so intermittently (due to some network issues that don't involve the
hadoop cluster). All the system times on the machines appear to be (at least) within a few
seconds of each other.
> I'm running hadoop 0.20.2
> Any thoughts? Any debugging guidance? I'm pretty stumped.
> Take care,
>  -stu

Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com
View raw message