hadoop-mapreduce-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ted <r6squee...@gmail.com>
Subject Re: pauses during startup (maybe network related?)
Date Fri, 24 May 2013 02:12:48 GMT
thanks, I'm almost 100% sure it's network related now.

What I tested was unpluggin my network :), the entire system starts in
just a few seconds.

I decided to search on "reverse dns" in google and I see other people
have complained about very slow reverse dns lookups (some related to
hadoop / hbase too).

I'm not sure why this is happenning yet though. I thought 127.0.0.1 or
localhost would have just resolved instantly - but it appears it's
some how finding my real IP instead, i.e. 192.168.1.5 seems to show up
in the log entries even though all my configurations say
localhost/127.0.0.1 and my /etc/hosts file has and entry for
localhost/127.0.0.1

I think if I make a /etc/hosts entry for 192.168.1.5 everything will
be quick, that's what I'm going to test later. The only problem is I'm
on an dynamic IP... I've considered just making entries for all
reasonable permutations like 192.168.1.1 through 192.168.1.20... but
I'm still more just miffed at how it's knowing I'm a 192 address when
I told it to use localhost.

On 5/24/13, Chris Nauroth <cnauroth@hortonworks.com> wrote:
> Hi Ted,
>
> 2013-05-23 19:28:19,937 INFO
> org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names
> occuring more than 10 times
> ...
> 2013-05-23 19:28:26,801 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 28 on 9000: starting
>
> There are a couple of relevant activities that happen during namenode
> startup in between these 2 log statements.  It loads the current fsimage
> (persistent copy of file system metadata), merges in the edits log
> (transaction log containing all file system metadata changes since the last
> checkpoint), and then saves back a new fsimage file after that merge.
>  Current versions of the Hadoop codebase will print some information to
> logs about the volume of activity during this checkpointing process, so I
> recommend looking for that in your logs to see if this explains it.
>  Depending on whether or not your have a large number of transactions
> queued since your last checkpoint, this whole process can cause namenode
> startup to take several minutes.
>
> If this becomes a regular problem, then you can run SecondaryNameNode or
> BackupNode to perform periodic checkpoints in addition to the checkpoint
> that occurs on namenode restart.  This is probably overkill for a dev
> environment on your laptop though.
>
> Hope this helps,
>
> Chris Nauroth
> Hortonworks
> http://hortonworks.com/
>
>
>
> On Thu, May 23, 2013 at 2:49 AM, Ted <r6squeegee@gmail.com> wrote:
>
>> Hi I'm running hadoop on my local laptop for development and
>> everything "works" but there's some annoying pauses during the startup
>> which causes the entire hadoop startup process to take up to 4 minutes
>> and I'm wondering what it is and if I can do anything about it.
>>
>> I'm running everything on 1 machines, on fedora linux, hadoop-1.1.2,
>> oracle jkd1.7.0_17, the machine is a dual core i5, and I have 8gb of
>> ram and an SSD so it shouldn't be slow.
>>
>> When the system pauses, there is no cpu usage, no disk usage and no
>> network usage (although I suspect it's waiting for the network to
>> resolve or return something).
>>
>> Here's some snippets from the namenode logs during startup where you
>> can see it just pauses for around 30 seconds or more with out errors
>> or anything :
>>
>> ...
>> 2013-05-23 19:26:37,660 INFO
>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>> hadoop-metrics2.properties
>> 2013-05-23 19:26:37,676 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> MetricsSystem,sub=Stats registered.
>> 2013-05-23 19:27:54,144 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>> period at 10 second(s).
>> 2013-05-23 19:27:54,144 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics
>> system started
>> ...
>> 2013-05-23 19:27:54,341 WARN
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: The
>> dfs.support.append option is in your configuration, however append is
>> not supported. This configuration option is no longer required to
>> enable sync.
>> 2013-05-23 19:27:54,341 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>> isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s),
>> accessTokenLifetime=0 min(s)
>> 2013-05-23 19:28:19,918 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>> FSNamesystemStateMBean and NameNodeMXBean
>> 2013-05-23 19:28:19,937 INFO
>> org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names
>> occuring more than 10 times
>> ...
>> 2013-05-23 19:28:26,801 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 28 on 9000: starting
>> 2013-05-23 19:28:26,833 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 31 on 9000: starting
>> 2013-05-23 19:30:10,644 INFO org.apache.hadoop.hdfs.StateChange:
>> BLOCK* NameSystem.registerDatanode: node registration from
>> 127.0.0.1:50010 storage DS-651015167-192.168.1.5-50010-1369140176513
>> 2013-05-23 19:30:10,650 INFO org.apache.hadoop.net.NetworkTopology:
>> Adding a new node: /default-rack/127.0.0.1:50010
>>
>>
>> I already start the system with : export
>> HADOOP_OPTS="-Djava.net.preferIPv4Stack=true"
>> I only allocate : export HADOOP_HEAPSIZE=512 (but it's an empty hadoop
>> system, maybe just 1 or 2 test files less than 100k, and there's no
>> CPU usage so it doesn't look like it's GC thrashing)
>>
>> I should mention again, there's no errors and the system runs fine and
>> relatively speedy once started (considering it's on my laptop).
>>
>> Does anyone know what's causing these pauses? (and how I can get rid of
>> them)
>> Thanks.
>> --
>> Ted.
>>
>


-- 
Ted.

Mime
View raw message