hadoop-hdfs-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ted <r6squee...@gmail.com>
Subject Re: pauses during startup (maybe network related?)
Date Fri, 24 May 2013 08:36:55 GMT
just an FYI in case anyone finds this thread in a web search....

I just edited my /etc/hosts file and added a mapping of

     127.0.0.1 <my machine name>

and everything starts up almost instantly, the difference is night and day :)

On 5/24/13, Harsh J <harsh@cloudera.com> wrote:
> You are spot on about the DNS lookup slowing things down. I've faced
> the same issue (before I had a local network DNS set up for the WiFi
> network I use).
>
>> but I'm still more just miffed at how it's knowing I'm a 192 address when
>> I told it to use localhost.
>
> There's a few configs you need to additionally change to make a
> perfect localhost setup. Otherwise, there are defaults in Apache
> Hadoop that bind to 0.0.0.0 and report the current system hostname
> (which changes if you get onto a network), causing what you're seeing.
>
> On Fri, May 24, 2013 at 7:42 AM, Ted <r6squeegee@gmail.com> wrote:
>> thanks, I'm almost 100% sure it's network related now.
>>
>> What I tested was unpluggin my network :), the entire system starts in
>> just a few seconds.
>>
>> I decided to search on "reverse dns" in google and I see other people
>> have complained about very slow reverse dns lookups (some related to
>> hadoop / hbase too).
>>
>> I'm not sure why this is happenning yet though. I thought 127.0.0.1 or
>> localhost would have just resolved instantly - but it appears it's
>> some how finding my real IP instead, i.e. 192.168.1.5 seems to show up
>> in the log entries even though all my configurations say
>> localhost/127.0.0.1 and my /etc/hosts file has and entry for
>> localhost/127.0.0.1
>>
>> I think if I make a /etc/hosts entry for 192.168.1.5 everything will
>> be quick, that's what I'm going to test later. The only problem is I'm
>> on an dynamic IP... I've considered just making entries for all
>> reasonable permutations like 192.168.1.1 through 192.168.1.20... but
>> I'm still more just miffed at how it's knowing I'm a 192 address when
>> I told it to use localhost.
>>
>> On 5/24/13, Chris Nauroth <cnauroth@hortonworks.com> wrote:
>>> Hi Ted,
>>>
>>> 2013-05-23 19:28:19,937 INFO
>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names
>>> occuring more than 10 times
>>> ...
>>> 2013-05-23 19:28:26,801 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 28 on 9000: starting
>>>
>>> There are a couple of relevant activities that happen during namenode
>>> startup in between these 2 log statements.  It loads the current fsimage
>>> (persistent copy of file system metadata), merges in the edits log
>>> (transaction log containing all file system metadata changes since the
>>> last
>>> checkpoint), and then saves back a new fsimage file after that merge.
>>>  Current versions of the Hadoop codebase will print some information to
>>> logs about the volume of activity during this checkpointing process, so
>>> I
>>> recommend looking for that in your logs to see if this explains it.
>>>  Depending on whether or not your have a large number of transactions
>>> queued since your last checkpoint, this whole process can cause namenode
>>> startup to take several minutes.
>>>
>>> If this becomes a regular problem, then you can run SecondaryNameNode or
>>> BackupNode to perform periodic checkpoints in addition to the checkpoint
>>> that occurs on namenode restart.  This is probably overkill for a dev
>>> environment on your laptop though.
>>>
>>> Hope this helps,
>>>
>>> Chris Nauroth
>>> Hortonworks
>>> http://hortonworks.com/
>>>
>>>
>>>
>>> On Thu, May 23, 2013 at 2:49 AM, Ted <r6squeegee@gmail.com> wrote:
>>>
>>>> Hi I'm running hadoop on my local laptop for development and
>>>> everything "works" but there's some annoying pauses during the startup
>>>> which causes the entire hadoop startup process to take up to 4 minutes
>>>> and I'm wondering what it is and if I can do anything about it.
>>>>
>>>> I'm running everything on 1 machines, on fedora linux, hadoop-1.1.2,
>>>> oracle jkd1.7.0_17, the machine is a dual core i5, and I have 8gb of
>>>> ram and an SSD so it shouldn't be slow.
>>>>
>>>> When the system pauses, there is no cpu usage, no disk usage and no
>>>> network usage (although I suspect it's waiting for the network to
>>>> resolve or return something).
>>>>
>>>> Here's some snippets from the namenode logs during startup where you
>>>> can see it just pauses for around 30 seconds or more with out errors
>>>> or anything :
>>>>
>>>> ...
>>>> 2013-05-23 19:26:37,660 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>>>> hadoop-metrics2.properties
>>>> 2013-05-23 19:26:37,676 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>>> MetricsSystem,sub=Stats registered.
>>>> 2013-05-23 19:27:54,144 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>>>> period at 10 second(s).
>>>> 2013-05-23 19:27:54,144 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics
>>>> system started
>>>> ...
>>>> 2013-05-23 19:27:54,341 WARN
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: The
>>>> dfs.support.append option is in your configuration, however append is
>>>> not supported. This configuration option is no longer required to
>>>> enable sync.
>>>> 2013-05-23 19:27:54,341 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>> isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s),
>>>> accessTokenLifetime=0 min(s)
>>>> 2013-05-23 19:28:19,918 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>> FSNamesystemStateMBean and NameNodeMXBean
>>>> 2013-05-23 19:28:19,937 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names
>>>> occuring more than 10 times
>>>> ...
>>>> 2013-05-23 19:28:26,801 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 28 on 9000: starting
>>>> 2013-05-23 19:28:26,833 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 31 on 9000: starting
>>>> 2013-05-23 19:30:10,644 INFO org.apache.hadoop.hdfs.StateChange:
>>>> BLOCK* NameSystem.registerDatanode: node registration from
>>>> 127.0.0.1:50010 storage DS-651015167-192.168.1.5-50010-1369140176513
>>>> 2013-05-23 19:30:10,650 INFO org.apache.hadoop.net.NetworkTopology:
>>>> Adding a new node: /default-rack/127.0.0.1:50010
>>>>
>>>>
>>>> I already start the system with : export
>>>> HADOOP_OPTS="-Djava.net.preferIPv4Stack=true"
>>>> I only allocate : export HADOOP_HEAPSIZE=512 (but it's an empty hadoop
>>>> system, maybe just 1 or 2 test files less than 100k, and there's no
>>>> CPU usage so it doesn't look like it's GC thrashing)
>>>>
>>>> I should mention again, there's no errors and the system runs fine and
>>>> relatively speedy once started (considering it's on my laptop).
>>>>
>>>> Does anyone know what's causing these pauses? (and how I can get rid of
>>>> them)
>>>> Thanks.
>>>> --
>>>> Ted.
>>>>
>>>
>>
>>
>> --
>> Ted.
>
>
>
> --
> Harsh J
>


-- 
Ted.

Mime
View raw message