hadoop-general mailing list archives

From Wojciech Langiewicz <wlangiew...@gmail.com>
Subject Re: Problem with NameNode Storage: IMAGE_AND_EDITS Failed
Date Thu, 24 Mar 2011 09:25:59 GMT
Before that there's another exception, which suggests that someone changed 
the open-file limit from 64k back to 1024.

So what I would like to know is: is there a command to manually trigger 
saving the NameNode metadata, so that after a reboot everything will be OK?
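A possible answer to that question, depending on the Hadoop version, is the `dfsadmin -saveNamespace` command, which forces the NameNode to write a fresh fsimage; it refuses to run unless the filesystem is in safe mode first. A hedged sketch (guarded so it only acts when a `hadoop` CLI is actually on the PATH):

```shell
# Sketch: manually checkpoint the NameNode metadata (version-dependent).
# saveNamespace requires safe mode; run this on the NameNode host.
if command -v hadoop >/dev/null 2>&1; then
  hadoop dfsadmin -safemode enter
  hadoop dfsadmin -saveNamespace   # merge edits into a new fsimage on disk
  hadoop dfsadmin -safemode leave
else
  echo "hadoop CLI not on PATH; run this on the NameNode host"
fi
```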

java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:145)
        at org.mortbay.jetty.nio.SelectChannelConnector$1.acceptChannel(SelectChannelConnector.java:75)
        at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:498)
        at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:185)
        at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124)
        at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:707)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
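A "Too many open files" trace like the one above is the classic symptom of a low per-process file-descriptor limit (the default soft limit on many Linux distributions is 1024). A quick way to check the current limits (the `/proc` path is Linux-specific, and the NameNode pid is a placeholder to substitute):

```shell
# Soft and hard open-file limits for the current shell.
ulimit -Sn
ulimit -Hn
# For the running NameNode process itself (Linux), substitute its real pid:
# grep 'open files' /proc/<namenode-pid>/limits
```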

On 24.03.2011 10:01, Wojciech Langiewicz wrote:
> Hello,
> Yes, it's the only one, it has free space, and is not corrupted locally.
> I have found this exception in namenode logs:
> 2011-03-24 09:46:51,531 WARN org.mortbay.log: /getimage:
> java.io.IOException: GetImage failed. java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.getImageFile(FSImage.java:219)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.getFsImageName(FSImage.java:1584)
>     at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1.run(GetImageServlet.java:75)
>     at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1.run(GetImageServlet.java:70)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
>     at org.apache.hadoop.hdfs.server.namenode.GetImageServlet.doGet(GetImageServlet.java:70)
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>     at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
>     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1124)
>     at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:826)
>     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1115)
>     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:361)
>     at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>     at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
>     at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>     at org.mortbay.jetty.Server.handle(Server.java:324)
>     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
>     at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
>     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
>     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
>     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
>     at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
>     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
>
> It appears in the log every 5 minutes.
>
> On 24.03.2011 09:51, Harsh J wrote:
>> Hello,
>>
>> Can you verify and confirm if that location (is it the only one?) is a
>> valid one (as in, has free space left, is not corrupt, etc.)?
>>
>> Take a backup of whatever exists at your dfs.name.dir before you
>> proceed doing anything.
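The backup step Harsh recommends could be sketched like this; the paths are placeholders (`/srv/dfs/name` is the storage directory named in the thread, and the fallback to a temporary directory is only there so the sketch runs outside a real cluster):

```shell
# Hypothetical backup of dfs.name.dir before touching anything.
NAME_DIR="${NAME_DIR:-/srv/dfs/name}"          # your dfs.name.dir from hdfs-site.xml
[ -d "$NAME_DIR" ] || NAME_DIR="$(mktemp -d)"  # demo fallback so the sketch runs anywhere
BACKUP="/tmp/namenode-backup-$(date +%Y%m%d).tar.gz"
tar czf "$BACKUP" -C "$(dirname "$NAME_DIR")" "$(basename "$NAME_DIR")"
echo "wrote $BACKUP"
```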
>>
>> On Thu, Mar 24, 2011 at 2:13 PM, Wojciech Langiewicz
>> <wlangiewicz@gmail.com> wrote:
>>> Hello,
>>> Right now I'm having this issue on my cluster:
>>> On the NameNode web interface, at the bottom of the page, there is this table:
>>> NameNode Storage
>>> Storage Directory Type State
>>> /srv/dfs/name IMAGE_AND_EDITS Failed
>>>
>>> I have had this issue before; I rebooted Hadoop, but lost many files.
>>> What can I do now so that I don't lose the files created since the last
>>> backup, and what does this warning actually mean?
>>> Any help would be greatly appreciated.
>>> --
>>> Wojciech Langiewicz

