hadoop-common-user mailing list archives

From Ali S Kureishy <safdar.kurei...@gmail.com>
Subject Re: Consistent "register getProtocolVersion" error due to "Duplicate metricsName:getProtocolVersion" during cluster startup -- then various other errors during job execution
Date Sat, 25 Feb 2012 06:36:04 GMT
Hi again,

Would you be able to make any suggestions to the below?

Thanks in advance...

Safdar
On Feb 21, 2012 12:04 PM, "Ali S Kureishy" <safdar.kureishy@gmail.com>
wrote:

> Hi,
>
> I've got a pseudo-distributed Hadoop (v0.20.2) setup with 1 machine (with
> Ubuntu 10.04 LTS) running all the hadoop processes (NN + SNN + JT + TT +
> DN). I've also configured the files under conf/ so that the master is
> referred to by its actual machine name (in this case, *bali*), instead of
> localhost (however, the issue below is seen regardless). I was able to
> format HDFS successfully (by running hadoop namenode -format). However,
> right after I start the cluster using bin/start-all.sh, I see the
> following error in the NameNode's log file. It is logged at INFO level,
> but I believe it is the root cause of various other errors I encounter
> when executing actual Hadoop jobs. (For instance, at one point I see
> errors saying that the datanode and namenode were communicating using
> different protocol versions ... 3 vs. 6, etc.). Anyway, here is the
> initial error:
>
> 2012-02-21 09:01:42,015 INFO org.apache.hadoop.ipc.Server: Error
> register getProtocolVersion
> java.lang.IllegalArgumentException: Duplicate metricsName:getProtocolVersion
>         at org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
>         at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:89)
>         at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>
> I've scoured the web for other instances of this error, but none of the
> hits were helpful or relevant to my setup. My hunch is that this error is
> preventing the cluster from initializing correctly. I would have switched
> to a later version of Hadoop, but the Nutch v1.4 distribution I'm trying
> to run on top of Hadoop is, AFAIK, only compatible with Hadoop v0.20. I
> have attached all my hadoop config files (config.rar) to this email, in
> case you need to take a quick look. Below is my /etc/hosts configuration,
> in case the issue lies there. I believe this is a hadoop-specific issue,
> not a Nutch one, hence I am posting to the hadoop mailing list.
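>
> For reference, the conf/ files refer to the master along these lines (a
> minimal sketch using the standard 0.20 property names; the exact values
> are in the attached config.rar, and the ports shown are just typical
> choices for this kind of setup). conf/masters and conf/slaves likewise
> contain just the hostname bali:
>
> <!-- conf/core-site.xml: where the NameNode listens -->
> <property>
>   <name>fs.default.name</name>
>   <value>hdfs://bali:9000</value>
> </property>
>
> <!-- conf/mapred-site.xml: where the JobTracker listens -->
> <property>
>   <name>mapred.job.tracker</name>
>   <value>bali:9001</value>
> </property>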
>
> /etc/hosts:
> 127.0.0.1       localhost
> #127.0.1.1      bali
>
> # The following lines are desirable for IPv6 capable hosts
> ::1     localhost ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> ff00::0 ip6-mcastprefix
> ff02::1 ip6-allnodes
> ff02::2 ip6-allrouters
>
> 192.168.1.21 bali
>
> FILE-SYSTEM layout:
> Here's my filesystem layout. I've got all my hadoop configs pointing to
> folders under a root folder called /private/user/hadoop, with the
> following permissions:
> ls -l /private/user/
> total 4
> drwxrwxrwx 7 user alt 4096 Feb 21 09:06 hadoop
>
> ls -l /private/user/hadoop/
> total 20
> drwxr-xr-x 5 user alt 4096 Feb 21 09:01 data
> drwxr-xr-x 3 user alt 4096 Feb 21 09:07 mapred
> drwxr-xr-x 4 user alt 4096 Feb 21 08:59 name
> drwxr-xr-x 2 user alt 4096 Feb 21 08:59 pids
> drwxr-xr-x 3 user alt 4096 Feb 21 09:01 tmp
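>
> These folders are wired into the configs roughly as follows (again a
> sketch with the standard 0.20 property names; the exact entries are in
> config.rar):
>
> <!-- conf/hdfs-site.xml: on-disk locations -->
> <property>
>   <name>dfs.name.dir</name>
>   <value>/private/user/hadoop/name</value>  <!-- NameNode metadata -->
> </property>
> <property>
>   <name>dfs.data.dir</name>
>   <value>/private/user/hadoop/data</value>  <!-- DataNode blocks -->
> </property>
>
> <!-- conf/core-site.xml: scratch space -->
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/private/user/hadoop/tmp</value>
> </property>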
>
> Shortly after the getProtocolVersion error above, I start seeing these
> errors in the NameNode log:
> 2012-02-21 09:06:47,895 WARN org.mortbay.log: /getimage: java.io.IOException:
> GetImage failed. java.io.IOException: Server returned HTTP response code: 503
> for URL: http://192.168.1.21:50090/getimage?getimage=1
>         at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
>         at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:151)
>         at org.apache.hadoop.hdfs.server.namenode.GetImageServlet.doGet(GetImageServlet.java:58)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>         at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
>         at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
>         at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>         at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>         at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>         at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
>         at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>         at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>         at org.mortbay.jetty.Server.handle(Server.java:324)
>         at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
>         at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
>         at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
>         at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
>         at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
>         at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
>         at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
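>
> (For what it's worth, the URL in that stack trace is the
> SecondaryNameNode's HTTP endpoint. As far as I can tell, in 0.20 that
> address comes from an hdfs-site.xml entry like the one below; 50090 is
> the default port, and I have not knowingly overridden it.)
>
> <!-- conf/hdfs-site.xml: SNN HTTP/checkpoint endpoint (default shown) -->
> <property>
>   <name>dfs.secondary.http.address</name>
>   <value>0.0.0.0:50090</value>
> </property>
>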
> The log files of the JobTracker, TaskTracker and DataNode are all free of
> errors and exceptions at this point.
>
> I also run into problems whilst executing a Nutch job on this cluster,
> but I will wait to hear back about the above first before posting those,
> since they might get resolved in the process.
>
> Any suggestions would be greatly appreciated.
>
> Thanks in advance!
>
> Regards,
> Safdar
>
