hadoop-common-issues mailing list archives

From Andy Sautins <andy.saut...@returnpath.net>
Subject hadoop fsck through proxy...
Date Thu, 24 Sep 2009 18:59:23 GMT

   I looked in JIRA but didn't see this reported, so I thought I'd see what this list thinks.
 We've been using SOCKS proxying to access a Hadoop cluster, generally using the setup described
in the Cloudera blog post (http://www.cloudera.com/blog/2008/12/03/securing-a-hadoop-cluster-through-a-gateway/).
 This works great by setting hadoop.rpc.socket.factory.class.default to org.apache.hadoop.net.SocksSocketFactory.
 Generally things work well: hadoop dfs activity like -ls, -rmr, and -cat all works fine.  The
one command that doesn't work is fsck.  Note the following command and error:

hadoop fsck /
Exception in thread "main" java.net.NoRouteToHostException: No route to host

   Looking at org.apache.hadoop.hdfs.tools.DFSck (DFSck.java), the connection is created using
URLConnection, so it makes sense that it wouldn't work: it doesn't seem to use the socket factory.
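
   To illustrate the distinction, here is a minimal sketch (not the DFSck code, and not a proposed
patch) of how a plain java.net URL connection can be pointed at a SOCKS proxy explicitly, using
only the standard library. The class name, the helper methods, the proxy address localhost:1080,
and the namenode host, port, and fsck URL are all placeholders assumed for illustration. How the
proxy address would be read from the Hadoop configuration is left open.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical sketch: route a URLConnection through a SOCKS proxy.
public class SocksUrlConnectionSketch {

    /** Parses "host:port" into a SOCKS proxy descriptor. No connection is made here. */
    public static Proxy socksProxy(String hostPort) {
        int colon = hostPort.lastIndexOf(':');
        if (colon <= 0 || colon == hostPort.length() - 1) {
            throw new IllegalArgumentException("expected host:port, got: " + hostPort);
        }
        String host = hostPort.substring(0, colon);
        int port = Integer.parseInt(hostPort.substring(colon + 1));
        // createUnresolved skips the DNS lookup at construction time.
        return new Proxy(Proxy.Type.SOCKS, InetSocketAddress.createUnresolved(host, port));
    }

    /** Creates (but does not connect) a URLConnection that will use the given proxy. */
    public static URLConnection open(URL url, Proxy proxy) throws IOException {
        return url.openConnection(proxy);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder proxy address, e.g. a local "ssh -D 1080 gateway" tunnel.
        Proxy proxy = socksProxy("localhost:1080");
        // Placeholder namenode HTTP address and path; the real URL is built inside DFSck.
        URL url = URI.create("http://namenode.example.com:50070/fsck?path=/").toURL();
        URLConnection conn = open(url, proxy);
        System.out.println("Would fetch " + conn.getURL() + " via " + proxy);
    }
}
```

   By contrast, a bare url.openConnection() call with no Proxy argument does not consult the RPC
socket factory setting, which is consistent with the failure shown above.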

   To me this seems like an issue.  Can someone please confirm?  If it is, I'll file a JIRA.
 I'm happy to take a crack at making a change as well, though the easiest way to change it is
unclear to me.  I haven't run across any code in the codebase that uses
hadoop.rpc.socket.factory.class.default for HTTP connections.

   Any thoughts would be appreciated.


