hadoop-yarn-issues mailing list archives

From "Tsuyoshi Ozawa (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-3642) Hadoop2 yarn.resourcemanager.scheduler.address not loaded by RMProxy.java
Date Thu, 14 May 2015 00:14:59 GMT

     [ https://issues.apache.org/jira/browse/YARN-3642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Tsuyoshi Ozawa updated YARN-3642:
---------------------------------
    Description: 
There is an issue with Hadoop 2.7.0 in distributed operation: the datanode is unable to reach the YARN scheduler. In our yarn-site.xml, we have defined this address to be:

{code}
   <property>
      <name>yarn.resourcemanager.scheduler.address</name>
      <value>qadoop-nn001.apsalar.com:8030</value>
   </property>
{code}
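For context on why the log below shows the client connecting to /0.0.0.0:8030: when this property is not found in the loaded configuration, YARN falls back to its compiled-in default bind address (0.0.0.0 with the scheduler's default port 8030). The following is a minimal, self-contained sketch of that fallback behavior; the map-based config and the {{resolve}} helper are our own illustration, not Hadoop's actual Configuration API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of how a missing scheduler address falls back
// to YARN's compiled-in default. Hadoop's real lookup goes through
// Configuration.getSocketAddr(...) with YarnConfiguration defaults.
public class SchedulerAddressLookup {
    // Mirrors the default scheduler bind address (0.0.0.0) and port (8030).
    static final String DEFAULT_SCHEDULER_ADDRESS = "0.0.0.0:8030";

    static String resolve(Map<String, String> conf) {
        // If yarn-site.xml was never loaded (or the key is absent),
        // the default is returned and the client tries 0.0.0.0:8030.
        return conf.getOrDefault("yarn.resourcemanager.scheduler.address",
                DEFAULT_SCHEDULER_ADDRESS);
    }

    public static void main(String[] args) {
        Map<String, String> empty = new HashMap<>();
        System.out.println(resolve(empty)); // prints 0.0.0.0:8030

        Map<String, String> loaded = new HashMap<>();
        loaded.put("yarn.resourcemanager.scheduler.address",
                "qadoop-nn001.apsalar.com:8030");
        System.out.println(resolve(loaded)); // prints the configured address
    }
}
```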

But when running an Oozie job, the problem manifests in the job logs for the YARN container.
We see logs similar to the following showing the connection failure:

{quote}
Showing 4096 bytes of the full log:
[main] org.apache.hadoop.http.HttpServer2: Jetty bound to port 64065
2015-05-13 17:49:33,930 INFO [main] org.mortbay.log: jetty-6.1.26
2015-05-13 17:49:33,971 INFO [main] org.mortbay.log: Extract jar:file:/opt/local/hadoop/hadoop-2.7.0/share/hadoop/yarn/hadoop-yarn-common-2.7.0.jar!/webapps/mapreduce
to /var/tmp/Jetty_0_0_0_0_64065_mapreduce____.1ayyhk/webapp
2015-05-13 17:49:34,234 INFO [main] org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:64065
2015-05-13 17:49:34,234 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Web app /mapreduce
started at 64065
2015-05-13 17:49:34,645 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Registered webapp
guice modules
2015-05-13 17:49:34,651 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue
class java.util.concurrent.LinkedBlockingQueue
2015-05-13 17:49:34,652 INFO [Socket Reader #1 for port 38927] org.apache.hadoop.ipc.Server:
Starting Socket Reader #1 for port 38927
2015-05-13 17:49:34,660 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server
Responder: starting
2015-05-13 17:49:34,660 INFO [IPC Server listener on 38927] org.apache.hadoop.ipc.Server:
IPC Server listener on 38927: starting
2015-05-13 17:49:34,700 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor:
nodeBlacklistingEnabled:true
2015-05-13 17:49:34,700 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor:
maxTaskFailuresPerNode is 3
2015-05-13 17:49:34,700 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor:
blacklistDisablePercent is 33
2015-05-13 17:49:34,775 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager
at /0.0.0.0:8030
2015-05-13 17:49:35,820 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server:
0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2015-05-13 17:49:36,821 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server:
0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2015-05-13 17:49:37,823 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server:
0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2015-05-13 17:49:38,824 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server:
0.0.0.0/0.0.0.0:8030. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2015-05-13 17:49:39,825 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server:
0.0.0.0/0.0.0.0:8030. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2015-05-13 17:49:40,826 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server:
0.0.0.0/0.0.0.0:8030. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2015-05-13 17:49:41,827 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server:
0.0.0.0/0.0.0.0:8030. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2015-05-13 17:49:42,828 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server:
0.0.0.0/0.0.0.0:8030. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2015-05-13 17:49:43,829 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server:
0.0.0.0/0.0.0.0:8030. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2015-05-13 17:49:44,830 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server:
0.0.0.0/0.0.0.0:8030. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
{quote}
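The retry behavior in the log above matches a fixed-sleep policy: up to 10 attempts, sleeping 1000 ms between them, before the failure surfaces. A simplified sketch of those semantics follows; this is our own illustration of what RetryUpToMaximumCountWithFixedSleep does, not Hadoop's implementation:

```java
import java.util.concurrent.Callable;

// Simplified illustration of RetryUpToMaximumCountWithFixedSleep(maxRetries,
// sleepTime): keep calling, sleep a fixed interval between failures, and
// rethrow the last failure once the retry budget is spent.
public class FixedSleepRetry {
    static <T> T callWithRetries(Callable<T> call, int maxRetries, long sleepMillis)
            throws Exception {
        Exception last = null;
        for (int tried = 0; tried < maxRetries; tried++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                // Mirrors the log's "Already tried N time(s)" counter.
                System.out.println("Already tried " + tried + " time(s)");
                Thread.sleep(sleepMillis);
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // A server that is never reachable: every attempt fails,
        // just like the connection attempts to 0.0.0.0:8030 above.
        try {
            callWithRetries(() -> { throw new RuntimeException("connection refused"); }, 3, 10);
        } catch (Exception e) {
            System.out.println("gave up: " + e.getMessage());
        }
    }
}
```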


To prove the problem, we have patched the file:
{code}
hadoop-2.7.0/src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMProxy.java
{code}

so that we now "inject" the yarn.resourcemanager.scheduler.address directly into the configuration.

The modified code looks like this:

{code}
  @Private
  protected static <T> T createRMProxy(final Configuration configuration,
      final Class<T> protocol, RMProxy instance) throws IOException {
    YarnConfiguration conf = (configuration instanceof YarnConfiguration)
        ? (YarnConfiguration) configuration
        : new YarnConfiguration(configuration);
    LOG.info("LEE: changing the conf to include yarn.resourcemanager.scheduler.address at 10.1.26.1");
    conf.set("yarn.resourcemanager.scheduler.address", "10.1.26.1");
    RetryPolicy retryPolicy = createRetryPolicy(conf);
    if (HAUtil.isHAEnabled(conf)) {
      RMFailoverProxyProvider<T> provider =
          instance.createRMFailoverProxyProvider(conf, protocol);
      return (T) RetryProxy.create(protocol, provider, retryPolicy);
    } else {
      InetSocketAddress rmAddress = instance.getRMAddress(conf, protocol);
      LOG.info("LEE: Connecting to ResourceManager at " + rmAddress);
      T proxy = RMProxy.<T>getProxy(conf, protocol, rmAddress);
      return (T) RetryProxy.create(protocol, proxy, retryPolicy);
    }
  }
{code}
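Note that the injected value omits a port; when resolving an RM address, Hadoop's socket-address parsing supplies the default port if none is given, so "10.1.26.1" is treated as "10.1.26.1:8030" for the scheduler. A self-contained sketch of that parsing behavior (our own helper, not NetUtils.createSocketAddr):

```java
// Illustrates "host[:port]" handling when resolving RM addresses:
// a bare host gets the component's default port (8030 for the scheduler).
public class AddressWithDefaultPort {
    static String withDefaultPort(String address, int defaultPort) {
        // If the address already carries a port, keep it; otherwise append
        // the default, matching the fallback Hadoop applies during parsing.
        return address.contains(":") ? address : address + ":" + defaultPort;
    }

    public static void main(String[] args) {
        System.out.println(withDefaultPort("10.1.26.1", 8030));
        // prints 10.1.26.1:8030
        System.out.println(withDefaultPort("qadoop-nn001.apsalar.com:8030", 8030));
        // prints qadoop-nn001.apsalar.com:8030
    }
}
```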


> Hadoop2 yarn.resourcemanager.scheduler.address not loaded by RMProxy.java
> -------------------------------------------------------------------------
>
>                 Key: YARN-3642
>                 URL: https://issues.apache.org/jira/browse/YARN-3642
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.7.0
>         Environment: yarn-site.xml:
> <configuration>
>    <property>
>       <name>yarn.nodemanager.aux-services</name>
>       <value>mapreduce_shuffle</value>
>    </property>
>    <property>
>       <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
>       <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>    </property>
>    <property>
>       <name>yarn.resourcemanager.hostname</name>
>       <value>qadoop-nn001.apsalar.com</value>
>    </property>
>    <property>
>       <name>yarn.resourcemanager.scheduler.address</name>
>       <value>qadoop-nn001.apsalar.com:8030</value>
>    </property>
>    <property>
>       <name>yarn.resourcemanager.address</name>
>       <value>qadoop-nn001.apsalar.com:8032</value>
>    </property>
>    <property>
>       <name>yarn.resourcemanager.webap.address</name>
>       <value>qadoop-nn001.apsalar.com:8088</value>
>    </property>
>    <property>
>       <name>yarn.resourcemanager.resource-tracker.address</name>
>       <value>qadoop-nn001.apsalar.com:8031</value>
>    </property>
>    <property>
>       <name>yarn.resourcemanager.admin.address</name>
>       <value>qadoop-nn001.apsalar.com:8033</value>
>    </property>
>    <property>
>       <name>yarn.log-aggregation-enable</name>
>       <value>true</value>
>    </property>
>    <property>
>       <description>Where to aggregate logs to.</description>
>       <name>yarn.nodemanager.remote-app-log-dir</name>
>       <value>/var/log/hadoop/apps</value>
>    </property>
>    <property>
>       <name>yarn.web-proxy.address</name>
>       <value>qadoop-nn001.apsalar.com:8088</value>
>    </property>
> </configuration>
> core-site.xml:
> <configuration>
>    <property>
>       <name>fs.defaultFS</name>
>       <value>hdfs://qadoop-nn001.apsalar.com</value>
>    </property>
>    <property>
>       <name>hadoop.proxyuser.hdfs.hosts</name>
>       <value>*</value>
>    </property>
>    <property>
>       <name>hadoop.proxyuser.hdfs.groups</name>
>       <value>*</value>
>    </property>
> </configuration>
> hdfs-site.xml:
> <configuration>
>    <property>
>       <name>dfs.replication</name>
>       <value>2</value>
>    </property>
>    <property>
>       <name>dfs.namenode.name.dir</name>
>       <value>file:/hadoop/nn</value>
>    </property>
>    <property>
>       <name>dfs.datanode.data.dir</name>
>       <value>file:/hadoop/dn/dfs</value>
>    </property>
>    <property>
>       <name>dfs.http.address</name>
>       <value>qadoop-nn001.apsalar.com:50070</value>
>    </property>
>    <property>
>       <name>dfs.secondary.http.address</name>
>       <value>qadoop-nn002.apsalar.com:50090</value>
>    </property>
> </configuration>
> mapred-site.xml:
> <configuration>
>    <property> 
>       <name>mapred.job.tracker</name> 
>       <value>qadoop-nn001.apsalar.com:8032</value> 
>    </property>
>    <property>
>       <name>mapreduce.framework.name</name>
>       <value>yarn</value>
>    </property>
>    <property>
>       <name>mapreduce.jobhistory.address</name>
>       <value>qadoop-nn001.apsalar.com:10020</value>
>       <description>the JobHistoryServer address.</description>
>    </property>
>    <property>  
>       <name>mapreduce.jobhistory.webapp.address</name>  
>       <value>qadoop-nn001.apsalar.com:19888</value>  
>       <description>the JobHistoryServer web address</description>
>    </property>
> </configuration>
> hbase-site.xml:
> <configuration>
>     <property> 
>         <name>hbase.master</name> 
>         <value>qadoop-nn001.apsalar.com:60000</value> 
>     </property> 
>     <property> 
>         <name>hbase.rootdir</name> 
>         <value>hdfs://qadoop-nn001.apsalar.com:8020/hbase</value> 
>     </property> 
>     <property> 
>         <name>hbase.cluster.distributed</name> 
>         <value>true</value> 
>     </property> 
>     <property>
>         <name>hbase.zookeeper.property.dataDir</name>
>         <value>/opt/local/zookeeper</value>
>     </property> 
>     <property>
>         <name>hbase.zookeeper.property.clientPort</name>
>         <value>2181</value> 
>     </property>
>     <property> 
>         <name>hbase.zookeeper.quorum</name> 
>         <value>qadoop-nn001.apsalar.com</value> 
>     </property> 
>     <property> 
>         <name>zookeeper.session.timeout</name> 
>         <value>180000</value> 
>     </property> 
> </configuration>
>            Reporter: Lee Hounshell
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
