flink-user mailing list archives

From Niels Basjes <Ni...@basjes.nl>
Subject Re: HBase config settings go missing within Yarn.
Date Thu, 26 Oct 2017 08:35:01 GMT
I have an idea for how we can reduce the impact of this class of problem.
If we can detect that we are running in a distributed environment, then in
order to use HBase you MUST have an hbase-site.xml.

I'll see if I can make a proof of concept.
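
As a rough idea of what such a check could look like, here is a minimal sketch. It assumes the check is simply "is hbase-site.xml visible to the class loader?" and leaves out how to detect a distributed environment. The class name HBaseSiteCheck and its method names are made up for illustration; only ClassLoader.getResource is real JDK API.

```java
import java.net.URL;

/** Hypothetical helper: fail fast when a required config file is not on the classpath. */
class HBaseSiteCheck {

    /** Returns the URL of the resource as seen by the given class loader, or null if it is absent. */
    static URL locate(ClassLoader loader, String resourceName) {
        return loader.getResource(resourceName);
    }

    /** Throws if the resource is missing, so a job fails at startup instead of silently using defaults. */
    static URL require(ClassLoader loader, String resourceName) {
        URL url = locate(loader, resourceName);
        if (url == null) {
            throw new IllegalStateException(
                resourceName + " not found on the classpath; refusing to fall back to built-in defaults");
        }
        return url;
    }

    public static void main(String[] args) {
        ClassLoader loader = HBaseSiteCheck.class.getClassLoader();
        URL site = locate(loader, "hbase-site.xml");
        System.out.println(site == null
            ? "hbase-site.xml is NOT on the classpath"
            : "hbase-site.xml found at " + site);
    }
}
```

Calling require(...) at the start of configure() would turn the silent fallback to localhost into an immediate, descriptive failure.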

Niels

On Wed, Oct 25, 2017 at 11:27 AM, Till Rohrmann <trohrmann@apache.org>
wrote:

> Hi Niels,
>
> good to see that you solved your problem.
>
> I’m not entirely sure how Pig does it, but I assume that there must be
> some kind of HBase support where the HBase specific files are explicitly
> sent to the cluster or that it copies the environment variables. For Flink,
> supporting this kind of behaviour is not really feasible because there are
> simply too many potential projects to support out there.
>
> The Flink idiomatic way would be either to read the config on the client,
> put it in the closure of the operator and then send it in serialized form
> to the cluster, or to set the correct environment variables when starting
> your Flink job cluster, by using env.java.opts or by extending the class
> path information as you did.
>
> The following code shows the closure approach.
>
> public class Main {
>   private static final Logger LOG = LoggerFactory.getLogger(Main.class);
>
>   public static void main(String[] args) throws Exception {
>     printZookeeperConfig();
>     final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment().setParallelism(1);
>     env.createInput(new HBaseSource(HBaseConfiguration.create())).print();
>     env.execute("HBase config problem");
>   }
>
>   public static void printZookeeperConfig() {
>     String zookeeper = HBaseConfiguration.create().get("hbase.zookeeper.quorum");
>     LOG.info("----> Loading HBaseConfiguration: Zookeeper = {}", zookeeper);
>   }
>
>   public static class HBaseSource extends AbstractTableInputFormat<String> {
>
>     // HBase configuration read on the client
>     private final org.apache.hadoop.conf.Configuration hConf;
>
>     public HBaseSource(org.apache.hadoop.conf.Configuration hConf) {
>       this.hConf = Preconditions.checkNotNull(hConf);
>     }
>
>     @Override
>     public void configure(org.apache.flink.configuration.Configuration parameters) {
>       table = createTable();
>       if (table != null) {
>         scan = getScanner();
>       }
>     }
>
>     private HTable createTable() {
>       printZookeeperConfig();
>
>       try {
>         return new HTable(hConf, getTableName());
>       } catch (Exception e) {
>         LOG.error("Error instantiating a new HTable instance", e);
>       }
>       return null;
>     }
>
>     @Override
>     public String getTableName() {
>       return "bugs:flink";
>     }
>
>     @Override
>     protected String mapResultToOutType(Result result) {
>       return new String(result.getFamilyMap("v".getBytes(UTF_8)).get("column".getBytes(UTF_8)));
>     }
>
>     @Override
>     protected Scan getScanner() {
>       return new Scan();
>     }
>   }
> }
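
One caveat worth checking before relying on the closure approach: as far as I know, Hadoop's Configuration implements Writable rather than java.io.Serializable, so keeping it directly in a field of a serialized operator may fail. A minimal sketch of one way around that, assuming it is acceptable to ship only the key/value pairs, is shown below. ConfigSnapshot is a hypothetical class name, and the example uses a plain Map so that it runs with the JDK alone.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical serializable snapshot of configuration key/value pairs. */
class ConfigSnapshot implements Serializable {
    private static final long serialVersionUID = 1L;

    private final HashMap<String, String> entries = new HashMap<>();

    /**
     * Copies every entry from the source. Hadoop's Configuration is an
     * Iterable of Map.Entry<String, String>, so it could be passed in here.
     */
    static ConfigSnapshot of(Iterable<Map.Entry<String, String>> source) {
        ConfigSnapshot snapshot = new ConfigSnapshot();
        for (Map.Entry<String, String> e : source) {
            snapshot.entries.put(e.getKey(), e.getValue());
        }
        return snapshot;
    }

    String get(String key) {
        return entries.get(key);
    }

    /** Java-serializes the snapshot, as would happen when it travels inside an operator closure. */
    static byte[] toBytes(ConfigSnapshot snapshot) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(snapshot);
        }
        return bytes.toByteArray();
    }

    /** Restores a snapshot from its serialized form, as the worker side would. */
    static ConfigSnapshot fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ConfigSnapshot) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> clientSide = new HashMap<>();
        clientSide.put("hbase.zookeeper.quorum", "node1:2181,node2:2181,node3:2181"); // hypothetical hosts
        ConfigSnapshot shipped = fromBytes(toBytes(of(clientSide.entrySet())));
        System.out.println("quorum on the worker = " + shipped.get("hbase.zookeeper.quorum"));
    }
}
```

On the worker, configure() could then start from HBaseConfiguration.create() and copy each shipped entry into it with Configuration.set(key, value) before opening the table.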
>
> Cheers,
> Till
>
> On Tue, Oct 24, 2017 at 11:51 AM, Niels Basjes <Niels@basjes.nl> wrote:
>
>> I changed my cluster config (on all nodes) to include the HBase config
>> dir in the classpath.
>> Now everything works as expected.
>>
>> This may very well be a misconfiguration of my cluster.
>> However ...
>> My current assessment:
>> Tools like Pig use the HBase config which has been specified on the LOCAL
>> machine. This allows running on a cluster where HBase is not locally
>> defined.
>> Apparently Flink currently uses the HBase config which has been specified
>> on the REMOTE machine. This limits jobs to ONLY use the HBase that is
>> defined on the cluster.
>>
>> At this point I'm unsure which is the right approach.
>>
>> Niels Basjes
>>
>> On Tue, Oct 24, 2017 at 11:29 AM, Niels Basjes <Niels@basjes.nl> wrote:
>>
>>> Minor correction: The HBase jar files are on the classpath, just in a
>>> different order.
>>>
>>> On Tue, Oct 24, 2017 at 11:18 AM, Niels Basjes <Niels@basjes.nl> wrote:
>>>
>>>> I did some more digging.
>>>>
>>>> I added extra code to print both the environment variables and the
>>>> classpath that is used by the HBaseConfiguration to load the resource files.
>>>> I call this both locally and during startup of the job (i.e. these logs
>>>> arrive in the jobmanager.log on the cluster)
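
For readers who want to run a similar diagnostic, here is a minimal sketch. It is not the code used for the logs below: ClasspathReport is a hypothetical name, it only reports variables ending in _CONF_DIR, and it reads the java.class.path system property instead of walking a URLClassLoader (on Java 9 and later the application class loader is no longer a URLClassLoader). Note that java.class.path reflects the JVM startup classpath only.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

/** Hypothetical diagnostic: report config-dir environment variables and classpath entries. */
class ClasspathReport {

    /** Selects the environment variables whose name ends in _CONF_DIR, sorted by name. */
    static Map<String, String> confDirs(Map<String, String> env) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> e : env.entrySet()) {
            if (e.getKey().endsWith("_CONF_DIR")) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    /** Splits a java.class.path style string into its non-empty entries, preserving order. */
    static List<String> classpathEntries(String classpath, String separator) {
        List<String> entries = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        confDirs(System.getenv()).forEach((name, value) -> System.out.println(name + " = " + value));
        for (String entry : classpathEntries(System.getProperty("java.class.path"), File.pathSeparator)) {
            System.out.println("----> ClassPath = " + entry);
        }
    }
}
```

Running the same report on the client and inside configure() on the cluster makes it easy to compare the two listings side by side.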
>>>>
>>>> Summary of what I found locally:
>>>>
>>>> Environment
>>>> 2017-10-24 08:50:15,612 INFO  com.bol.bugreports.Main - HADOOP_CONF_DIR = /etc/hadoop/conf/
>>>> 2017-10-24 08:50:15,613 INFO  com.bol.bugreports.Main - HBASE_CONF_DIR = /etc/hbase/conf/
>>>> 2017-10-24 08:50:15,613 INFO  com.bol.bugreports.Main - FLINK_CONF_DIR = /usr/local/flink-1.3.2/conf
>>>> 2017-10-24 08:50:15,613 INFO  com.bol.bugreports.Main - HIVE_CONF_DIR = /etc/hive/conf/
>>>> 2017-10-24 08:50:15,613 INFO  com.bol.bugreports.Main - YARN_CONF_DIR = /etc/hadoop/conf/
>>>>
>>>> ClassPath
>>>> 2017-10-24 08:50:15,614 INFO  com.bol.bugreports.Main - --> HBaseConfiguration: URLClassLoader = sun.misc.Launcher$AppClassLoader@1b6d3586
>>>> 2017-10-24 08:50:15,614 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/local/flink-1.3.2/lib/flink-python_2.11-1.3.2.jar
>>>> 2017-10-24 08:50:15,614 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/local/flink-1.3.2/lib/flink-shaded-hadoop2-uber-1.3.2.jar
>>>> 2017-10-24 08:50:15,614 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/local/flink-1.3.2/lib/joda-time-2.9.1.jar
>>>> 2017-10-24 08:50:15,614 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/local/flink-1.3.2/lib/log4j-1.2.17.jar
>>>> 2017-10-24 08:50:15,614 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/local/flink-1.3.2/lib/slf4j-log4j12-1.7.7.jar
>>>> 2017-10-24 08:50:15,614 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/local/flink-1.3.2/lib/flink-dist_2.11-1.3.2.jar
>>>> 2017-10-24 08:50:15,614 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/home/nbasjes/FlinkHBaseConnect/
>>>> 2017-10-24 08:50:15,614 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/etc/hadoop/conf/
>>>> 2017-10-24 08:50:15,614 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/etc/hadoop/conf/
>>>> 2017-10-24 08:50:15,615 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/etc/hbase/conf/
>>>> 2017-10-24 08:50:15,615 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.141-2.b16.el6_9.x86_64/lib/tools.jar
>>>> 2017-10-24 08:50:15,615 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/hdp/2.3.4.0-3485/hbase/
>>>> 2017-10-24 08:50:15,615 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/hdp/2.3.4.0-3485/hbase/lib/activation-1.1.jar
>>>> 2017-10-24 08:50:15,615 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/hdp/2.3.4.0-3485/hbase/lib/aopalliance-1.0.jar
>>>> 2017-10-24 08:50:15,615 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/hdp/2.3.4.0-3485/hbase/lib/apacheds-i18n-2.0.0-M15.jar
>>>> 2017-10-24 08:50:15,615 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/hdp/2.3.4.0-3485/hbase/lib/apacheds-kerberos-codec-2.0.0-M
>>>> ...
>>>>
>>>>
>>>>
>>>> On the cluster node in the jobmanager.log:
>>>>
>>>> ENVIRONMENT
>>>> 2017-10-24 10:50:19,971 INFO  com.bol.bugreports.Main - HADOOP_CONF_DIR = /usr/hdp/current/hadoop-yarn-nodemanager/../hadoop/conf
>>>> 2017-10-24 10:50:19,971 INFO  com.bol.bugreports.Main - TEZ_CONF_DIR = /etc/tez/conf
>>>> 2017-10-24 10:50:19,973 INFO  com.bol.bugreports.Main - YARN_CONF_DIR = /usr/hdp/current/hadoop-yarn-nodemanager/../hadoop/conf
>>>> 2017-10-24 10:50:19,973 INFO  com.bol.bugreports.Main - LOG_DIRS = /var/log/hadoop-yarn/containers/application_1503304315746_0062/container_1503304315746_0062_01_000001
>>>> 2017-10-24 10:50:19,973 INFO  com.bol.bugreports.Main - HADOOP_YARN_HOME = /usr/hdp/2.3.4.0-3485/hadoop-yarn
>>>> 2017-10-24 10:50:19,974 INFO  com.bol.bugreports.Main - HADOOP_HOME = /usr/hdp/2.3.4.0-3485/hadoop
>>>> 2017-10-24 10:50:19,975 INFO  com.bol.bugreports.Main - HDP_VERSION = 2.3.4.0-3485
>>>>
>>>> And the classpath:
>>>>
>>>> 2017-10-24 10:50:19,977 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/nbasjes/appcache/application_1503304315746_0062/filecache/15/flink-hbase-connect-1.0-SNAPSHOT.jar
>>>> 2017-10-24 10:50:19,977 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/nbasjes/appcache/application_1503304315746_0062/filecache/13/lib/flink-dist_2.11-1.3.2.jar
>>>> 2017-10-24 10:50:19,977 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/nbasjes/appcache/application_1503304315746_0062/filecache/13/lib/flink-python_2.11-1.3.2.jar
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/nbasjes/appcache/application_1503304315746_0062/filecache/13/lib/flink-shaded-hadoop2-uber-1.3.2.jar
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/nbasjes/appcache/application_1503304315746_0062/filecache/13/lib/joda-time-2.9.1.jar
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/nbasjes/appcache/application_1503304315746_0062/filecache/13/lib/log4j-1.2.17.jar
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/nbasjes/appcache/application_1503304315746_0062/filecache/13/lib/slf4j-log4j12-1.7.7.jar
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/nbasjes/appcache/application_1503304315746_0062/filecache/14/log4j.properties
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/nbasjes/appcache/application_1503304315746_0062/filecache/11/logback.xml
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/nbasjes/appcache/application_1503304315746_0062/filecache/16/flink-dist_2.11-1.3.2.jar
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/nbasjes/appcache/application_1503304315746_0062/filecache/10/flink-conf.yaml
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/nbasjes/appcache/application_1503304315746_0062/container_1503304315746_0062_01_000001/
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/etc/hadoop/conf/
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/hdp/2.3.4.0-3485/hadoop/hadoop-nfs-2.7.1.2.3.4.0-3485.jar
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/hdp/2.3.4.0-3485/hadoop/hadoop-common-2.7.1.2.3.4.0-3485-tests.jar
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/hdp/2.3.4.0-3485/hadoop/hadoop-common-2.7.1.2.3.4.0-3485.jar
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/hdp/2.3.4.0-3485/hadoop/hadoop-annotations-2.7.1.2.3.4.0-3485.jar
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/hdp/2.3.4.0-3485/hadoop/hadoop-auth-2.7.1.2.3.4.0-3485.jar
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/hdp/2.3.4.0-3485/hadoop/hadoop-azure-2.7.1.2.3.4.0-3485.jar
>>>> 2017-10-24 10:50:19,978 INFO  com.bol.bugreports.Main - ----> ClassPath = file:/usr/hdp/2.3.4.0-3485/hadoop/hadoop-aws-2.7.1.2.3.4.0-3485.jar
>>>>
>>>>
>>>> So apparently everything HBase that was specified clientside is missing
>>>> when the task is running on my cluster.
>>>>
>>>> The thing is that when running for example a Pig script I get
>>>> everything perfectly fine on this cluster as it is configured right now.
>>>> Also the config 'shouldn't' (I think) need anything different because
>>>> this application only needs the HBase client (Jar, packaged into
>>>> application) and the HBase zookeeper settings (present on the machine where
>>>> it is started).
>>>>
>>>> Niels Basjes
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Oct 23, 2017 at 10:23 AM, Piotr Nowojski <
>>>> piotr@data-artisans.com> wrote:
>>>>
>>>>> Till, do you have some idea what is going on? I do not see any
>>>>> meaningful difference between Niels's code and HBaseWriteStreamExample.java.
>>>>> There is also a very similar issue on the mailing list: “Flink can't
>>>>> read hdfs namenode logical url”
>>>>>
>>>>> Piotrek
>>>>>
>>>>> On 22 Oct 2017, at 12:56, Niels Basjes <Niels@basjes.nl> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Yes, all nodes have the same /etc/hbase/conf/hbase-site.xml, which
>>>>> contains the correct settings for HBase to find ZooKeeper.
>>>>> That is why adding that file as an additional resource to the
>>>>> configuration works.
>>>>> I have created a very simple project that reproduces the problem on my
>>>>> setup:
>>>>> https://github.com/nielsbasjes/FlinkHBaseConnectProblem
>>>>>
>>>>> Niels Basjes
>>>>>
>>>>>
>>>>> On Fri, Oct 20, 2017 at 6:54 PM, Piotr Nowojski <
>>>>> piotr@data-artisans.com> wrote:
>>>>>
>>>>>> Is this /etc/hbase/conf/hbase-site.xml file present on all of the
>>>>>> machines? If yes, could you share your code?
>>>>>>
>>>>>> On 20 Oct 2017, at 16:29, Niels Basjes <Niels@basjes.nl> wrote:
>>>>>>
>>>>>> I look at the logfiles from the Hadoop Yarn web interface, i.e. I am
>>>>>> actually looking in the jobmanager.log of the container running the
>>>>>> Flink task.
>>>>>> That is where I was able to find these messages.
>>>>>>
>>>>>> I do the
>>>>>>  hbaseConfig.addResource(new Path("file:/etc/hbase/conf/hbase-site.xml"));
>>>>>> in all places directly after the HBaseConfiguration.create();
>>>>>> That way I simply force the task to look on the actual Hadoop node
>>>>>> for the same file it already loaded locally.
>>>>>>
>>>>>> The reason I'm suspecting Flink is that the client-side part of the
>>>>>> Flink application does have the right setting and the task/job actually
>>>>>> running in the cluster does not have the same settings.
>>>>>> So it seems that in the transition into the cluster the application does
>>>>>> not copy everything it has available locally for some reason.
>>>>>>
>>>>>> There is a very high probability I did something wrong, I'm just not
>>>>>> seeing it at this moment.
>>>>>>
>>>>>> Niels
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Oct 20, 2017 at 2:53 PM, Piotr Nowojski <
>>>>>> piotr@data-artisans.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> What do you mean by saying:
>>>>>>>
>>>>>>> When I open the logfiles on the Hadoop cluster I see this:
>>>>>>>
>>>>>>>
>>>>>>> The error doesn’t come from Flink? Where do you execute
>>>>>>>
>>>>>>> hbaseConfig.addResource(new Path("file:/etc/hbase/conf/hbase-site.xml"));
>>>>>>>
>>>>>>> ?
>>>>>>>
>>>>>>> To me it seems like it is a problem with misconfigured HBase and not
>>>>>>> something related to Flink.
>>>>>>>
>>>>>>> Piotrek
>>>>>>>
>>>>>>> On 20 Oct 2017, at 13:44, Niels Basjes <Niels@basjes.nl> wrote:
>>>>>>>
>>>>>>> To facilitate you guys helping me I put this test project on GitHub:
>>>>>>> https://github.com/nielsbasjes/FlinkHBaseConnectProblem
>>>>>>>
>>>>>>> Niels Basjes
>>>>>>>
>>>>>>> On Fri, Oct 20, 2017 at 1:32 PM, Niels Basjes <Niels@basjes.nl>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have a Flink 1.3.2 application that I want to run on a Hadoop
>>>>>>>> Yarn cluster where I need to connect to HBase.
>>>>>>>>
>>>>>>>> What I have:
>>>>>>>>
>>>>>>>> In my environment:
>>>>>>>> HADOOP_CONF_DIR=/etc/hadoop/conf/
>>>>>>>> HBASE_CONF_DIR=/etc/hbase/conf/
>>>>>>>> HIVE_CONF_DIR=/etc/hive/conf/
>>>>>>>> YARN_CONF_DIR=/etc/hadoop/conf/
>>>>>>>>
>>>>>>>> In /etc/hbase/conf/hbase-site.xml I have correctly defined the
>>>>>>>> zookeeper hosts for HBase.
>>>>>>>>
>>>>>>>> My test code is this:
>>>>>>>>
>>>>>>>> public class Main {
>>>>>>>>   private static final Logger LOG = LoggerFactory.getLogger(Main.class);
>>>>>>>>
>>>>>>>>   public static void main(String[] args) throws Exception {
>>>>>>>>     printZookeeperConfig();
>>>>>>>>     final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment().setParallelism(1);
>>>>>>>>     env.createInput(new HBaseSource()).print();
>>>>>>>>     env.execute("HBase config problem");
>>>>>>>>   }
>>>>>>>>
>>>>>>>>   public static void printZookeeperConfig() {
>>>>>>>>     String zookeeper = HBaseConfiguration.create().get("hbase.zookeeper.quorum");
>>>>>>>>     LOG.info("----> Loading HBaseConfiguration: Zookeeper = {}", zookeeper);
>>>>>>>>   }
>>>>>>>>
>>>>>>>>   public static class HBaseSource extends AbstractTableInputFormat<String> {
>>>>>>>>     @Override
>>>>>>>>     public void configure(org.apache.flink.configuration.Configuration parameters) {
>>>>>>>>       table = createTable();
>>>>>>>>       if (table != null) {
>>>>>>>>         scan = getScanner();
>>>>>>>>       }
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     private HTable createTable() {
>>>>>>>>       LOG.info("Initializing HBaseConfiguration");
>>>>>>>>       // Uses files found in the classpath
>>>>>>>>       org.apache.hadoop.conf.Configuration hConf = HBaseConfiguration.create();
>>>>>>>>       printZookeeperConfig();
>>>>>>>>
>>>>>>>>       try {
>>>>>>>>         return new HTable(hConf, getTableName());
>>>>>>>>       } catch (Exception e) {
>>>>>>>>         LOG.error("Error instantiating a new HTable instance", e);
>>>>>>>>       }
>>>>>>>>       return null;
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     @Override
>>>>>>>>     public String getTableName() {
>>>>>>>>       return "bugs:flink";
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     @Override
>>>>>>>>     protected String mapResultToOutType(Result result) {
>>>>>>>>       return new String(result.getFamilyMap("v".getBytes(UTF_8)).get("column".getBytes(UTF_8)));
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     @Override
>>>>>>>>     protected Scan getScanner() {
>>>>>>>>       return new Scan();
>>>>>>>>     }
>>>>>>>>   }
>>>>>>>>
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> I run this application with this command on my Yarn cluster (note:
>>>>>>>> first starting a yarn-cluster and then submitting the job yields the same
>>>>>>>> result).
>>>>>>>>
>>>>>>>> flink \
>>>>>>>>     run \
>>>>>>>>     -m yarn-cluster \
>>>>>>>>     --yarncontainer 1 \
>>>>>>>>     --yarnname "Flink on Yarn HBase problem" \
>>>>>>>>     --yarnslots                     1     \
>>>>>>>>     --yarnjobManagerMemory          4000  \
>>>>>>>>     --yarntaskManagerMemory         4000  \
>>>>>>>>     --yarnstreaming                       \
>>>>>>>>     target/flink-hbase-connect-1.0-SNAPSHOT.jar
>>>>>>>>
>>>>>>>> Now in the client side logfile /usr/local/flink-1.3.2/log/flink--client-80d2d21b10e0.log I see
>>>>>>>>
>>>>>>>> 1) Classpath actually contains /etc/hbase/conf/ both near the start and at the end.
>>>>>>>>
>>>>>>>> 2) The zookeeper settings of my experimental environment have been picked up by the software
>>>>>>>>
>>>>>>>> 2017-10-20 11:17:23,973 INFO  com.bol.bugreports.Main - ----> Loading HBaseConfiguration: Zookeeper = node1.kluster.local.nl.bol.com:2181,node2.kluster.local.nl.bol.com:2181,node3.kluster.local.nl.bol.com:2181
>>>>>>>>
>>>>>>>>
>>>>>>>> When I open the logfiles on the Hadoop cluster I see this:
>>>>>>>>
>>>>>>>> 2017-10-20 13:17:33,250 INFO  com.bol.bugreports.Main - ----> Loading HBaseConfiguration: Zookeeper = *localhost*
>>>>>>>>
>>>>>>>>
>>>>>>>> and as a consequence
>>>>>>>>
>>>>>>>> 2017-10-20 13:17:33,368 INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server localhost.localdomain/127.0.0.1:2181
>>>>>>>>
>>>>>>>> 2017-10-20 13:17:33,369 WARN  org.apache.zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>>>>>>>>
>>>>>>>> java.net.ConnectException: Connection refused
>>>>>>>>
>>>>>>>> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>>
>>>>>>>> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>>>>>>>>
>>>>>>>> 	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>>>>>>>>
>>>>>>>> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
>>>>>>>>
>>>>>>>> 2017-10-20 13:17:33,475 WARN  org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper - Possibly transient ZooKeeper, quorum=localhost:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The value 'localhost:2181' has been defined within the HBase jar in
>>>>>>>> the hbase-default.xml as the default value for the zookeeper nodes.
>>>>>>>> As a workaround I currently put this extra line in my code, which I
>>>>>>>> know is nasty but "works on my cluster"
>>>>>>>>
>>>>>>>> hbaseConfig.addResource(new Path("file:/etc/hbase/conf/hbase-site.xml"));
>>>>>>>>
>>>>>>>>
>>>>>>>> What am I doing wrong?
>>>>>>>>
>>>>>>>> What is the right way to fix this?
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best regards / Met vriendelijke groeten,
>>>>>>>>
>>>>>>>> Niels Basjes
>>>>>>>>


-- 
Best regards / Met vriendelijke groeten,

Niels Basjes
