kylin-user mailing list archives

From Phil Scott <phil.sc...@pricespider.com>
Subject RE: Kylin 2.3.1 failing cube build on HDFS High-Availability cluster
Date Tue, 13 Nov 2018 17:56:51 GMT
Thanks Lijun,

I found an answer that fixed my problem.  Ambari's HDFS HA setup had correctly reconfigured
all of the core Hortonworks-packaged modules (including Hive) to work in HA mode.  As a result,
Hive would no longer accept the hdfs://<hostname>:<port>/ syntax and was expecting the
hdfs://<ha_cluster_name>/ format.

The issue was that the Ambari 2.6 platform sets only the “fs.defaultFS” property in core-site.xml,
which now points to my new HA cluster “hdfs://bdp01”.  Kylin, however, was apparently still
looking for the older “fs.default.name” property (the name Hadoop has since deprecated in favor
of “fs.defaultFS”) to hold this setting.  So, I went into the Ambari HDFS advanced configs and,
under the “Custom core-site” section, added a custom property, “fs.default.name=hdfs://bdp01”,
then rolled out the configs via Ambari.  Kylin was then able to pick up the setting from the
updated core-site.xml file in ${KYLIN_HOME}/Hadoop-conf/core-site.xml (a symbolic link to the
Ambari-configured file).  Once this was done, Kylin was once again able to build cubes!
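
For anyone checking the same thing, here is a minimal sketch (not anything from Kylin itself) that reads a core-site.xml and reports which of the two filesystem keys are set.  The sample XML and the helper name read_fs_settings are assumptions for illustration; in practice the text would come from the core-site.xml file that Kylin actually reads.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample that mirrors the situation described above:
# only fs.defaultFS is present, fs.default.name is not.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://bdp01</value>
  </property>
</configuration>"""


def read_fs_settings(core_site_xml: str) -> dict:
    """Return the values of both filesystem keys, or None where a key is absent."""
    root = ET.fromstring(core_site_xml)
    found = {"fs.defaultFS": None, "fs.default.name": None}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name in found:
            found[name] = (prop.findtext("value") or "").strip()
    return found


settings = read_fs_settings(SAMPLE_CORE_SITE)
missing = [key for key, value in settings.items() if value is None]
print(settings)
print("missing:", missing)
```

If “fs.default.name” shows up as missing, that matches the state my cluster was in before I added the custom property.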

-Phil

From: Lijun Cao <641507577@qq.com>
Sent: Monday, November 12, 2018 6:46 PM
To: user@kylin.apache.org
Subject: Re: Kylin 2.3.1 failing cube build on HDFS High-Availability cluster

Hi Phil,

How is your HBase cluster deployed? Is it deployed as a standalone cluster?

Here is a blog post that covers the NameNode HA settings (http://kylin.apache.org/blog/2016/06/10/standalone-hbase-cluster/).
Note that its scenario is an HBase cluster deployed as a standalone cluster.

See if it helps.

Best Regards

Lijun Cao


On Nov 13, 2018, at 02:39, Phil Scott <phil.scott@pricespider.com> wrote:

Folks,

My real question:  Are there any settings in kylin.properties, or in hdfs-site.xml or
hive-site.xml, that can clue Kylin in to the required syntax for HA HDFS URLs?

Background:

I have been running Kylin 2.3.1 for almost a year (very happily), on a Hortonworks HDP 2.6 cluster.
This weekend, my HDFS NameNode had an issue and went down.  I decided to upgrade it to HDFS
High Availability mode.
See: https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.2.2/bk_ambari-operations/content/how_to_configure_namenode_high_availability.html
for details.

My HDFS cluster is now operating in HA mode, and now my Kylin cube builds are failing on step
1.  They had been working fine up until this change.

Once in HA mode, HDFS clients are supposed to recognize from the hdfs-site.xml file that
HA mode is enabled, and use a different syntax for HDFS URLs.  For example,
in the logs for cube-build step 1, Kylin is trying to tell Hive to create an external table
and map its “location” to an HDFS location, using the old NameNode’s hostname directly
(like this…)

(**** CREATE EXTERNAL TABLE code snipped out above ***)

STORED AS SEQUENCEFILE
LOCATION 'hdfs://pschd01.internaldomain.com:8020/kylin/kylin_metadata/kylin-51b56b1a-0f95-4825-ab13-d23a5ccb90ee/kylin_intermediate_ereputationv2_reviews_distinct_v2_prod_cube_453e6583_b7fb_4e62_8ffc_a330bb4e246f';

(*** ALTER TABLE command comes next ***)


In the above, the ‘hdfs://pschd01.internaldomain.com:8020/’
address is directly addressing the old HDFS NameNode.  This throws an error as follows:

Failed with exception Wrong FS: hdfs://pschd01.internaldomain.com:8020/kylin/kylin_metadata/kylin-51b56b1a-0f95-4825-ab13-d23a5ccb90ee/kylin_intermediate_ereputationv2_reviews_distinct_v2_prod_cube_453e6583_b7fb_4e62_8ffc_a330bb4e246f/.hive-staging_hive_2018-11-12_01-22-26_487_4731507377031334971-1/-ext-10000,
expected: hdfs://bdp01
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask


So, Hive is complaining that it expects the new HA syntax, hdfs://<ha_service_name>/,
instead of hdfs://<namenode_host>:<namenode_port>/

It looks like Kylin is generating Hive statements that use the old NameNode host syntax, but
needs to somehow be configured to use the new HDFS HA syntax.
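
To make the two URL forms concrete, here is a rough sketch of how one might tell them apart and rewrite one into the other.  The function names are hypothetical, and the "no explicit port means HA nameservice" rule is only a heuristic I am assuming for illustration; it is not how Hive or Kylin decide.

```python
from urllib.parse import urlparse


def is_nameservice_uri(uri: str) -> bool:
    """Heuristic: an hdfs:// URI whose authority carries no explicit port."""
    parsed = urlparse(uri)
    return (
        parsed.scheme == "hdfs"
        and parsed.hostname is not None
        and parsed.port is None
    )


def to_nameservice_uri(uri: str, nameservice: str) -> str:
    """Swap the host:port authority for the HA nameservice, keeping the path."""
    return urlparse(uri)._replace(netloc=nameservice).geturl()


old_style = "hdfs://pschd01.internaldomain.com:8020/kylin/kylin_metadata"
new_style = to_nameservice_uri(old_style, "bdp01")
print(is_nameservice_uri(old_style), is_nameservice_uri(new_style))
print(new_style)
```

The rewritten value is the form Hive reports as "expected" in the error above.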

I appreciate any help!!!

-Phil

