When I try to build a new cube segment with the Spark engine (on a YARN cluster), I always get a "fail to locate kylin.properties" error (KylinConfig.java, line 260) in the step "Build Cube with Spark". After some debugging, it turns out that the Spark executor tries to load the Kylin properties file from its local $KYLIN_HOME directory, which does not exist on any node of our YARN cluster. So I assume the Kylin environment (e.g. KYLIN_HOME, all property files, etc.) has to be set up on every node in the YARN cluster beforehand. Is that true?
Here is what I did to work around the problem. I put the kylin.properties file in the Spark client's config directory, so that it is uploaded to every executor's working directory as soon as the Spark task starts. Then I changed the Kylin source code a bit: if the executor can't find kylin.properties in the local $KYLIN_HOME, it tries to load the file from the executor's working directory instead. After these changes, the job runs without errors.
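To make the fallback concrete, here is a minimal sketch of the lookup order described above. It is not the actual patch to KylinConfig: the class name `KylinPropertiesFallback`, the method names, and the `conf` subdirectory under $KYLIN_HOME are assumptions made for illustration only.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical helper, not part of Kylin: illustrates the lookup order only.
class KylinPropertiesFallback {

    static final String FILE_NAME = "kylin.properties";

    // Returns the first kylin.properties found, or null if there is none.
    // Order: 1) <kylinHome>/conf (assumed layout), 2) the given working directory.
    static File locate(String kylinHome, String workingDir) {
        if (kylinHome != null) {
            File inHome = new File(new File(kylinHome, "conf"), FILE_NAME);
            if (inHome.isFile()) {
                return inHome;
            }
        }
        // Fallback: the directory the executor container was started in,
        // where the uploaded file is expected to land.
        File inWorkDir = new File(workingDir, FILE_NAME);
        return inWorkDir.isFile() ? inWorkDir : null;
    }

    // Reads the located file into a java.util.Properties object.
    static Properties load(File file) throws IOException {
        Properties props = new Properties();
        try (InputStream in = new FileInputStream(file)) {
            props.load(in);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        File found = locate(System.getenv("KYLIN_HOME"), System.getProperty("user.dir"));
        if (found == null) {
            System.err.println("fail to locate " + FILE_NAME);
            return;
        }
        System.out.println("Loaded " + load(found).size() + " properties from " + found);
    }
}
```

On an executor, `System.getProperty("user.dir")` would normally point at the container's working directory, which is why the sketch uses it as the fallback location.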
So my question is: is Kylin designed to have its whole environment set up on the YARN cluster? Did I miss anything? Or would it be better to upload the Kylin properties file to the YARN cluster only at run time? What do you guys think? Thanks!