kylin-dev mailing list archives

From Xiaoxiang Yu <xiaoxiang...@kyligence.io>
Subject Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera
Date Wed, 04 Sep 2019 03:39:32 GMT
Dear Gourav,
  Thank you for your update.

----------------
Best wishes,
Xiaoxiang Yu


From: Gourav Gupta <techgouravgupta@gmail.com>
Date: Wednesday, September 4, 2019, 00:09
To: Xiaoxiang Yu <xiaoxiang.yu@kyligence.io>, Wang rupeng <wangrupeng@live.cn>
Cc: "dev@kylin.apache.org" <dev@kylin.apache.org>
Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

Dear Xiaoxiang,

Thanks for the helpful reply. I have resolved all the issues and am now able to build a cube
in MapReduce mode. The last error, "FAILED: Execution Error, return code 3 from
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask", was resolved once I set
"hive.auto.convert.join = false" in kylin-hive-site.xml.
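
For readers who want to apply the same setting programmatically, here is a minimal sketch. It assumes the file uses the usual Hadoop-style layout (a <configuration> root holding <property> entries with <name> and <value>). The helper name set_hadoop_property and the sample XML are hypothetical and are not part of Kylin or Hive.

```python
import xml.etree.ElementTree as ET


def set_hadoop_property(xml_text: str, name: str, value: str) -> str:
    """Return xml_text with property `name` set to `value`, adding it if absent.

    Assumes a Hadoop-style <configuration> root. Note that ElementTree drops
    XML comments and the XML declaration when re-serializing.
    """
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # Property not present yet: append a new <property> entry.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample content; a real file will contain other properties.
sample = (
    "<configuration>"
    "<property><name>hive.exec.compress.output</name><value>true</value></property>"
    "</configuration>"
)
updated = set_hadoop_property(sample, "hive.auto.convert.join", "false")
print(updated)
```

With hive.auto.convert.join disabled, Hive does not convert joins into map joins, so the local task that was failing with return code 3 is not launched. Kylin would likely need the changed file in place before the next build.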

Thank you for the support; I appreciate the quick response from you and the Kylin team. I will
ask for your help again if I face any other issue when building a cube in Spark mode.

Best Regards,
Gourav Gupta

On Sun, Sep 1, 2019 at 10:54 AM Xiaoxiang Yu <xiaoxiang.yu@kyligence.io> wrote:
Hi friend,
  I am glad to hear that you have resolved some of the problems after a lot of effort, and it
is very kind of you to share what you found about kylin-port-replace-util.sh with us.
  It seems that you have run into another problem in the first step of your cube build, which
uses Hive to create a flat table. As far as I can see, the message you provided, "FAILED:
Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask",
indicates that your Hive is NOT configured correctly: your Hive command runs in local mode
rather than YARN mode. That is strange. Is the node on which you deployed Kylin configured
correctly? Maybe you should ask your Hadoop administrator for help. Or could you please
provide more detail about how you deployed Kylin?
  If you are using Kylin for the first time and are familiar with Docker, you can run a Docker
container for a technical preview. Please refer to http://kylin.apache.org/docs/install/kylin_docker.html.

----------------
Best wishes,
Xiaoxiang Yu


From: Gourav Gupta <techgouravgupta@gmail.com>
Date: Sunday, September 1, 2019, 01:24
To: Wang rupeng <wangrupeng@live.cn>, Xiaoxiang Yu <xiaoxiang.yu@kyligence.io>, "dev@kylin.apache.org" <dev@kylin.apache.org>
Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

Dear Wang and Xiaoxiang,
Thank you for the suggestions and solutions to the questions I raised in my previous mail.
Truly appreciated!

Following your answers, I changed the port number with "./$KYLIN_HOME/bin/kylin-port-replace-util.sh
set", but I was still facing the same issue. After hours of troubleshooting, I was able to
resolve it (not being able to access the Kylin UI). A Java application was already running on
port 9009, and Kylin uses three ports: 7070, 9009 and 7443. I was able to access the Kylin Web
UI once I stopped the process that was already running on 9009.
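
A port conflict like this can be checked before starting Kylin. The sketch below is one way to do it with the Python standard library; it assumes the three ports named in this message and a local check against 127.0.0.1. The function names port_in_use and busy_ports are hypothetical.

```python
import socket

# The three ports named in the message above.
KYLIN_PORTS = (7070, 9009, 7443)


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports=KYLIN_PORTS, host: str = "127.0.0.1"):
    """Return the subset of `ports` that already has a listener."""
    return [p for p in ports if port_in_use(p, host)]


if __name__ == "__main__":
    taken = busy_ports()
    if taken:
        print("Ports already in use:", taken)
    else:
        print("No listener found on", KYLIN_PORTS)
```

This only detects listeners reachable on the given address. A tool such as netstat or lsof is still needed to find out which process owns the port.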

I am now facing one more error, "FAILED: Execution Error, return code 3 from
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask", when I try to build a cube in MapReduce
mode. I searched for it and changed the Kylin and Hive properties as suggested at the following
link (https://stackoverflow.com/questions/22977790/hive-query-execution-error-return-code-3-from-mapredlocaltask),
but I am still not able to resolve it.

Please let me know if there is any way to resolve this issue. I am attaching a screenshot of
the error.

Thanks in advance.

Best Regards,
Gourav Gupta

On Fri, Aug 30, 2019 at 1:00 PM Wang rupeng <wangrupeng@live.cn> wrote:
Hi Gupta,
    You can change the Kylin port with the following command; the new port is 7070 plus the
number you set:
    ./$KYLIN_HOME/bin/kylin-port-replace-util.sh set <number>
    If the Kylin web UI cannot be opened, check the Kylin log at $KYLIN_HOME/logs/kylin.log
for more details.
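
As a small worked example of the offset rule above, the sketch below computes the resulting web port. It only models the statement in this message (new port = 7070 + number). The function name kylin_http_port is hypothetical, and the range check is an added assumption, not something the script is known to enforce.

```python
DEFAULT_HTTP_PORT = 7070  # Kylin's default web port, as stated above


def kylin_http_port(offset: int) -> int:
    """Return the web port after `kylin-port-replace-util.sh set <offset>`."""
    port = DEFAULT_HTTP_PORT + offset
    # Assumption: reject results outside the valid TCP port range.
    if not 1 <= port <= 65535:
        raise ValueError(f"offset {offset} gives invalid port {port}")
    return port


print(kylin_http_port(10))  # 7080
```

So after "set 10" the web UI would be expected at http://<host>:7080/kylin rather than on port 7070.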
Here are some suggestions regarding your questions:
    1. You need to set the environment variable SPARK_HOME=/local/path/to/spark so that Kylin
can start successfully, even if you don't use Spark to build cubes. It is best to use the
suggested version of Spark (spark-2.3.2), which you can download with
./$KYLIN_HOME/bin/download-spark.sh.
    2. The CDH versions supported by Kylin are cdh5.7+, cdh6.0 and cdh6.1, and you don't have
to worry about the HBase version if you are using CDH. If you are using cdh5.16, you can
download apache-kylin-<version>-bin-cdh57.tar.gz from http://kylin.apache.org/download/
    3. You don't have to install Kylin on the master node; any other node in the cluster
will work.

-------------------
Best wishes,
Rupeng Wang


From: Gourav Gupta <techgouravgupta@gmail.com>
Date: Friday, August 30, 2019, 02:03
To: Wang rupeng <wangrupeng@live.cn>
Cc: "dev@kylin.apache.org" <dev@kylin.apache.org>
Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

Thanks a lot, Wang, for the prompt and helpful reply. Today I removed the old version of Kylin
and successfully installed Apache Kylin 2.6 for CDH, but now we are unable to open the Kylin
Web UI. I changed port 7070 to another number in server.xml (in the Tomcat directory), but I
am still facing the same issue.

I have some questions about configuring Kylin:

1. Should I set the path to Spark on the Spark master node, or the path to the Spark that
ships with Kylin?
2. Which tar file is suitable for Cloudera 5.16? What is the purpose of the Kylin-HBase version?
3. Do I need to install and configure Kylin on the master node? Will an installation on an
edge node work?

We are trying to switch our visualization layer from a SQL (OLAP) - Power BI pipeline to a
Kylin - MEAN stack (open source/enterprise version), so your help is much appreciated.

I look forward to your response.


Regards,
Gourav Gupta

On Thu, Aug 29, 2019 at 5:43 PM Wang rupeng <wangrupeng@live.cn> wrote:
Hi,
    The problem seems to be the following:
    "60505 [dispatcher-event-loop-6] ERROR  org.apache.spark.scheduler.cluster.YarnScheduler
 - Lost executor 1 on *********: Container marked as failed:"
This usually happens when YARN does not have enough memory, so the YARN container is killed.
You can go to the YARN ResourceManager web page to see more details in the YARN log.
    If it is a memory issue, you can try to allocate more memory to the Spark YARN executor
by changing the following configuration item in "$KYLIN_HOME/conf/kylin.properties":
    kylin.engine.spark-conf.spark.yarn.executor.memoryOverhead=384
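
To see how this setting affects what YARN is asked for, here is a rough sketch of the arithmetic. It assumes the documented Spark 2.x default for the overhead (10% of executor memory, with a floor of 384 MiB) and ignores any rounding YARN applies to the request. The function names are hypothetical, and 4096 MiB corresponds to the spark.executor.memory=4G seen in the submitted command.

```python
MIN_OVERHEAD_MIB = 384   # Spark 2.x floor for the YARN memory overhead
OVERHEAD_FACTOR = 0.10   # Spark 2.x default fraction of executor memory


def default_overhead_mib(executor_memory_mib: int) -> int:
    """Overhead Spark would use when memoryOverhead is not set."""
    return max(int(executor_memory_mib * OVERHEAD_FACTOR), MIN_OVERHEAD_MIB)


def container_request_mib(executor_memory_mib: int, overhead_mib=None) -> int:
    """Memory requested per executor container: heap plus overhead."""
    if overhead_mib is None:
        overhead_mib = default_overhead_mib(executor_memory_mib)
    return executor_memory_mib + overhead_mib


print(container_request_mib(4096))        # 4505 (default overhead of 409 MiB)
print(container_request_mib(4096, 384))   # 4480
print(container_request_mib(4096, 1024))  # 5120
```

Under these assumptions, a 4G executor already gets about 409 MiB of overhead by default, so a value above that (1024 here is only an illustrative number) may be needed before the setting gives the container more headroom.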


-------------------
Best wishes,
Rupeng Wang


On 2019/8/29 at 14:57, "Gourav Gupta" <techgouravgupta@gmail.com> wrote:

    Hi Sir,

    I have installed and configured Apache Kylin 2.4 on the Cloudera platform to build
    cubes.

    I have been able to build a cube in MapReduce mode, but I get the error below when
    the build runs in Spark mode. I have followed all the steps and tried many remedies
    while debugging the problem.

    Please let me know how to resolve this. Thanks in advance.





    1091 [main] ERROR org.apache.spark.SparkContext  - Error adding jar
    (java.lang.IllegalArgumentException: requirement failed: JAR
    kylin-job-2.4.0.jar already registered.), was the --addJars option used?

    [Stage 0:>                                                          (0 + 0)
    / 2]
    [Stage 0:>                                                          (0 + 2)
    / 2]


    60505 [dispatcher-event-loop-6] ERROR org.apache.spark.scheduler.cluster.YarnScheduler  - Lost executor 1 on *********: Container marked as failed: container_e62_1566915974858_6628_01_000003 on host: *******. Exit status: 50. Diagnostics: Exception from container-launch.
    Container id: container_e62_1566915974858_6628_01_000003
    Exit code: 50
    Stack trace: ExitCodeException exitCode=50:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:604)
    at org.apache.hadoop.util.Shell.run(Shell.java:507)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:789)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)


    Container exited with a non-zero exit code 50

    82664 [dispatcher-event-loop-5] ERROR org.apache.spark.scheduler.cluster.YarnScheduler  - Lost executor 2 on *******: Container marked as failed: container_e62_1566915974858_6628_01_000004 on host: *******. Exit status: 50. Diagnostics: Exception from container-launch.
    Container id: container_e62_1566915974858_6628_01_000004
    Exit code: 50
    Stack trace: ExitCodeException exitCode=50:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:604)
    at org.apache.hadoop.util.Shell.run(Shell.java:507)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:789)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)


    Container exited with a non-zero exit code 50


    The command is:
    export HADOOP_CONF_DIR=/etc/hadoop/conf && /usr/lib/spark/bin/spark-submit \
      --class org.apache.kylin.common.util.SparkEntry \
      --conf spark.executor.instances=1 \
      --conf spark.yarn.archive=hdfs://namenode:8020/kylin/spark/spark-libs.jar \
      --conf spark.yarn.queue=default \
      --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=current \
      --conf spark.history.fs.logDirectory=hdfs:///kylin/spark-history \
      --conf spark.driver.extraJavaOptions=-Dhdp.version=current \
      --conf spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec \
      --conf spark.master=yarn \
      --conf spark.executor.extraJavaOptions=-Dhdp.version=current \
      --conf spark.hadoop.yarn.timeline-service.enabled=false \
      --conf spark.executor.memory=4G \
      --conf spark.eventLog.enabled=true \
      --conf spark.eventLog.dir=hdfs:///kylin/spark-history \
      --conf spark.executor.cores=2 \
      --conf spark.submit.deployMode=cluster \
      --jars /opt/apache-kylin-2.4.0-bin-cdh57/lib/kylin-job-2.4.0.jar \
      /opt/apache-kylin-2.4.0-bin-cdh57/lib/kylin-job-2.4.0.jar \
      -className org.apache.kylin.engine.spark.SparkCubingByLayer \
      -hiveTable default.kylin_intermediate_kylin_sales_cube_c1526d16_9719_4dec_be41_346f43654e02 \
      -input hdfs://nameservice1/kylin/kylin_metadata/kylin-2159d40b-f14e-4500-af95-1fbfd5a4073f/kylin_intermediate_kylin_sales_cube_c1526d16_9719_4dec_be41_346f43654e02 \
      -segmentId c1526d16-9719-4dec-be41-346f43654e02 \
      -metaUrl kylin_metadata@hdfs,path=hdfs://nameservice1/kylin/kylin_metadata/kylin-2159d40b-f14e-4500-af95-1fbfd5a4073f/kylin_sales_cube/metadata \
      -output hdfs://nameservice1/kylin/kylin_metadata/kylin-2159d40b-f14e-4500-af95-1fbfd5a4073f/kylin_sales_cube/cuboid/ \
      -cubename kylin_sales_cube