kylin-issues mailing list archives

From "kangkaisen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KYLIN-2995) Set SparkContext.hadoopConfiguration to HadoopUtil in Spark Cubing
Date Thu, 07 Dec 2017 02:39:00 GMT

    [ https://issues.apache.org/jira/browse/KYLIN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16281254#comment-16281254
] 

kangkaisen commented on KYLIN-2995:
-----------------------------------

This is not about performance; it's a bug.

It is like the method {{bindCurrentConfiguration}} in {{KylinMapper}} and {{KylinReducer}}: every
MR job must call this method first, because we must ensure that we use {{context.getConfiguration()}}
for HDFS, not the default Configuration. The same applies in Spark.

For example, if the following config exists in the Kylin server's mountTable.xml but does not exist
in the DN node's mountTable.xml, then when the Kylin Spark job visits hdfs://XXXX/kylin, a {{FileNotFoundException}}
will be thrown.

{code:xml}
  <property>
    <name>fs.viewfs.mounttable.XXXX.link./kylin</name>
    <value>hdfs://XXXX/kylin</value>
  </property>
{code}
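
The binding described above can be illustrated with a minimal sketch. It is not Kylin's actual code: {{Conf}} is a plain map standing in for Hadoop's Configuration so the example runs without Hadoop on the classpath, and {{ConfHolder}}, {{setCurrent}}, {{getCurrent}} and {{resolveKylinLink}} are hypothetical names chosen for illustration. It only shows why a lookup that falls back to a freshly created configuration misses a mount-table entry that the job's own configuration carries.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfBindingSketch {

    // Stand-in for Hadoop's Configuration (assumption: a plain key/value map).
    static final class Conf {
        private final Map<String, String> props = new HashMap<>();

        Conf set(String key, String value) {
            props.put(key, value);
            return this;
        }

        String get(String key) {
            return props.get(key);
        }
    }

    // Hypothetical holder that keeps the job's configuration per thread.
    static final class ConfHolder {
        private static final ThreadLocal<Conf> BOUND = new ThreadLocal<>();

        static void setCurrent(Conf conf) {
            BOUND.set(conf);
        }

        static Conf getCurrent() {
            Conf bound = BOUND.get();
            // Without a bound configuration, fall back to a new one that
            // only knows the local node's defaults (empty in this sketch).
            return bound != null ? bound : new Conf();
        }

        static void clear() {
            BOUND.remove();
        }
    }

    static final String LINK_KEY = "fs.viewfs.mounttable.XXXX.link./kylin";

    // Looks up the mount link through whatever configuration is current.
    static String resolveKylinLink() {
        return ConfHolder.getCurrent().get(LINK_KEY);
    }

    public static void main(String[] args) {
        // Configuration shipped with the job: it contains the mount link.
        Conf jobConf = new Conf().set(LINK_KEY, "hdfs://XXXX/kylin");

        // Nothing bound yet: the fallback configuration has no such link.
        System.out.println("unbound: " + resolveKylinLink());

        // Bind the job's configuration first, as an MR or Spark task would.
        ConfHolder.setCurrent(jobConf);
        System.out.println("bound:   " + resolveKylinLink());

        ConfHolder.clear();
    }
}
```

In the real Spark job the bound object would be the configuration taken from the Spark context rather than a hand-built map; the sketch only mirrors the order of operations (bind first, then touch the file system).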


> Set SparkContext.hadoopConfiguration to HadoopUtil in Spark Cubing
> ------------------------------------------------------------------
>
>                 Key: KYLIN-2995
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2995
>             Project: Kylin
>          Issue Type: Bug
>          Components: Spark Engine
>    Affects Versions: v2.1.0
>            Reporter: kangkaisen
>            Assignee: kangkaisen
>         Attachments: KYLIN-2995.patch
>
>
> Currently, we load metadata from HDFS in SparkCubing ({{AbstractHadoopJob.loadKylinConfigFromHdfs}}),
> but HadoopUtil will use a new Configuration; we should use SparkContext.hadoopConfiguration.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
