hive-user mailing list archives

From Mich Talebzadeh <>
Subject Need some clarification on this diagram of mine depicting Hive on Spark engine in yarn-client mode
Date Fri, 27 May 2016 17:11:35 GMT
Hi all,

I would appreciate any comments on this DFD diagram that I created to
describe Hive running on the Spark engine in yarn-client mode.


Dr Mich Talebzadeh


1) The client, in this case a user, starts a Hive session (CLI or Beeline),
which starts a Spark app

2) The Spark app deploys in yarn-client mode and requests resources from the
YARN Resource Manager

2.a) The Resource Manager initialises and registers an Application Master

2.b) The Application Master starts the Spark GUI on port 4040. Is this done
by the Spark app or by the Application Master?

3) The Application Master notifies the Spark application

4) Application Master Executor Launcher Resource Tracker (A JVM)

5) An executor is created to launch a job on the worker node

5) The Application Master starts SparkSubmit (a JVM)

6) SparkSubmit spawns SparkSubmitDriverBootstrapper (a JVM)

6) A task kicks off the Scheduler Backend

7) The executor also starts the Resource Monitor

8) The Scheduler Backend spawns CoarseGrainedExecutorBackend (a JVM)
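
For reference, the session setup behind steps 1 and 2 can be sketched as
follows. This is only a minimal sketch, assuming the usual Hive on Spark
properties hive.execution.engine and spark.master are what select the engine
and the deployment mode. The helper name hive_on_spark_settings is
hypothetical; it only builds the "set" statements one would type in the CLI
or Beeline and does not talk to Hive or YARN.

```python
def hive_on_spark_settings(master="yarn-client"):
    """Build the Hive `set` statements that select the Spark engine.

    Hypothetical helper for illustration only: it returns strings and
    does not connect to Hive, Spark or YARN.
    """
    settings = {
        "hive.execution.engine": "spark",  # run Hive queries on Spark
        "spark.master": master,            # deployment mode from step 2
    }
    return [f"set {key}={value};" for key, value in settings.items()]


# Print the statements in the order they would be entered in the session.
for statement in hive_on_spark_settings():
    print(statement)
```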

The resource monitor updates the node manager on the status and resource
utilisation. These are recorded in yarn-<USER>-nodemanager-<HOST>.log. The
status is also updated in the GUI.
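
To follow one application through that log, something like the sketch below
may help. It assumes only the file name pattern given above and that the
relevant lines mention the YARN application ID as plain text. The function
names, and the user and host values, are hypothetical.

```python
def nodemanager_log_name(user, host):
    """Fill in the yarn-<USER>-nodemanager-<HOST>.log pattern."""
    return f"yarn-{user}-nodemanager-{host}.log"


def lines_for_application(lines, app_id):
    """Keep only the log lines that mention the given application ID."""
    return [line for line in lines if app_id in line]


# Hypothetical user and host, used only to show the resulting file name.
print(nodemanager_log_name("alice", "node1"))
```

With the file opened, lines_for_application(open(path), app_id) would return
the entries for a single application, which makes it easier to line the log
up against the numbered steps.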

I think some parts are really confusing. I have been through the YARN
resource manager and node manager logs to build up the flow, and have also
looked at hive.log.
