spark-user mailing list archives

From Neeraj Garg02 <Neeraj_Gar...@infosys.com>
Subject YARN deployment of Spark and Thrift JDBC server
Date Tue, 14 Oct 2014 11:31:07 GMT
Hi All,

I've downloaded and installed Apache Spark 1.1.0 pre-built for Hadoop 2.4.

Now, I want to test two features of Spark:

1. YARN deployment: As I understand it, I need to add the settings listed at
http://spark.apache.org/docs/1.1.0/running-on-yarn.html#configuration to the
"spark-defaults.conf" file, for example spark.yarn.applicationMaster.waitTries.
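
To make this concrete, the entries I have in mind would look roughly like the sketch below.
The property names are taken from the configuration page linked above; the values are only
placeholders I have assumed, not settings I have verified on my cluster.

```properties
# conf/spark-defaults.conf (sketch; values are assumed placeholders)
spark.yarn.applicationMaster.waitTries   10
spark.yarn.submit.file.replication       3
spark.yarn.executor.memoryOverhead       384
```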

Once the configuration is done, the following command can be used to launch a Spark
application in yarn-cluster mode:
./bin/spark-submit --class path.to.your.Class --master yarn-cluster [options] <app jar> [app options]
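
As a concrete sketch of that command, this is roughly what I intend to run, using the SparkPi
example bundled with the distribution. The HADOOP_CONF_DIR path and the resource sizes are
assumptions on my part, and the script only executes spark-submit when it is run from the
Spark installation directory.

```shell
#!/bin/sh
# Sketch only: submit the bundled SparkPi example to YARN in yarn-cluster mode.
# Assumption: the Hadoop client configuration lives in /etc/hadoop/conf.
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"

SUBMIT="./bin/spark-submit"
APP_CLASS="org.apache.spark.examples.SparkPi"
MASTER="yarn-cluster"
# Assumed resource settings; adjust to the cluster.
OPTIONS="--num-executors 3 --driver-memory 4g --executor-memory 2g --executor-cores 1"
APP_JAR="lib/spark-examples*.jar"
APP_ARGS="10"

CMD="$SUBMIT --class $APP_CLASS --master $MASTER $OPTIONS $APP_JAR $APP_ARGS"
# Print the command first so it can be reviewed.
echo "$CMD"

# Run only when spark-submit is present (i.e. from the Spark install directory).
# $CMD is left unquoted on purpose so it splits into arguments and the jar glob expands.
if [ -x "$SUBMIT" ]; then
  $CMD
fi
```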

Is this understanding correct? If not, please suggest the steps to deploy Spark on YARN.


2. Testing the Thrift JDBC server connection: I have a Hadoop 2.4 cluster set up, and Apache
Spark is running on it. To test the Thrift JDBC server, I followed the steps in the "Other
SQL Interfaces" section of the Spark SQL programming guide, i.e. I can see the beeline prompt
and it is connected to the Thrift server using the given command. Please help me with the
following questions:

a. Which kinds of queries can I execute from this beeline prompt? Would these be Spark SQL
queries or Hive queries?

b. Hive is configured by placing a hive-site.xml file in conf/. Right now, I don't have Hive
installed as part of the Hadoop 2.4 cluster. Do I need to install Hive to test the Thrift
JDBC server or to execute Spark SQL queries from the beeline prompt?

   i. If a Hive installation is a prerequisite, do I need to rebuild the Spark package? What
   are the steps for this? Is internet access required for the rebuild?

c. What else would I need in order to connect BI tools to Spark SQL through the Thrift
JDBC/ODBC server? Please share the steps or pointers for doing so.
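
For reference on point 2, the connection I tested can be reproduced roughly as in this sketch.
It assumes the Thrift server was already started separately with ./sbin/start-thriftserver.sh
and is listening on the default localhost:10000; passing the URL with -u is my non-interactive
stand-in for the !connect command given in the guide.

```shell
#!/bin/sh
# Sketch only: connect beeline to an already running Spark Thrift JDBC server.
# Assumption: the server listens on the default host and port.
JDBC_URL="jdbc:hive2://localhost:10000"
BEELINE="./bin/beeline"

echo "Connecting to $JDBC_URL"

# Open the beeline session only when run from the Spark install directory.
if [ -x "$BEELINE" ]; then
  "$BEELINE" -u "$JDBC_URL"
fi
```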

I could not find sufficient information on these points, so any help would be appreciated.

Please let me know if more information or explanation is required.

Thanks and Regards,
Neeraj Garg

