hive-user mailing list archives

From Jörn Franke <jornfra...@gmail.com>
Subject Re: Hive on Spark - Hadoop 2 - Installation - Ubuntu
Date Fri, 20 Nov 2015 11:52:33 GMT
I recommend using a Hadoop distribution that already bundles these technologies. You would also
get other useful tools for your scenario, such as auditing with Sentry or Ranger.

> On 20 Nov 2015, at 10:48, Mich Talebzadeh <mich@peridale.co.uk> wrote:
> 
> Well
>  
> “I'm planning to deploy Hive on Spark but I can't find the installation steps. I tried
to read the official '[Hive on Spark][1]' guide but it has problems. For example, under
'Configuring Yarn' it says `yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`,
but does not say where I should set it. Also, as per the guide, configurations are set in the
Hive runtime shell, which is not permanent as far as I know.”
>  
> You can set that in the yarn-site.xml file, which is normally under $HADOOP_HOME/etc/hadoop.
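> For example, a minimal sketch (the property value is the one from the Hive on Spark guide; the path assumes a standard Apache Hadoop layout):
>  
> ```xml
> <!-- yarn-site.xml, typically $HADOOP_HOME/etc/hadoop/yarn-site.xml -->
> <property>
>   <name>yarn.resourcemanager.scheduler.class</name>
>   <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
> </property>
> ```
>  
> The ResourceManager needs a restart to pick this up. Likewise, settings you would otherwise type in the Hive shell (for example `hive.execution.engine=spark`) can be made permanent by putting them in hive-site.xml, so they survive across sessions.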
>  
>  
> HTH
>  
>  
>  
> Mich Talebzadeh
>  
> Sybase ASE 15 Gold Medal Award 2008
> A Winning Strategy: Running the most Critical Financial Data on ASE 15
> http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf
> Author of the books "A Practitioner’s Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7.
> co-author "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4
> Publications due shortly:
> Complex Event Processing in Heterogeneous Environments, ISBN: 978-0-9563693-3-8
> Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one out shortly
>  
> http://talebzadehmich.wordpress.com
>  
> NOTE: The information in this email is proprietary and confidential. This message is
for the designated recipient only; if you are not the intended recipient, you should destroy
it immediately. Any information in this message shall not be understood as given or endorsed
by Peridale Technology Ltd, its subsidiaries or their employees, unless expressly so stated.
It is the responsibility of the recipient to ensure that this email is virus free; neither
Peridale Ltd, its subsidiaries nor their employees accept any responsibility.
>  
> From: Dasun Hegoda [mailto:dasunhegoda@gmail.com] 
> Sent: 20 November 2015 09:36
> To: user@hive.apache.org
> Subject: Hive on Spark - Hadoop 2 - Installation - Ubuntu
>  
> Hi,
>  
> What I'm planning to do is develop a reporting platform using existing data. I have an
existing RDBMS with a large number of records, so I'm using the following stack. (http://stackoverflow.com/questions/33635234/hadoop-2-7-spark-hive-jasperreports-scoop-architecuture)
>  
>  - Sqoop - Extract data from RDBMS to Hadoop
>  - Hadoop - Storage platform -> *Deployment Completed*
>  - Hive - Data warehouse
>  - Spark - Real-time processing -> *Deployment Completed*
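>  
> For the Sqoop step, a typical import looks roughly like this (a sketch only; the JDBC URL, credentials, table name and target directory are placeholders, not values from this thread):
>  
> ```shell
> # Hypothetical example: import one table from an RDBMS into HDFS.
> # dbhost, sales, report_user, orders and the target dir are placeholders.
> sqoop import \
>   --connect jdbc:mysql://dbhost:3306/sales \
>   --username report_user -P \
>   --table orders \
>   --target-dir /user/hive/warehouse/staging/orders \
>   --num-mappers 4
> ```
>  
> The imported files can then be mapped to a Hive table for the warehouse layer.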
>  
> I'm planning to deploy Hive on Spark but I can't find the installation steps. I tried
to read the official '[Hive on Spark][1]' guide but it has problems. For example, under
'Configuring Yarn' it says `yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`,
but does not say where I should set it. Also, as per the guide, configurations are set in the
Hive runtime shell, which is not permanent as far as I know.
>  
> I also read [this][2], but it does not give any steps.
>  
> Could you please provide the steps to run Hive on Spark on Ubuntu as a production system?
>  
>  
>   [1]: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
>   [2]: http://stackoverflow.com/questions/26018306/how-to-configure-hive-to-use-spark
>  
> --
> Regards,
> Dasun Hegoda, Software Engineer  
> www.dasunhegoda.com | dasunhegoda@gmail.com
