hive-user mailing list archives

From Ashok Kumar <ashok34...@yahoo.com>
Subject Re: The advantages of Hive/Hadoop compared to Data Warehouse
Date Sat, 19 Dec 2015 00:42:41 GMT
Hi,
Thanks for the info. I understand that ELT (Extract, Load, Transform) is more appropriate for big
data than traditional ETL. What are its major advantages in the Big Data space?
For example, if I start using Sqoop to pull data from traditional transactional and Data Warehouse
databases and create the same tables in Hive, what would be the next step towards a consolidated
data model in Hive on HDFS? The entry tables will be tabular tables in line with the source, correct? How
many ELT steps generally need to be applied to get to the final model? Will ELT speed up this process?
I understand this is a very broad question. However, any comments will be welcome.
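
A minimal sketch of the ELT pattern asked about here, assuming nothing about the actual sources: the rows are loaded unchanged into a staging table that mirrors the source layout, and the consolidated table is then built with SQL inside the engine. SQLite (Python's built-in sqlite3 module) stands in for Hive purely so the example runs anywhere; the table names stg_sales and sales_by_customer, the columns, and the function run_elt are hypothetical.

```python
import sqlite3


def run_elt(rows):
    """Load source rows as-is, then transform them inside the engine."""
    con = sqlite3.connect(":memory:")

    # Extract + Load: the staging table mirrors the source layout,
    # roughly what a plain import of a source table would give you.
    con.execute(
        "CREATE TABLE stg_sales (sale_id INTEGER, customer TEXT, amount REAL)"
    )
    con.executemany("INSERT INTO stg_sales VALUES (?, ?, ?)", rows)

    # Transform: build the consolidated table with SQL after loading,
    # instead of reshaping the data before it reaches the engine.
    con.execute(
        "CREATE TABLE sales_by_customer AS "
        "SELECT customer, SUM(amount) AS total_amount, COUNT(*) AS n_sales "
        "FROM stg_sales GROUP BY customer"
    )

    result = con.execute(
        "SELECT customer, total_amount, n_sales "
        "FROM sales_by_customer ORDER BY customer"
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    sample = [(1, "acme", 10.0), (2, "acme", 5.0), (3, "zen", 2.5)]
    print(run_elt(sample))
```

In Hive the transform step would typically be one or more CREATE TABLE ... AS SELECT or INSERT ... SELECT statements over the imported tables; how many such steps are needed depends on the target model.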
Regards
 

    On Friday, 18 December 2015, 22:27, Jörn Franke <jornfranke@gmail.com> wrote:
 

 I think you should draw more attention to the fact that Hive is just one component in the ecosystem.
You can have many more components, such as ELT, integration of unstructured data, machine learning,
streaming data etc. However, analysts are usually not aware of the technologies, and IT staff
are often not aware of how these technologies can bring benefits to a specific business domain. You could explore
the potential together in workshops, design thinking sessions etc. Once you know more details and both
sides have decided on potential ways forward, you can start doing PoCs and see what works and what
does not. It is important that you break the old ties created by more traditional data warehouse approaches
in the past and go beyond the comfort zone.
On 18 Dec 2015, at 22:01, Ashok Kumar <ashok34668@yahoo.com> wrote:


 Gurus,
Some analysts keep asking me about the advantages of having Hive tables when the star schema in
the Data Warehouse (DW) does the same.
For example, if you have fact and dimension tables in the DW and just import them into Hive via,
say, Sqoop, what are we going to gain?
I keep telling them about storage economy and cheap disks, that de-normalisation can be taken further,
etc. However, they are not convinced :(
Any additional comments will help my case.
Thanks a lot


  