hadoop-common-user mailing list archives

From "小网客" <smallnetvisi...@foxmail.com>
Subject Re:from relational to bigger data
Date Fri, 20 Dec 2013 14:09:44 GMT
Write an InputFormat, then use a MapReduce job to process your relational data.
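A minimal sketch of that suggestion, not the poster's actual code: Hadoop already ships `DBInputFormat`, which reads table rows over JDBC, so a hand-written InputFormat is only needed when JDBC access does not fit. The table name (`orders`), column names (`customer`, `amount`), JDBC URL, and credentials below are all hypothetical placeholders.

```java
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBInputFormat;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SummaryRebuild {

    // One row of the hypothetical "orders" table.
    public static class OrderRecord implements DBWritable {
        String customer;
        double amount;

        public void readFields(ResultSet rs) throws SQLException {
            customer = rs.getString(1);
            amount = rs.getDouble(2);
        }

        public void write(PreparedStatement ps) throws SQLException {
            ps.setString(1, customer);
            ps.setDouble(2, amount);
        }
    }

    // Emit (customer, amount) for every input row.
    public static class SumMapper
            extends Mapper<LongWritable, OrderRecord, Text, DoubleWritable> {
        protected void map(LongWritable key, OrderRecord row, Context ctx)
                throws IOException, InterruptedException {
            ctx.write(new Text(row.customer), new DoubleWritable(row.amount));
        }
    }

    // Sum the amounts per customer: one summary row per key.
    public static class SumReducer
            extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
        protected void reduce(Text key, Iterable<DoubleWritable> vals, Context ctx)
                throws IOException, InterruptedException {
            double total = 0;
            for (DoubleWritable v : vals) {
                total += v.get();
            }
            ctx.write(key, new DoubleWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical JDBC connection details.
        DBConfiguration.configureDB(conf, "com.mysql.jdbc.Driver",
                "jdbc:mysql://dbhost/sales", "user", "password");

        Job job = Job.getInstance(conf, "summary-rebuild");
        job.setJarByClass(SummaryRebuild.class);
        job.setMapperClass(SumMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(DoubleWritable.class);
        job.setInputFormatClass(DBInputFormat.class);

        // Read "orders", ordered by "customer", pulling two columns.
        DBInputFormat.setInput(job, OrderRecord.class,
                "orders", null, "customer", "customer", "amount");
        FileOutputFormat.setOutputPath(job, new Path(args[0]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The job output would then be bulk-loaded back into the relational summary tables, so the reporting layer stays unchanged. This is a driver/config fragment that needs a Hadoop cluster and the JDBC driver on the classpath to run.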


------------------
Best Wishes!
小网客
Blog: http://snv.iteye.com/
Email: 1134687142@qq.com

------------------ Original ------------------
From: "Jay Vee" <jvsrvcs@gmail.com>
Date: Fri, Dec 20, 2013 04:35 AM
To: "user" <user@hadoop.apache.org>
Subject: from relational to bigger data

We have a large relational database (~500 GB, hundreds of tables).

We have summary tables that we rebuild from scratch each night; the rebuild takes about 10 hours.
 
A web interface accesses these summary tables to build reports.
 
There is a business reason for doing a complete rebuild of the summary tables each night, and using views (in the sense of Oracle views) is not an option at this time.

If I wanted to leverage Big Data technologies to speed up the summary table rebuild, what would be the first step toward getting all of the data into some big data storage technology?
 
Ideally, in the end, we want to retain the summary tables in a relational database and have reporting work the same without modifications.

It's just the crunching of the data and the building of these relational summary tables where we need a significant performance increase.