Subject: from relational to bigger data
From: Jay Vee <jvsrvcs@gmail.com>
To: user@hadoop.apache.org
Date: Thu, 19 Dec 2013 13:35:29 -0700

We have a large relational database (~500 GB, hundreds of tables).

We have summary tables that we rebuild from scratch each night; the rebuild takes about 10 hours. From these summary tables, a web interface builds reports.

There is a business reason for doing a complete rebuild of the summary tables each night, and using views (in the sense of Oracle views) is not an option at this time.

If I wanted to leverage Big Data technologies to speed up the summary table rebuild, what would be the first step toward getting all of the data into some big data storage technology?

Ideally, in the end, we want to retain the summary tables in a relational database and have reporting work the same way without modifications. It's just the crunching of the data and the building of these relational summary tables where we need a significant performance increase.
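For concreteness, here is roughly the pipeline I imagine, sketched with Sqoop for the transfers and Hive for the crunching. This is only a guess at the shape of a first step; the connection string, credentials, and table names below are invented for illustration.

    # Hypothetical nightly job (Sqoop + Hive); all names below are made up.

    # 1. Pull a source table from the relational database into Hive.
    #    (Repeat per table, or drive it from a list of tables.)
    sqoop import \
      --connect jdbc:oracle:thin:@dbhost:1521:PROD \
      --username etl --password-file /user/etl/.dbpass \
      --table ORDERS \
      --hive-import --hive-table staging_orders \
      --num-mappers 8

    # 2. Rebuild a summary table in Hive, so the heavy aggregation runs as
    #    parallel MapReduce jobs rather than one long single-node SQL job.
    hive -e "
      INSERT OVERWRITE TABLE summary_daily_sales
      SELECT order_date, region, SUM(amount) AS total_amount
      FROM staging_orders
      GROUP BY order_date, region;
    "

    # 3. Export the finished summary back into the relational database so the
    #    existing reporting interface keeps reading the same tables unchanged.
    sqoop export \
      --connect jdbc:oracle:thin:@dbhost:1521:PROD \
      --username etl --password-file /user/etl/.dbpass \
      --table DAILY_SALES_SUMMARY \
      --export-dir /user/hive/warehouse/summary_daily_sales \
      --input-fields-terminated-by '\001'

Is that general shape (nightly import of the base tables, aggregation in Hive, export of the finished summaries back to the relational database) a reasonable first step, or would you structure it differently?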