Subject: from relational to bigger data
From: Jay Vee <jvsrvcs@gmail.com>
To: user@hadoop.apache.org
Date: Thu, 19 Dec 2013 13:35:29 -0700

We have a large relational database (~500 GB, hundreds of tables).

We have summary tables that we rebuild from scratch each night; the rebuild takes about 10 hours. From these summary tables, a web interface builds reports.

There is a business reason for doing a complete rebuild of the summary tables each night, and using views (in the sense of Oracle views) is not an option at this time.

If I wanted to leverage Big Data technologies to speed up the summary table rebuild, what would be the first step toward getting all of the data into some big data storage technology?

Ideally, in the end, we want to retain the summary tables in a relational database and have reporting work the same way without modifications. It's just the crunching of the data and the building of these relational summary tables where we need a significant performance increase.
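For concreteness, here is roughly the pipeline I imagine, sketched with Sqoop for the transfers and Hive for the crunching. This is only a guess at the shape of a first step; the connection string, credentials, and table names below are invented for illustration.

    # Hypothetical nightly job (Sqoop + Hive); all names below are made up.

    # 1. Pull a source table from the relational database into Hive.
    #    (Repeat per table, or drive it from a list of tables.)
    sqoop import \
      --connect jdbc:oracle:thin:@dbhost:1521:PROD \
      --username etl --password-file /user/etl/.dbpass \
      --table ORDERS \
      --hive-import --hive-table staging_orders \
      --num-mappers 8

    # 2. Rebuild a summary table in Hive, so the heavy aggregation runs as
    #    parallel MapReduce jobs rather than one long single-node SQL job.
    hive -e "
      INSERT OVERWRITE TABLE summary_daily_sales
      SELECT order_date, region, SUM(amount) AS total_amount
      FROM staging_orders
      GROUP BY order_date, region;
    "

    # 3. Export the finished summary back into the relational database so the
    #    existing reporting interface keeps reading the same tables unchanged.
    sqoop export \
      --connect jdbc:oracle:thin:@dbhost:1521:PROD \
      --username etl --password-file /user/etl/.dbpass \
      --table DAILY_SALES_SUMMARY \
      --export-dir /user/hive/warehouse/summary_daily_sales \
      --input-fields-terminated-by '\001'

Is that general shape (nightly import of the base tables, aggregation in Hive, export of the finished summaries back to the relational database) a reasonable first step, or would you structure it differently?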