Date: Tue, 16 Feb 2016 16:50:58 +0530
Subject: Re: Is it ok to build an entire ETL/ELT data flow using HIVE queries?
From: Devopam Mittra
To: "user@hive.apache.org"

+1 for all suggestions provided already.

I have personally used Talend Big Data Studio in conjunction with Hive + Cron/Autosys to build and manage a small DW. I found it easy to build and deploy rapidly. It also helps with email integration etc., which was my custom requirement (spool a few reports and share them via email at routine intervals).

regards
Dev

On Tue, Feb 16, 2016 at 4:10 PM, Elliot West <teabot@gmail.com> wrote:

> I'd say that so long as you can achieve a similar quality of engineering
> as is possible with other software development domains, then 'yes, it is
> ok'.
>
> Specifically, our Hive projects are packaged as RPMs, built and released
> with Maven, covered by suites of unit tests developed with HiveRunner, and
> part of the same Jenkins CI process as other Java based projects.
> Decomposing large processes into sensible units is not as easy as with
> other frameworks so this may require more thought and care.
>
> More information here:
> https://cwiki.apache.org/confluence/display/Hive/Unit+testing+HQL
>
> You have many potential alternatives depending on which languages you are
> comfortable using: Pig, Flink, Cascading, Spark, Crunch, Scrunch, Scalding,
> etc.
>
> Elliot.
>
>
> On Tuesday, 16 February 2016, Ramasubramanian <
> ramasubramanian.narayanan@gmail.com> wrote:
>
>> Hi,
>>
>> Is it ok to build an entire ETL/ELT data flow using HIVE queries?
>>
>> Data is stored in HIVE. We have transactional and reference data. We need
>> to build a small warehouse.
>>
>> Need suggestion on alternatives too.
>>
>> Regards,
>> Rams
>

--
Devopam Mittra
Life and Relations are not binary