hive-user mailing list archives

From Devopam Mittra <devo...@gmail.com>
Subject Re: Is it ok to build an entire ETL/ELT data flow using HIVE queries?
Date Tue, 16 Feb 2016 11:20:58 GMT
+1 for all suggestions provided already.

I have personally used Talend Big Data Studio in conjunction with Hive +
Cron/Autosys to build and manage a small DW.
I found it easy to build and deploy rapidly. It also helps with email
integration etc., which was a custom requirement of mine (spool a few reports
and share them via email at routine intervals).

regards
Dev

On Tue, Feb 16, 2016 at 4:10 PM, Elliot West <teabot@gmail.com> wrote:

> I'd say that so long as you can achieve a similar quality of engineering
> as is possible with other software development domains, then 'yes, it is
> ok'.
>
> Specifically, our Hive projects are packaged as RPMs, built and released
> with Maven, covered by suites of unit tests developed with HiveRunner, and
> part of the same Jenkins CI process as other Java based projects.
> Decomposing large processes into sensible units is not as easy as with
> other frameworks so this may require more thought and care.
>
> More information here:
> https://cwiki.apache.org/confluence/display/Hive/Unit+testing+HQL
>
> You have many potential alternatives depending on which languages you are
> comfortable using: Pig, Flink, Cascading, Spark, Crunch, Scrunch, Scalding,
> etc.
>
> Elliot.
>
>
> On Tuesday, 16 February 2016, Ramasubramanian <
> ramasubramanian.narayanan@gmail.com> wrote:
>
>> Hi,
>>
>> Is it ok to build an entire ETL/ELT data flow using HIVE queries?
>>
>> Data is stored in HIVE. We have transactional and reference data. We need
>> to build a small warehouse.
>>
>> Need suggestion on alternatives too.
>>
>> Regards,
>> Rams
>
>


-- 
Devopam Mittra
Life and Relations are not binary
