hadoop-common-user mailing list archives

From Partho Bardhan <partho.bardha...@gmail.com>
Subject Re: ETL/DW to Hadoop migrations
Date Tue, 08 Sep 2015 23:55:40 GMT
Hi Abhishek,

Indeed, these tools are expensive, and we see many customers opting for
open source projects to reduce license costs, avoid vendor lock-in, and
have a steady group of software engineers making the projects more industry

If you are looking to offload work from DataStage and Teradata, I can
suggest these open source projects:


Spring XD allows you to ingest data from a traditional EDW, streams, and
flat files. It is designed for parallel, high-speed ingest.

Another project, one that allows you to run SQL directly against flat files,
is Pivotal HAWQ. Pivotal HAWQ is a SQL-on-Hadoop analytic engine. You can find
more about it on


If you would like to try out some of our enterprise-grade and beta
offerings, please look here: https://network.pivotal.io


Partho Bardhan
Data Engineering

On Tue, Sep 8, 2015 at 2:54 PM, Sandesh Hegde <sandesh.hegde@gmail.com>
wrote:

> Hello Abhishek,
> Below is a link to a free data ingestion tool, dtIngest, which runs on
> Hadoop as a YARN application and supports various data sources.
> It does not currently support databases, though future versions may.
> For databases, you can try Apache Sqoop.
> https://www.datatorrent.com/product/datatorrent-dtingest/
> https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
> Thanks
> Sandesh
> PS: I work for DataTorrent.
> On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23singhabhishek@gmail.com>
> wrote:
>> Hi Kishore,
>> Thanks for replying. We are planning a POC aimed at replacing DataStage.
>> DataStage and Teradata are costly tools that are burning a big hole in
>> our pocket. So, have you come across any case where an ETL pipeline was
>> replaced with Hadoop? I understand the connectors you are describing,
>> but what about replacing the ETL tool itself?
>> Any links would be much appreciated.
>> Thanks once again.
>> Abhishek
>> On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>>> Abhishek,
>>>    Are you looking to load your data into Hadoop? If so, IBM
>>> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>>> Thanks,
>>> Kishore
>>> On Tue, Sep 8, 2015 at 1:29 AM, <23singhabhishek@gmail.com> wrote:
>>>> Hi guys,
>>>> I am looking for pointers on migrating an existing data warehouse to
>>>> Hadoop. Currently, we use IBM DataStage as our ETL tool and load
>>>> into Teradata staging/maintain tables. Please suggest an architecture
>>>> that reduces cost without much degradation in performance. Has any of
>>>> you been part of such a migration before? If so, please share some
>>>> input, especially on which aspects we should take care of. As for the
>>>> source data, it is mainly in the form of flat files and databases.
>>>> Thanks in advance.
>>>> Regards,
>>>> Abhishek Singh

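For the database sources mentioned in the thread, the Apache Sqoop suggestion could be sketched roughly as below. This is only an illustration, not something any poster supplied: the helper name `build_sqoop_import`, the host, database, table, user, and HDFS paths are all hypothetical, and it assumes the Teradata JDBC driver jar is already available to Sqoop. The script only assembles the command line; it does not run it.

```python
import shlex
from typing import List, Optional


def build_sqoop_import(
    jdbc_url: str,
    username: str,
    password_file: str,
    table: str,
    target_dir: str,
    driver: Optional[str] = None,
    num_mappers: int = 4,
    delimiter: str = "|",
) -> List[str]:
    """Assemble (but do not execute) a `sqoop import` command line.

    All argument values are supplied by the caller; nothing here is
    specific to the cluster discussed in the thread.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")

    argv = ["sqoop", "import", "--connect", jdbc_url]
    if driver:
        # Needed when Sqoop falls back to its generic JDBC connector.
        argv += ["--driver", driver]
    argv += [
        "--username", username,
        "--password-file", password_file,  # file holding the password
        "--table", table,
        "--target-dir", target_dir,        # HDFS output directory
        "--fields-terminated-by", delimiter,
        "--num-mappers", str(num_mappers), # degree of parallelism
    ]
    return argv


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    argv = build_sqoop_import(
        jdbc_url="jdbc:teradata://td-host.example.com/DATABASE=staging",
        driver="com.teradata.jdbc.TeraDriver",
        username="etl_user",
        password_file="/user/etl_user/.td_password",
        table="ORDERS",
        target_dir="/data/raw/orders",
    )
    print(shlex.join(argv))
```

Building the argument list separately from execution makes it easy to review the command during a POC before pointing it at a production warehouse.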