hadoop-common-user mailing list archives

From Partho Bardhan <partho.bardha...@gmail.com>
Subject Re: ETL/DW to Hadoop migrations
Date Tue, 08 Sep 2015 23:55:40 GMT
Hi Abhishek,

Indeed, these tools are expensive, and we see many customers adopting open
source projects to reduce license costs, avoid vendor lock-in, and benefit
from a steady community of software engineers who keep making the projects
more industry friendly.

If you are looking to offload work from DataStage and Teradata, I can
suggest these open source projects:

http://projects.spring.io/spring-xd/

Spring XD allows you to ingest data from traditional EDWs, streams, and flat
files. It is designed for parallel, high-speed ingest.

Another project, which allows you to run SQL directly against flat files, is
Pivotal HAWQ. Pivotal HAWQ is a SQL-on-Hadoop analytic engine. You can find
more about it at

http://pivotal.io/big-data/pivotal-hawq

If you would like to try out some of our enterprise-grade and beta
offerings, please look here: https://network.pivotal.io

Thanks

Partho Bardhan
Data Engineering
Pivotal



On Tue, Sep 8, 2015 at 2:54 PM, Sandesh Hegde <sandesh.hegde@gmail.com>
wrote:

> Hello Abhishek,
>
> Below is a link to a free data ingestion tool, dtIngest, which runs on
> Hadoop as a YARN app. It supports various data sources.
> Currently it doesn't support databases; future versions may add that.
> For databases, you can try Apache Sqoop.
>
> https://www.datatorrent.com/product/datatorrent-dtingest/
>
> https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
>
> Thanks
> Sandesh
>
> PS: I work for DataTorrent.
>
> On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23singhabhishek@gmail.com>
> wrote:
>
>> Hi Kishore,
>>
>> Thanks for replying. We are planning to do a POC to see whether we can
>> replace DataStage. DataStage and Teradata are costly tools that are
>> burning a big hole in our pocket. So, have you come across any case where
>> an ETL pipeline was replaced with Hadoop? I understand the connectors you
>> are describing, but what about replacing the ETL tool itself?
>>
>> Any links would be very helpful.
>>
>> Thanks once again.
>>
>> Abhishek
>>
>> On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>>
>>> Abhishek,
>>>
>>>    Are you looking to load your data into Hadoop? If so, IBM
>>> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>>>
>>> Thanks,
>>> Kishore
>>>
>>> On Tue, Sep 8, 2015 at 1:29 AM, <23singhabhishek@gmail.com> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I am looking for pointers on migrating an existing data warehouse to
>>>> Hadoop. Currently, we are using IBM DataStage, an ETL tool, and loading
>>>> into Teradata staging/maintain tables. Please suggest an architecture
>>>> that reduces cost without much degradation in performance. Has any of you
>>>> been part of such a migration before? If so, please provide some
>>>> input, especially on which aspects we should take care of. As for the
>>>> source data, it is mainly in the form of flat files and databases.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>>
>>>> Abhishek Singh
>>>>
>>>
>>>
>>
>
