hadoop-common-user mailing list archives

From Chris Mattmann <mattm...@apache.org>
Subject Re: best solution for data ingestion
Date Mon, 04 Nov 2013 13:32:30 GMT
Hi Guys,

Depending on the *type* of ingestion you are trying to do into HDFS,
the combination of Apache OODT (http://oodt.apache.org/) and Apache
Tika (http://tika.apache.org/) may do the trick.

Cheers,
Chris



-----Original Message-----
From: Bing Jiang <jiangbinglover@gmail.com>
Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Date: Monday, November 4, 2013 2:34 AM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: Re: best solution for data ingestion

>Apache Pig is also an option for data ingestion; it offers more flexible
>functionality and makes development more efficient.
>
>
>Regards.
>Bing
>
>
>2013/11/2 Marcel Mitsuto F. S. <mitsuto@gmail.com>
>
>I've done some testing with Flume, but ended up using syslog-ng: it is more
>flexible, more reliable, and has a smaller footprint.
>
>
>On Fri, Nov 1, 2013 at 3:57 PM, Mirko Kämpf
><mirko.kaempf@gmail.com> wrote:
>
>Have a look at Sqoop for data from an RDBMS, or at Flume if the data is
>streaming and multiple sources have to be used.
>Best wishes
>Mirko
>
>
>
>2013/11/1 Siddharth Tiwari <siddharth.tiwari@live.com>
>
>Hi team,
>
>I'm seeking your advice on the best way to ingest a lot of data into
>Hadoop. Also, what are your views on FUSE?
>
>
>*------------------------*
>Cheers !!!
>SiddharthTiwari
>Have a refreshing day !!!
>"Every duty is holy, and devotion to duty is the highest form of worship
>of God."
>
>"Maybe other people will try to limit me but I don't limit myself"
>
>-- 
>Bing Jiang
>Tel:(86)134-2619-1361
>weibo: http://weibo.com/jiangbinglover
>BLOG: www.binospace.com <http://www.binospace.com>
>BLOG: http://blog.sina.com.cn/jiangbinglover
>
>Focus on distributed computing, HDFS/HBase
>
>
>


