flume-user mailing list archives

From Abhijeet Pathak <Abhijeet.Pat...@kpitcummins.com>
Subject RE: Processing data from HDFS
Date Fri, 25 Jan 2013 05:20:41 GMT
I've evaluated Pig, but it's not suitable for my purpose.

The CSV files I have can differ in column names and column order from file to file.
Also, the row key is not present in the CSV; we need to calculate it for each record.
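
For illustration, roughly the kind of map-only MapReduce job we would need is sketched
below. The table name ("my_table"), the column family ("d"), the use of an "id" column
as the row key, and the assumption that every file carries its header in the first line
are all placeholders for this sketch, not our real scheme:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class CsvToHBase {

    public static class CsvMapper
            extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

        private static final byte[] CF = Bytes.toBytes("d"); // placeholder column family
        private String[] header; // column names taken from the file's first line

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split(",", -1);

            // The record at byte offset 0 is treated as the header row, so each
            // file can carry its own column names and column order.
            if (offset.get() == 0) {
                header = fields;
                return;
            }
            if (header == null) {
                // Splits other than the first never see the header; a real job
                // would ship the header separately (e.g. via the Configuration).
                return;
            }

            // Placeholder row-key scheme: just the value of the "id" column.
            // Replace with the real per-record key calculation.
            String rowKey = null;
            for (int i = 0; i < header.length && i < fields.length; i++) {
                if ("id".equalsIgnoreCase(header[i].trim())) {
                    rowKey = fields[i].trim();
                }
            }
            if (rowKey == null || rowKey.isEmpty()) {
                return; // skip records we cannot key
            }

            Put put = new Put(Bytes.toBytes(rowKey));
            for (int i = 0; i < header.length && i < fields.length; i++) {
                put.add(CF, Bytes.toBytes(header[i].trim()), Bytes.toBytes(fields[i]));
            }
            context.write(new ImmutableBytesWritable(Bytes.toBytes(rowKey)), put);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "csv-to-hbase");
        job.setJarByClass(CsvToHBase.class);
        job.setMapperClass(CsvMapper.class);
        job.setInputFormatClass(TextInputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0])); // the HDFS folder
        TableMapReduceUtil.initTableReducerJob("my_table", null, job); // placeholder table
        job.setNumReduceTasks(0); // map-only: Puts go straight to HBase
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

With zero reduce tasks the Puts emitted by the mapper go straight through
TableOutputFormat into HBase, so no reducer is needed.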

Regards,
Abhijeet Pathak


________________________________________
From: Alexander Alten-Lorenz [wget.null@gmail.com]
Sent: 24 January 2013 1:10 PM
To: user@flume.apache.org
Subject: Re: Processing data from HDFS

Use Pig; a well-written example can be found here:
http://blog.whitepages.com/2011/10/27/hbase-storage-and-pig/

Regards

On Jan 24, 2013, at 8:29 AM, Nitin Pawar <nitinpawar432@gmail.com> wrote:

> How are the files coming into HDFS?
>
> There is a direct HBase sink available for writing data into HBase.
>
> Also, to get data from HDFS into HBase you will need to write your own
> MapReduce job.
>
>
> On Thu, Jan 24, 2013 at 12:50 PM, Abhijeet Pathak <
> Abhijeet.Pathak@kpitcummins.com> wrote:
>
>> Hi,
>>
>> I have a folder in HDFS where a bunch of files get created periodically.
>> I know that Flume does not currently support reading from an HDFS folder.
>>
>> What is the best way to transfer this data from HDFS to HBase (with or
>> without using Flume)?
>>
>>
>> Regards,
>> Abhijeet Pathak
>>
>>
>>
>
>
> --
> Nitin Pawar

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF


