flume-user mailing list archives

From Mohit Anchlia <mohitanch...@gmail.com>
Subject Re: Recording clickstream data
Date Wed, 06 Jun 2012 14:08:22 GMT
I have not used syslog from Java before. Which package would be the right
one to use?
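
For context, a minimal sketch of what a syslog client could look like using
only the JDK (java.net), with no syslog-specific package. It assumes the Flume
agent exposes a UDP syslog source; the class name, host, port, and tag are
hypothetical, and the line layout follows the RFC 3164 style
(<PRI>TIMESTAMP HOST TAG: MSG). Treat it as an illustration, not a
recommended production client.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, not part of Flume or any syslog library.
public class SyslogSketch {
    static final int FACILITY_LOCAL0 = 16; // syslog facility code for local0
    static final int SEVERITY_INFO = 6;    // syslog severity code for informational

    // Builds one RFC 3164-style line: <PRI>Mmm dd HH:mm:ss host tag: message
    static String format(int facility, int severity, LocalDateTime when,
                         String host, String tag, String message) {
        int pri = facility * 8 + severity; // priority = facility * 8 + severity
        String month = when.format(DateTimeFormatter.ofPattern("MMM", Locale.ENGLISH));
        String time = when.format(DateTimeFormatter.ofPattern("HH:mm:ss"));
        // The day of month is space-padded to width 2, e.g. "Jun  6".
        String stamp = String.format("%s %2d %s", month, when.getDayOfMonth(), time);
        return "<" + pri + ">" + stamp + " " + host + " " + tag + ": " + message;
    }

    // Sends one line as a single UDP datagram (fire and forget, no delivery guarantee).
    static void send(String collectorHost, int port, String line) throws Exception {
        byte[] payload = line.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(payload, payload.length,
                    InetAddress.getByName(collectorHost), port));
        }
    }
}
```

Usage would be along the lines of
SyslogSketch.send("flume-host", 5140, SyslogSketch.format(...)), where
"flume-host" and 5140 are placeholders for wherever the syslog source
listens. UDP keeps the REST service from blocking on Flume, at the cost of
possibly dropping events.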

On Sat, Jun 2, 2012 at 7:22 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:

>
>
>  On Thu, May 31, 2012 at 11:37 PM, alo alt <wget.null@googlemail.com> wrote:
>
>> Hi,
>>
>> you could use Avro or Syslog, if possible, or write your own source that
>> runs as a REST API.
>>
>> Yes, Flume will create directories per timestamp; take a look at the
>> HDFS section in the user guide:
>>
>> http://archive.cloudera.com/cdh4/cdh/4/flume-ng/FlumeUserGuide.html#h.rxt2g9parmkr
>>
>> You can use the escape sequences to match your needs. A short article on this:
>> http://mapredit.blogspot.de/2012/03/flumeng-evolution.html
>>
> Thanks! This is exactly what I was looking for. I'll run through some
> examples.
>
>
>> cheers,
>>  Alex
>>
>> --
>> Alexander Alten-Lorenz
>> http://mapredit.blogspot.com
>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>
>> On Jun 1, 2012, at 7:14 AM, Mohit Anchlia wrote:
>>
>> >
>> > I am looking at integrating Flume NG with our REST service API to
>> > record clickstream data. The flow would be: the browser sends data to
>> > this REST service, which then acts as a client and sends it to Flume
>> > asynchronously. Flume then stores it in HDFS. I just want to make sure
>> > that this is the right use of Flume.
>> >
>> > I do have another question: how does Flume organize HDFS files? Does
>> > it create a new directory based on the timestamp? Could someone help me
>> > understand how to efficiently organize and store files so that data
>> > can be clustered by timestamp?
>> >
>> >
>>
>>
>
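
On the directory-per-timestamp point quoted above, a small sketch of how
strftime-style escapes such as %Y, %m, %d and %H in a path pattern turn an
event timestamp into a directory. This is not Flume's implementation; the
class name and the /flume/clicks prefix are hypothetical, and UTC is assumed
only to keep the result deterministic.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical illustration of timestamp bucketing, not Flume code.
public class BucketPathSketch {
    // Replaces %Y, %m, %d and %H in the pattern with zero-padded fields
    // of the given event timestamp (milliseconds since the epoch, UTC).
    static String expand(String pattern, long epochMillis) {
        ZonedDateTime t = Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC);
        return pattern
                .replace("%Y", String.format("%04d", t.getYear()))
                .replace("%m", String.format("%02d", t.getMonthValue()))
                .replace("%d", String.format("%02d", t.getDayOfMonth()))
                .replace("%H", String.format("%02d", t.getHour()));
    }
}
```

With a pattern like /flume/clicks/%Y/%m/%d/%H, every event from the same
hour lands in the same directory, so a job over a date range only has to
read the matching directories.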
