flume-user mailing list archives

From Mohit Anchlia <mohitanch...@gmail.com>
Subject Re: Recording clickstream data
Date Sun, 03 Jun 2012 02:22:05 GMT
On Thu, May 31, 2012 at 11:37 PM, alo alt <wget.null@googlemail.com> wrote:

> Hi,
>
> You could use Avro or Syslog, if possible, or write your own source that
> runs as a REST API.
>
> Yes, Flume will create directories per timestamp; take a look at the
> HDFS section in the user guide:
>
> http://archive.cloudera.com/cdh4/cdh/4/flume-ng/FlumeUserGuide.html#h.rxt2g9parmkr
>
> You can use the escape sequences to match your needs. A short article about it:
> http://mapredit.blogspot.de/2012/03/flumeng-evolution.html
>
Thanks! This is exactly what I was looking for. I'll run through some
examples.
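
As a first example to run through, here is a minimal sketch of how timestamp-based escape sequences in a path could bucket events into per-day and per-hour directories. It is not Flume code: it assumes the escapes in question (%Y, %m, %d, %H) behave like the strftime fields of the same name, that each event carries a timestamp in milliseconds since the epoch, and that UTC is used. The path template and the function name are hypothetical.

```python
from datetime import datetime, timezone


def bucket_path(template: str, timestamp_ms: int) -> str:
    """Expand strftime-style fields in `template` using an event timestamp.

    `timestamp_ms` is milliseconds since the Unix epoch. UTC is assumed
    here so the result does not depend on the machine's local time zone.
    """
    event_time = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    return event_time.strftime(template)


# Hypothetical directory layout: one directory per day, one subdirectory per hour.
TEMPLATE = "/flume/clickstream/%Y-%m-%d/%H"

# An event stamped 2012-06-03 02:22:05 UTC.
print(bucket_path(TEMPLATE, 1338690125000))
# prints /flume/clickstream/2012-06-03/02
```

With a layout like this, all events from the same hour land in the same directory, so a job that needs one day or one hour of click data only has to read the matching path.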


> cheers,
>  Alex
>
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>
> On Jun 1, 2012, at 7:14 AM, Mohit Anchlia wrote:
>
> >
> > I am looking at integrating Flume NG with our REST service API to record
> > click stream data. The flow would be: the browser sends data to this REST
> > service, which then acts as a client and sends it to Flume asynchronously.
> > Flume then stores it in HDFS. I just want to make sure that this is the
> > right use of Flume.
> >
> > I do have another question: how does Flume organize HDFS files? Does it
> > create a new directory based on the timestamp? Could someone help me
> > understand how to efficiently organize and store files such that data can
> > be clustered by timestamp?
> >
> >
>
>
