flume-user mailing list archives

From "Emile Kao" <emile...@gmx.net>
Subject Re: Flume and HDFS integration
Date Fri, 30 Nov 2012 08:51:15 GMT
Hello Brock,
First of all, thank you for answering my questions. I appreciate it, since I am a real newbie
to Flume, Hadoop, etc.

But now I am confused. According to your statement, the fileType is the key here. Now just
take a look at my flume.conf below: the file type was set to "DataStream".
So which one is the right one: SequenceFile, DataStream, or CompressedStream?


agent1.channels = MemoryChannel-2
agent1.channels.MemoryChannel-2.type = memory

agent1.sources = tail
agent1.sources.tail.channels = MemoryChannel-2
agent1.sources.tail.type = exec
agent1.sources.tail.command = tail -F /opt/apache2/logs/access_log

agent1.sinks = HDFS
agent1.sinks.HDFS.channel = MemoryChannel-2
agent1.sinks.HDFS.type = hdfs
agent1.sinks.HDFS.hdfs.file.Type = DataStream
agent1.sinks.HDFS.hdfs.path = hdfs://localhost:9000
#agent1.sinks.HDFS.hdfs.path = /mnt/hdfs/data
agent1.sinks.HDFS.hdfs.writeFormat = Text


Many Thanks,
Emile

-------- Original Message --------
> Date: Thu, 29 Nov 2012 19:26:37 -0600
> From: Brock Noland <brock@cloudera.com>
> To: "user@flume.apache.org" <user@flume.apache.org>
> Subject: Re: Flume and HDFS integration

> Hi,
> 
> On Thu, Nov 29, 2012 at 7:17 PM, Roman Shaposhnik <rvs@apache.org> wrote:
> > On Thu, Nov 29, 2012 at 9:18 AM, Brock Noland <brock@cloudera.com> wrote:
> >> 1) It's a sequence file, you can change it to a text file if you want. See
> >> FileType here http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
> >
> > Don't you also have to change a serialization format to get rid of the binary
> > structure completely? IOW, you'd have to add something like:
> >     agent.sinks.hdfsSink.hdfs.serializer = org.apache.flume.serialization.BodyTextEventSerializer
> 
> BodyTextEventSerializer is the default serializer. Serializers decide
> how to turn Events into records while fileType decides what type of
> file the event is written to.
> 
> Brock
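
To tie the fileType / serializer distinction back to the flume.conf above, here is a minimal sketch of the sink section. It assumes the key is spelled hdfs.fileType, as in the HDFS sink section of the Flume User Guide linked above; the config above writes hdfs.file.Type, which the sink would presumably not recognize, so it may fall back to the default SequenceFile. The commented-out serializer line is likewise an assumption about where that setting would go, and it should not be needed if the text serializer is the default, as Brock says.

```properties
# Sketch only: same agent, channel, and sink names as the flume.conf above.
agent1.sinks = HDFS
agent1.sinks.HDFS.channel = MemoryChannel-2
agent1.sinks.HDFS.type = hdfs
agent1.sinks.HDFS.hdfs.path = hdfs://localhost:9000

# fileType chooses the kind of file written to HDFS:
# SequenceFile (binary container), DataStream (plain, uncompressed),
# or CompressedStream (which would also need a codec configured).
# Assumption: the key is "hdfs.fileType", not "hdfs.file.Type".
agent1.sinks.HDFS.hdfs.fileType = DataStream
agent1.sinks.HDFS.hdfs.writeFormat = Text

# The serializer decides how each Event becomes a record inside that file.
# Assumption: set at the sink level; left commented out to use the default.
# agent1.sinks.HDFS.serializer = TEXT
```

With DataStream and the default text serializer, the files under hdfs.path should contain the raw access_log lines rather than a binary SequenceFile, though that is worth confirming against the Flume version in use.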
