flume-user mailing list archives

From Hari Shreedharan <hshreedha...@cloudera.com>
Subject Re: strange flume hdfs put
Date Tue, 19 Feb 2013 03:51:57 GMT
See comment below.  

--  
Hari Shreedharan


On Monday, February 18, 2013 at 7:43 PM, 周梦想 wrote:

> hello,
> I changed the conf file like this:
> [zhouhh@Hadoop48 flume1.3.1]$ cat conf/testhdfs.conf
> syslog-agent.sources = Syslog
> syslog-agent.channels = MemoryChannel-1
> syslog-agent.sinks = HDFS-LAB
>  
> syslog-agent.sources.Syslog.type = syslogTcp
> syslog-agent.sources.Syslog.port = 5140
>  
> syslog-agent.sources.Syslog.channels = MemoryChannel-1
> syslog-agent.sinks.HDFS-LAB.channel = MemoryChannel-1
>  
> syslog-agent.sinks.HDFS-LAB.type = hdfs
>  
> syslog-agent.sinks.HDFS-LAB.hdfs.path = hdfs://Hadoop48:54310/flume/%{host}
> syslog-agent.sinks.HDFS-LAB.hdfs.file.Prefix = syslogfiles
> syslog-agent.sinks.HDFS-LAB.hdfs.file.rollInterval = 60
> #syslog-agent.sinks.HDFS-LAB.hdfs.file.Type = SequenceFile
> #syslog-agent.sinks.HDFS-LAB.hdfs.file.Type = DataStream  
>  
>  
>  

You need to uncomment the above line and change it to: syslog-agent.sinks.HDFS-LAB.hdfs.fileType = DataStream
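For reference, the Flume 1.3 user guide names these sink properties hdfs.filePrefix, hdfs.rollInterval and hdfs.fileType (there is no "file." segment in them), so a corrected sink block would look roughly like this:

syslog-agent.sinks.HDFS-LAB.hdfs.path = hdfs://Hadoop48:54310/flume/%{host}
syslog-agent.sinks.HDFS-LAB.hdfs.filePrefix = syslogfiles
syslog-agent.sinks.HDFS-LAB.hdfs.rollInterval = 60
syslog-agent.sinks.HDFS-LAB.hdfs.fileType = DataStream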
> #syslog-agent.sinks.HDFS-LAB.hdfs.file.writeFormat= Text
> syslog-agent.channels.MemoryChannel-1.type = memory
>  
> and I tested again:
> [zhouhh@Hadoop47 ~]$ echo "<13>Mon Feb 18 18:25:26 2013 hello world zhh " | nc -v hadoop48 5140
> Connection to hadoop48 5140 port [tcp/*] succeeded!
> [zhouhh@Hadoop47 ~]$ hadoop fs -cat hdfs://Hadoop48:54310/flume//FlumeData.1361245092567.tmp
> SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable▒▒▒ʣg▒▒C%<<▒▒)Mon Feb 18 18:25:26 2013 hello world zhh
> [zhouhh@Hadoop47 ~]$
>  
>  
> there is still some text that looks wrong.
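The output above still begins with the SequenceFile header ("SEQ!org.apache.hadoop.io.LongWritable..."), which suggests the sink fell back to the default SequenceFile format because hdfs.file.Type is not a recognized property name; the corrected hdfs.fileType setting should fix that. Note also that hadoop fs -cat dumps the raw bytes of a SequenceFile; hadoop fs -text decodes it, e.g.:

hadoop fs -text hdfs://Hadoop48:54310/flume//FlumeData.1361245092567.tmp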
>  
> Andy
> 2013/2/19 Hari Shreedharan <hshreedharan@cloudera.com>
> > This is because the data is written out by default in Hadoop's SequenceFile format.
> > Use the DataStream file format (as in the Flume docs) to get the event written out
> > as-is. (If you use the default serializer, the headers will not be serialized, so
> > make sure you select the correct serializer.)
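> > For example, the HDFS sink takes a serializer property (per the Flume user guide);
> > a sketch that keeps the event headers, assuming the built-in HEADER_AND_TEXT
> > serializer type and generic agent/sink names:
> >
> > your-agent.sinks.your-hdfs-sink.serializer = HEADER_AND_TEXT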
> >  
> >  
> > Hari  
> >  
> > --  
> > Hari Shreedharan
> >  
> >  
> > On Monday, February 18, 2013 at 7:09 PM, 周梦想 wrote:
> >  
> > > hello,
> > > I put some data into HDFS via Flume 1.3.1, but it changed!
> > >  
> > > source data:
> > > [zhouhh@Hadoop47 ~]$ echo "<13>Mon Feb 18 18:25:26 2013 hello world zhh " | nc -v hadoop48 5140
> > > Connection to hadoop48 5140 port [tcp/*] succeeded!
> > >  
> > >  
> > > the flume agent logged:
> > > 13/02/19 10:43:46 INFO hdfs.BucketWriter: Creating hdfs://Hadoop48:54310/flume//FlumeData.1361241606972.tmp
> > > 13/02/19 10:44:16 INFO hdfs.BucketWriter: Renaming hdfs://Hadoop48:54310/flume/FlumeData.1361241606972.tmp to hdfs://Hadoop48:54310/flume/FlumeData.1361241606972
> > >  
> > >  
> > > the content in hdfs:  
> > >  
> > > [zhouhh@Hadoop47 ~]$ hadoop fs -cat  hdfs://Hadoop48:54310/flume/FlumeData.1361241606972
> > > SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable▒.FI▒Z▒Q{2▒,\<▒U▒Y)Mon Feb 18 18:25:26 2013 hello world zhh
> > > [zhouhh@Hadoop47 ~]$
> > >  
> > >  
> > > I don't know why there is data like "org.apache.hadoop.io.LongWritable" in there. Are these bugs?
> > >  
> > > Best Regards,
> > > Andy
> > >  
> >  
>  

