flume-user mailing list archives

From Hari Shreedharan <hshreedha...@cloudera.com>
Subject Re: flume to HDFS log event write
Date Wed, 09 Jan 2013 17:07:29 GMT
It is because you are using SequenceFile as the output format for the HDFS Sink. Change this:
a1.sinks.k1.hdfs.file.Type=DataStream to a1.sinks.k1.hdfs.fileType=DataStream. Also, the log4jappender
does not support layout patterns (it will starting with the next release of Flume, or you can build trunk
from source after applying the patch attached to https://issues.apache.org/jira/browse/FLUME-1818).
The log4jappender puts the severity and logger information into the Flume event headers rather than
the body, so without that patch you need to write your own serializer (or use the HeaderAndBodyTextSerializer,
which is not in any release yet but is in trunk, so it will be in the next release).
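A minimal sketch of the corrected sink section, assuming the rest of the agent configuration stays
as posted below (hdfs.writeFormat only applies to SequenceFiles, so it can be dropped once fileType
is DataStream):

# Corrected HDFS sink: hdfs.fileType (not hdfs.file.Type) selects the output format
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://172.20.104.226:8020/flumeinput/%{host}
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.rollCount = 10000
a1.sinks.k1.serializer = TEXT

If you do write your own serializer, it would look roughly like this sketch against the Flume 1.x
EventSerializer API (the package, class name, and header formatting here are illustrative assumptions,
not what HeaderAndBodyTextSerializer actually does):

package com.example.flume;

import java.io.IOException;
import java.io.OutputStream;
import java.util.Map;

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.serialization.EventSerializer;

// Hypothetical serializer that writes each event's headers (timestamp,
// severity, logger name, ...) in front of the body, one event per line.
public class HeadersThenBodySerializer implements EventSerializer {

  private final OutputStream out;

  private HeadersThenBodySerializer(Context context, OutputStream out) {
    this.out = out;
  }

  @Override
  public void afterCreate() throws IOException {
    // No file header to write.
  }

  @Override
  public void afterReopen() throws IOException {
    // Nothing to do when an existing file is reopened.
  }

  @Override
  public void write(Event event) throws IOException {
    // Prepend every header to the line, then write the raw body and a newline.
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> h : event.getHeaders().entrySet()) {
      sb.append('[').append(h.getKey()).append('=').append(h.getValue()).append("] ");
    }
    out.write(sb.toString().getBytes("UTF-8"));
    out.write(event.getBody());
    out.write('\n');
  }

  @Override
  public void flush() throws IOException {
    out.flush();
  }

  @Override
  public void beforeClose() throws IOException {
    // No file trailer to write.
  }

  @Override
  public boolean supportsReopen() {
    return true;
  }

  // The HDFS sink instantiates serializers through a Builder, configured as:
  //   a1.sinks.k1.serializer = com.example.flume.HeadersThenBodySerializer$Builder
  public static class Builder implements EventSerializer.Builder {
    @Override
    public EventSerializer build(Context context, OutputStream out) {
      return new HeadersThenBodySerializer(context, out);
    }
  }
}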


Hari  

--  
Hari Shreedharan


On Wednesday, January 9, 2013 at 2:12 AM, Chhaya Vishwakarma wrote:

> The expected output I pasted is from the file itself, which I can see in the file. But when
> writing to HDFS it produces junk values, and I am not able to see the timestamp or other log
> information. Why is that?
>   
> From: Bertrand Dechoux [mailto:dechouxb@gmail.com]  
> Sent: Wednesday, January 09, 2013 3:39 PM
> To: user@flume.apache.org
> Subject: Re: flume to HDFS log event write
>   
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/SequenceFile.html
>  
> is a binary format. You may want to make Flume output to a file or to the console first,
> and then compare what you are expecting versus what you are getting.
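> A minimal sketch of that kind of debugging setup, assuming the agent is still named a1 with
> channel c1 (file_roll writes events to a local directory as plain text; the directory is an example):
>  
> # Temporarily swap the HDFS sink for a local file_roll sink to inspect events as text
> a1.sinks.k1.type = file_roll
> a1.sinks.k1.sink.directory = /tmp/flume-debug
> a1.sinks.k1.sink.rollInterval = 0
> a1.sinks.k1.channel = c1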
>  
> Regards
>  
> Bertrand
> On Wed, Jan 9, 2013 at 11:02 AM, Chhaya Vishwakarma <Chhaya.Vishwakarma@lntinfotech.com> wrote:
> Hi,
>   
> I am using the Flume log4j appender to write log events to HDFS, but the output contains junk
> values and I am not able to see anything other than the log message; there is no timestamp.
>   
> Here is my configuration
> Log4j.properties
>   
> log4j.logger.log4jExample= DEBUG,out2
> log4j.appender.out2 = org.apache.flume.clients.log4jappender.Log4jAppender
> log4j.appender.out2.Port = 41414
> log4j.appender.out2.Hostname = 172.20.104.223
>   
> Here is the agent configuration
> a1.sources = r1
> a1.sinks = k1
> a1.channels = c1
>   
> #sources
> a1.sources.r1.type = avro
> a1.sources.r1.bind =172.20.104.226
> a1.sources.r1.port= 41414
> a1.sources.r1.restart =true
> a1.sources.r1.batchsize=10000
>   
> # Describe the sink
> a1.sinks.k1.type = hdfs
> a1.sinks.k1.hdfs.path=hdfs://172.20.104.226:8020/flumeinput/%{host}
> a1.sinks.k1.hdfs.file.Type=DataStream
> a1.sinks.k1.hdfs.writeFormat=Writable
> a1.sinks.k1.hdfs.rollCount=10000
> a1.sinks.k1.serializer=TEXT
>   
> # Use a channel which buffers events in memory
> a1.channels.c1.type = file
> a1.channels.c1.capacity = 10000
> a1.channels.c1.transactionCapacity = 10000
>   
> # Bind the source and sink to the channel
> a1.sources.r1.channels = c1
> a1.sinks.k1.channel = c1
>   
> Expected output
> [2013-01-09 15:15:45,457] - [main] DEBUG log4jExample Current data unavailalbe, using cached values
> [2013-01-09 15:15:45,458] - [main] INFO  log4jExample Hello this is an info message
> [2013-01-09 15:15:45,460] - [main] ERROR log4jExample Dabase unavaliable, connetion lost
> [2013-01-09 15:15:45,461] - [main] WARN  log4jExample Attention!! Application running in debugmode
> [2013-01-09 15:15:45,463] - [main] DEBUG log4jExample Current data unavailalbe, using cached values
> [2013-01-09 15:15:45,465] - [main] INFO  log4jExample Hello this is an info message
> [2013-01-09 15:15:45,467] - [main] ERROR log4jExample Dabase unavaliable, connetion lost
> [2013-01-09 15:15:45,468] - [main] WARN  log4jExample Attention!! Application running in debugmode
> [2013-01-09 15:15:45,470] - [main] DEBUG log4jExample Current data unavailalbe, using cached values
>   
> But I am getting this instead.
> Output on HDFS:
> SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable������+�AE����9<‑��-Current
> data unavailalbe, using cached values)<‑��Hello this is an info message.<‑��"Dabase
> unavaliable, connetion lost8<‑��,Attention!! Application running in debugmode9<‑��-Current
> data unavailalbe, using cached values)<‑��Hello this is an info message.<‑��"Dabase
> unavaliable, connetion lost8<‑��,Attention!! Application running in debugmode9<‑��-Current
> data unavailalbe, using cached values)<‑��‑Hello this is an info message.<‑��‑"Dabase
> unavaliable, connetion lost8<‑��­,Attention!! Application running in debugmode9<‑��
> -Current data unavailalbe, using cached values)<‑�� Hello this is an info message.<‑��!"Dabase
> unavaliable, connetion lost8<‑��",Attention!! Application running in debugmode9<‑��"-Current
> data unavailalbe, using cached values)<‑��#Hello this is an info message.<‑��#"Dabase
> unavaliable, connetion lost8<‑��$,Attention!! Application running in debugmode9<‑��$-Current
> data unavailalbe, using cached values)<‑��%Hello this is an info message.<‑��%"Dabase
> unavaliable, connetion lost8<‑��%,Attention!! Application running in debugmode9<‑��&-Current
> data unavailalbe, using cached values)<‑��&Hello this is an info message.<‑��'"Dabase
> unavaliable, connetion lost8<‑��(,Attention!! Application running in debugmode
>  
> --  
> Bertrand Dechoux  

