chukwa-dev mailing list archives

From Eric Yang <ey...@yahoo-inc.com>
Subject Re: Adaptor options for Hadoop log files (TaskTracker, DataNode)
Date Mon, 18 May 2009 16:38:41 GMT



On 5/18/09 7:14 AM, "Jiaqi Tan" <tanjiaqi@gmail.com> wrote:

> Hi Eric,
> 
>> Chukwa has a special log4j appender which escapes return character.  The
>> multi-lines exception will be stored as a single chunk, and processed as a
>> single chukwa record after Demux.
> 
> In this case, I suppose I would need to configure the monitored Hadoop
> cluster to actually use the Chukwa log4j appender? Would I also need
> to recompile the Hadoop of the monitored cluster to include the Chukwa
> code then?

There is no need to recompile Hadoop.  Chukwa ships two jar files,
chukwa-hadoop-*-client.jar and json.jar.  Drop both into the lib directory
of the Hadoop cluster, and configure log4j.properties in the Hadoop conf
directory.

> 
> Where are these record types defined, and how do they map to the
> processors? Is it a direct <record type name>Processor mapping that's
> automatically done by the Demux?

Record types are defined in log4j.properties.  For example, Hadoop has an
appender called DRFA, and the Chukwa-enabled appender would look like:

log4j.appender.DRFA=org.apache.hadoop.chukwa.inputtools.log4j.ChukwaDailyRollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.DRFA.recordType=HadoopLog
log4j.appender.DRFA.chukwaClientHostname=localhost
log4j.appender.DRFA.chukwaClientPortNum=9093

The association of HadoopLog record type and the demux class is in
chukwa-demux-conf.xml.
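
To illustrate the idea of mapping a record type to a demux class, here is a
minimal sketch in Java.  It is not Chukwa code: the RecordTypeLookup class,
the example.* class names, and the fallback to a default processor are all
assumptions made for illustration.  It only shows a lookup keyed by the
recordType value set in log4j.properties.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the record type -> processor association
// that chukwa-demux-conf.xml holds. Not part of Chukwa itself.
public class RecordTypeLookup {
    private final Map<String, String> processors = new HashMap<>();
    private final String fallback;

    // fallback: class name returned for unmapped record types
    // (falling back at all is an assumption of this sketch).
    public RecordTypeLookup(String fallback) {
        this.fallback = fallback;
    }

    // Associate a record type name with a processor class name.
    public void register(String recordType, String processorClass) {
        processors.put(recordType, processorClass);
    }

    // Resolve the processor class name for a chunk's record type.
    public String processorFor(String recordType) {
        return processors.getOrDefault(recordType, fallback);
    }

    public static void main(String[] args) {
        RecordTypeLookup lookup = new RecordTypeLookup("example.DefaultProcessor");
        // "HadoopLog" matches the recordType in the appender config;
        // the class name on the right is made up for this example.
        lookup.register("HadoopLog", "example.HadoopLogProcessor");
        System.out.println(lookup.processorFor("HadoopLog"));
        System.out.println(lookup.processorFor("SomeOtherType"));
    }
}
```

The point of the sketch is that the appender only tags data with a type name;
which class handles that type is decided separately, in configuration.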

Regards,
Eric

> 
>> 
>> On 5/17/09 6:42 PM, "Jiaqi Tan" <tanjiaqi@gmail.com> wrote:
>> 
>>> Hi Ariel,
>>> 
>>> So with the CharFileTailingAdaptorUTF8NewLineEscaped, if I have a log
>>> file entry with a multi-line entry, e.g. if there was a Java exception
>>> logged, would each line be separated into a different chunk? If that's
>>> the case, are there any adaptors that would coalesce multi-line log
>>> entries into a single chunk?
>>> 
>>> Also, does the data type get resolved by Demux to one of the classes
>>> in org.apache.hadoop.chukwa.extraction.demux.processor.mapper? i.e. if
>>> I wanted to implement my own custom datatype, I should create a Demux
>>> processor and stick it in as one of the classes in that package?
>>> 
>>> Thanks,
>>> Jiaqi
>>> 
>>> On Sun, May 17, 2009 at 6:19 PM, Ariel Rabkin <asrabkin@gmail.com> wrote:
>>>> It's worth distinguishing two different things.
>>>> 
>>>> The adaptor (as in CharFileTailingAdaptorUTF8) is responsible for
>>>> deciding how to break the data into chunks, and how to tag the chunks.
>>>>  Probably CharFileTailingAdaptorUTF8NewLineEscaped is right for you.
>>>> (We should really rename that to something shorter!)
>>>> 
>>>> The type, like SysLog or NameNodeLog, is stored by the adaptor, and
>>>> passed through as Chunk metadata. It's used to tell the Demux how to
>>>> process that data.  The demux-conf has the mapping from datatype to
>>>> processor.  For logs, you should be fine just picking a datatype.  If
>>>> you aren't using Demux to process the logs, you don't even need to
>>>> write a processor.
>>>> 
>>>> --Ari
>>>> 
>>>> On Sun, May 17, 2009 at 6:15 PM, Jiaqi Tan <tanjiaqi@gmail.com> wrote:
>>>>> Hi,
>>>>> 
>>>>> Which adaptor should I use if I want to process log entries from the
>>>>> TaskTracker and DataNode logs? Should I just use one of the
>>>>> FileTailer adaptors already available (CharFileTailingAdaptorUTF8), or
>>>>> is there a custom type such as the one for SysLog or NameNodeLog when
>>>>> using the CharFileTailingAdaptorUTF8NewLineEscaped adaptor?
>>>>> 
>>>>> Is there any documentation available on what the "type" (e.g. SysLog
>>>>> or NameNodeLog) means and how to use it/how it works?
>>>>> 
>>>>> Thanks,
>>>>> Jiaqi
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Ari Rabkin asrabkin@gmail.com
>>>> UC Berkeley Computer Science Department
>>>> 
>> 
>> 

