incubator-chukwa-user mailing list archives

From Stuti Awasthi <Stuti_Awas...@persistent.co.in>
Subject RE: Problem in chukwa output
Date Sat, 29 May 2010 02:59:28 GMT
Hi,

Sorry for the late reply; I was trying out what you suggested.
Yes, it worked for me. Changing the rotation factor increased my file size, but now I have another issue :)

@Issue:

When the Chukwa demuxer processes the logs, the output is distributed across 2 directories:

1)      After correct processing, it generates .evt files.

2)      When the Chukwa parser does not parse the data properly, the output ends up in an ..InError directory (e.g. SysLogInError).

Rotation Time : 5 min to 1 Hour


1.     SYSTEM LOGS
Log File used : message1

Datatype used : SysLog

Error : java.text.ParseException: Unparseable date: "y  4 06:12:38 p"


2.     Hadoop Logs

Log File Used : Hadoop datanode logs , Hadoop TaskTracker logs
Datatype Used : HadoopLog

Error : java.text.ParseException: Unparseable date: "0 for block blk_1617125"


3.     Chukwa Agent Logs

Log File Used : Chukwa Agent logs

Datatype Used : chuwaAgent



Error : org.json.JSONException: A JSONObject text must begin with '{' at character 1 of post
thread ChukwaHttpSender - collected 1 chunks
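
One detail in the two date errors above may help narrow this down: "y  4 06:12:38 p" is 15 characters long, the same length as a syslog timestamp such as "May  4 06:12:38", and "0 for block blk_1617125" is 23 characters long, the same length as a log4j-style timestamp. That would be consistent with a parser that treats a fixed-width prefix of each record as the date while some records begin mid-line, though this is only a guess. The sketch below is a hypothetical standalone check, not Chukwa code; the two patterns and the Hadoop sample timestamp are assumptions for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hypothetical helper, not part of Chukwa: checks whether a record starts
// with a timestamp in the layout a parser might expect.
final class DatePrefixCheck {
    // Assumed layouts; verify them against the parsers you actually run.
    static final String SYSLOG_PATTERN = "MMM d HH:mm:ss";           // 15-char prefix
    static final String HADOOP_PATTERN = "yyyy-MM-dd HH:mm:ss,SSS";  // 23-char prefix

    /** Returns true if the text begins with a date matching the pattern. */
    static boolean isParseable(String text, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
        try {
            format.parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A well-formed syslog prefix, then the fragment from the error above.
        System.out.println(isParseable("May  4 06:12:38", SYSLOG_PATTERN));
        System.out.println(isParseable("y  4 06:12:38 p", SYSLOG_PATTERN));
        // A made-up log4j-style prefix, then the fragment from the error above.
        System.out.println(isParseable("2010-05-04 06:12:38,123", HADOOP_PATTERN));
        System.out.println(isParseable("0 for block blk_1617125", HADOOP_PATTERN));
    }
}
```

Running a check like this over the first characters of each raw record could show how many records start mid-line before they ever reach the demuxer.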


I am wondering why data is ending up in these InError directories. Is there any way to get
correct .evt files after demuxing instead of these InError .evt files?

Thanks
Stuti
From: Jerome Boulon [mailto:jboulon@netflix.com]
Sent: Thursday, May 27, 2010 1:01 AM
To: chukwa-user@hadoop.apache.org
Subject: Re: Problem in chukwa output

Hi,
The demux groups your data per date/hour/TimeWindow, so yes, 1 .done file can be split
into multiple .evt files depending on the content/timestamps of your data.
Generally, if you have a SysLogInError directory, it's because the parser threw an exception,
and you should have some files in there.
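
As a rough illustration of the grouping described above, here is a minimal sketch of bucketing records by data type, day, and hour. It is not the Demux implementation: the class name, the "yyyyMMdd/HH" bucket layout, and the use of UTC are all assumptions made for the example.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TimeZone;
import java.util.TreeMap;

// Hypothetical sketch, not Chukwa code: groups record timestamps into
// per-day/per-hour buckets, the way one input file can fan out into several outputs.
final class DemuxBucketSketch {
    /** Builds an assumed bucket key of the form dataType/yyyyMMdd/HH (UTC). */
    static String bucketFor(String dataType, long timestampMillis) {
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd/HH", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return dataType + "/" + format.format(new Date(timestampMillis));
    }

    /** Groups timestamps by bucket key; TreeMap keeps the keys sorted. */
    static Map<String, List<Long>> group(String dataType, long[] timestamps) {
        Map<String, List<Long>> buckets = new TreeMap<>();
        for (long ts : timestamps) {
            buckets.computeIfAbsent(bucketFor(dataType, ts), k -> new ArrayList<>()).add(ts);
        }
        return buckets;
    }

    public static void main(String[] args) {
        long[] timestamps = {
            1272844800000L, // 2010-05-03 00:00:00 UTC
            1272846600000L, // 2010-05-03 00:30:00 UTC
            1272931200000L  // 2010-05-04 00:00:00 UTC
        };
        // Three records from one input land in two buckets.
        System.out.println(group("SysLog", timestamps).keySet());
    }
}
```

The point is only that the output layout follows the timestamps inside the records, not the boundaries of the input file.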

You may want to take a look at this wiki page to get an idea of Demux data flow.
http://wiki.apache.org/hadoop/Chukwa_Processes_and_Data_Flow

Regards,
/Jerome.

On 5/26/10 10:55 AM, "Stuti Awasthi" <Stuti_Awasthi@persistent.co.in> wrote:
Hi all,
I am facing some problems with Chukwa output.

The following is the process flow in the Collector.
I worked with a single .done file, 16MB in size, for the analysis:

1)     Logs were collected in /logs directory.

2)     After demux processing the output was stored in /repos directory.

Following is the structure inside repos:

/repos
      /SysLog                 Total Size : 1MB
            /20100503/*.evt
            /20100504/*.evt
      /SysLogInError          Total Size : 15MB
            /..../*.evt

I have 2 questions:

I noticed that my single log file was split into multiple .evt files. My output contained
2 folders inside /SysLog. Is it the correct behaviour that a single .done file is split
into n .evt files in different directories?

A SysLogInError directory was generated, but there was no ERROR in the log file. I
am not sure when this directory gets created.

Any pointers will be helpful.
Thanks
Stuti
DISCLAIMER ========== This e-mail may contain privileged and confidential information which
is the property of Persistent Systems Ltd. It is intended only for the use of the individual
or entity to which it is addressed. If you are not the intended recipient, you are not authorized
to read, retain, copy, print, distribute or use this message. If you have received this communication
in error, please notify the sender and delete all copies of this message. Persistent Systems
Ltd. does not accept any liability for virus infected mails.