chukwa-user mailing list archives

From Jerome Boulon <jbou...@netflix.com>
Subject Re: 2 questions, the log file name and the log messy code
Date Fri, 19 Nov 2010 18:24:42 GMT
Just a warning: if you are using the Text output format, you will have a hard time with "\n"
inside your logs, for example in stack traces, because each embedded newline starts a new line in the output.
Also, a text file will be either uncompressed or non-splittable.
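
One possible way around the "\n" problem is to escape line breaks before a record goes into a line-oriented text file and to reverse the escaping when reading it back. The sketch below is only an illustration: RecordEscaper and its methods are hypothetical names, not part of Chukwa or Hadoop, and it assumes the backslash is acceptable as the escape character.

```java
// Hypothetical helper, not a Chukwa or Hadoop class.
// Keeps one log record on one physical line of a text file.
final class RecordEscaper {

    // Replace backslash, LF and CR with two-character escape sequences.
    static String escape(String record) {
        StringBuilder sb = new StringBuilder(record.length() + 16);
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Reverse escape(); unknown sequences are passed through unchanged.
    static String unescape(String line) {
        StringBuilder sb = new StringBuilder(line.length());
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\\' && i + 1 < line.length()) {
                char next = line.charAt(++i);
                switch (next) {
                    case 'n':  sb.append('\n'); break;
                    case 'r':  sb.append('\r'); break;
                    case '\\': sb.append('\\'); break;
                    default:   sb.append('\\').append(next);
                }
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With this, a multi-line stack trace passed through RecordEscaper.escape() contains no raw line breaks, and RecordEscaper.unescape() restores the original text. It does nothing about the compression/splittability trade-off.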

/Jerome.

On 11/19/10 9:30 AM, "Eric Yang" <eyang@yahoo-inc.com> wrote:




On 11/19/10 12:37 AM, "Ying Tang" <ivytang0812@gmail.com> wrote:

Hi all,
    1.   I have installed a 2-node Chukwa setup for testing, with one agent and one collector. I also
have an HDFS, but I found that the log files the collector writes to HDFS are named
          time+logsourcehost+java.rmi.server.UID()
          The time format is yyyyddHHmmssSSS; there is no month? This is hard-coded
in the code.
    I need the month, so must I change the code and recompile it?
    2.   Another question: in the log file (in HDFS), the metadata shows up as garbled
characters, while the log content from the agent is fine.
          My adaptor is UTF8; how can I solve this?


 1.  Looks like a mistake in the temp filename.  Please open a JIRA and we will fix it.
 2.  The data is recorded in sequence file format to make it easier to process with
MapReduce.  If you are expecting plain text of the log content, you will need to write a map/reduce
job that uses the text output format and routes the log file types accordingly.
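
For point 1, the effect of the missing month can be seen with plain JDK classes. This is a minimal sketch that assumes the filename timestamp is produced with java.text.SimpleDateFormat; the thread does not show the Chukwa code that builds the name, so TimestampPatterns and format() are hypothetical names, and UTC is chosen only to make the result reproducible.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper, not taken from the Chukwa source.
final class TimestampPatterns {

    // Pattern quoted in the question: year, day of month, time. No month.
    static final String WITHOUT_MONTH = "yyyyddHHmmssSSS";

    // Same pattern with the month (MM) inserted after the year.
    static final String WITH_MONTH = "yyyyMMddHHmmssSSS";

    // Format an epoch timestamp in UTC with the given pattern.
    static String format(String pattern, long epochMillis) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(epochMillis));
    }
}
```

For 2010-11-19 18:24:42.000 UTC (epoch milliseconds 1290191082000), WITHOUT_MONTH yields 201019182442000 and WITH_MONTH yields 20101119182442000. Without the month, the same instant on the 19th of any other month in 2010 produces an identical string.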

Regards,
Eric
