From: Ying Tang <ivytang0812@gmail.com>
Date: Tue, 23 Nov 2010 15:49:43 +0800
Subject: Re: 2 questions, the log file name and the log messy code
To: chukwa-user@incubator.apache.org

The garbled text (the "messy code") was my mistake.
After reading the file with SequenceFileInputFormat, the content is clear.
But the metadata in the value is mixed in with my log.
It would be better to add a \n after the metadata.

On Sat, Nov 20, 2010 at 2:24 AM, Jerome Boulon <jboulon@netflix.com> wrote:

> Just a warning: if you are using Text output format, you will have a
> hard time with "\n" inside your logs, in a stack trace for example.
> Also, a text file will be either non-compressed or non-splittable.
>
> /Jerome.
>
>
> On 11/19/10 9:30 AM, "Eric Yang" <eyang@yahoo-inc.com> wrote:
>
>
> On 11/19/10 12:37 AM, "Ying Tang" <ivytang0812@gmail.com> wrote:
>
> Hi all,
>    1. I have installed a 2-node Chukwa setup for testing: one agent and
> one collector. I also have an HDFS, but I found that for the logs
> collected by the collector in HDFS, the file name is
>          time+logsourcehost+java.rmi.server.UID()
>       The time format is yyyyddHHmmssSSS; is there no month? And this
> is written in the code.
>       If I need the month, must I change the code and recompile it?
>    2. Another question: in the log file (in HDFS), the metadata is
> garbled ("messy code"), while the log content from the agent is OK.
>       My adaptor is UTF8; how can I solve this?
>
>
>   1. Looks like a mistake in the temp filename. Please open a jira and
> we will fix it.
>   2. The data is recorded in sequence file format to make the data easier
> to process with mapreduce. If you are expecting plain text of the log
> content, you will need to write a map/reduce job with the output format
> set to text output format and channel the log file types accordingly.
>
>
> Regards,
> Eric
>

-- 
Best regards,

Ivy Tang
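On the file name question in the quoted message above, here is a minimal sketch of what the missing month looks like. It assumes the time part of the name is produced with java.text.SimpleDateFormat; the thread does not show the actual Chukwa code, so the class name, method name, and the corrected pattern below are hypothetical and only illustrate the difference between the two patterns.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not taken from the Chukwa source.
class TempFileStamp {
    // Pattern reported in this thread: year, day of month, time. No month field.
    static final String REPORTED_PATTERN = "yyyyddHHmmssSSS";
    // Assumed fix: "MM" (month in year) inserted after the year.
    static final String WITH_MONTH_PATTERN = "yyyyMMddHHmmssSSS";

    // Formats a timestamp in UTC so the result does not depend on the local zone.
    static String stamp(String pattern, long epochMillis) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        // 2010-11-23 07:49:43 UTC, the send time of this message.
        long t = 1290498583000L;
        System.out.println(stamp(REPORTED_PATTERN, t));   // 201023074943000
        System.out.println(stamp(WITH_MONTH_PATTERN, t)); // 20101123074943000
    }
}
```

With the reported pattern, the 23rd of any month in the same year yields the same date prefix, so the name no longer identifies the day uniquely.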