Date: Tue, 15 May 2012 23:51:38 -0700
Subject: Re: the check point offset is bigger than the log file size
From: Ariel Rabkin
To: chukwa-user@incubator.apache.org

Rotation is a bit of a mess. We've tried a couple of strategies to handle
it, none of which is perfect.

One approach is to have a modified logger that explicitly invokes chukwa,
starting and stopping adaptors.

The other is that the FileTailingAdaptors keep not only a physical "how
long is the file" offset, but a logical "what is the byte number of the
first byte of the file" -- the idea is that if the file rotates, the
adaptor should add the length of the rotated-out section to the length of
the current file.

This is a bit fragile, since the adaptor has to guess which was the
previously-rotated file. I believe we use timestamps for that. I suspect
it won't always work.

--Ari

On Tue, May 15, 2012 at 11:45 PM, IvyTang wrote:
> After reading the source code, I'm confused about the checkpoint file.
>
> The file tailer puts the chunks into the memlimitqueue, and the
> httpsender takes the chunks to send from the memlimitqueue. After the
> httpsender has sent the chunks to the collector reliably,
> reportCommit(Adaptor src, long uuid) will be called.
>
> In this reportCommit(Adaptor src, long uuid) method, src is the adaptor
> and uuid is the offset in the file of the chunks that have been sent. If
> uuid > adaptor.offset, that means some chunks have been sent, so
> adaptor.offset is set to uuid.
>
> This works fine when the log file is not rotating.
>
> But if the log file is rotating (I mean the way log4j does it: move this
> file to *.1 and create a new file named *), adaptor.offset is the offset
> of the chunks sent from the last file, which is of course very big, but
> uuid is the offset of the chunks sent from this file, so uuid is smaller
> than adaptor.offset.
>
> So the checkpoint file won't change.
>
> Even though chukwa is still sending chunks to the collector, if chukwa is
> restarted, the checkpoint is larger than the log file size, so the log
> file will be sent again.
>
>
>
> On Mon, May 14, 2012 at 7:01 PM, IvyTang wrote:
>>
>> The gamelog size is 158023223, but the check point file is
>>
>> ADD adaptor_2963225a90653a309cf779d4a1d815a3 = org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8 Gamelog 0 /var/log/dataproxy/gamelog 229406124
>>
>> The gamelog didn't rotate, I'm sure.
>>
>> But the offset in the check point file is bigger than the file size, so
>> chukwa logged:
>>
>> WARN Thread-2 FileTailingAdaptor - Adaptor|adaptor_2963225a90653a309cf779d4a1d815a3| file: /var/log/dataproxy/gamelog, has rotated and no detection - reset counters to 0L
>>
>> And the agent began to transfer the whole log file.
>>
>> I just feel confused about why the agent generates an offset that is
>> bigger than the log size when the gamelog did not rotate.
>>
>> The chukwa version is 0.4.0
>>
>> --
>> Best regards,
>>
>> Ivy Tang
>>
>
> --
> Best regards,
>
> Ivy Tang

--
Ari Rabkin asrabkin@gmail.com
UC Berkeley Computer Science Department
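
To make the two offsets discussed in this thread concrete, here is a minimal Java sketch. It is not Chukwa's code: the class names ForwardOnlyCheckpoint and LogicalOffsetTracker, their methods, and the byte counts are all hypothetical, chosen only for illustration. It assumes a checkpoint that only moves forward (the uuid > adaptor.offset test described above) and compares committing the physical offset with committing a logical offset that adds the length of the rotated-out file.

```java
// Hypothetical sketch: these classes are NOT part of Chukwa's real API.
final class RotationOffsetSketch {

    /** A checkpoint that only moves forward, like the uuid > adaptor.offset test in the thread. */
    static final class ForwardOnlyCheckpoint {
        private long offset;

        void reportCommit(long committedOffset) {
            if (committedOffset > offset) {
                offset = committedOffset;
            }
        }

        long offset() {
            return offset;
        }
    }

    /** Logical offset = bytes in rotated-out files + position in the current file. */
    static final class LogicalOffsetTracker {
        private long rotatedOutBytes;

        /** Assumes the tailer somehow learns the length of the file that was rotated out. */
        void onRotation(long rotatedFileLength) {
            rotatedOutBytes += rotatedFileLength;
        }

        long toLogical(long physicalOffset) {
            return rotatedOutBytes + physicalOffset;
        }
    }

    public static void main(String[] args) {
        long oldFileLength = 1000;   // made-up size of the file before rotation
        long sentFromNewFile = 200;  // made-up bytes committed from the new file

        // Physical offsets only: the commit after rotation is smaller and is ignored.
        ForwardOnlyCheckpoint physicalOnly = new ForwardOnlyCheckpoint();
        physicalOnly.reportCommit(oldFileLength);
        physicalOnly.reportCommit(sentFromNewFile);
        System.out.println("physical-only checkpoint: " + physicalOnly.offset());

        // Logical offsets: the rotated-out length is added, so the checkpoint keeps advancing.
        ForwardOnlyCheckpoint logical = new ForwardOnlyCheckpoint();
        LogicalOffsetTracker tracker = new LogicalOffsetTracker();
        logical.reportCommit(tracker.toLogical(oldFileLength));
        tracker.onRotation(oldFileLength);
        logical.reportCommit(tracker.toLogical(sentFromNewFile));
        System.out.println("logical checkpoint: " + logical.offset());
    }
}
```

With these made-up numbers the physical-only checkpoint stays at 1000 even though the new file holds only 200 bytes, which is the stale checkpoint described above, while the logical checkpoint advances to 1200. The sketch takes for granted that the rotated-out length is known, which is exactly the part described in the reply as fragile.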