Message-ID: <854652961.681041270404447337.JavaMail.jira@brutus.apache.org>
Date: Sun, 4 Apr 2010 18:07:27 +0000 (UTC)
From: "Ari Rabkin (JIRA)"
To: chukwa-dev@hadoop.apache.org
Reply-To: chukwa-dev@hadoop.apache.org
Subject: [jira] Commented: (CHUKWA-4) Collectors don't finish writing .done datasink from last .chukwa datasink when stopped using bin/stop-collectors
Mailing-List: contact chukwa-dev-help@hadoop.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/CHUKWA-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853272#action_12853272 ]

Ari Rabkin commented on CHUKWA-4:
---------------------------------

I don't think you need to identify the bad chunk. Why not just move the "close new file and rename things" logic into the exception handler? I would split the handler into "ChecksumException" -- where you recover by closing and renaming the file -- and everything else, which you should log the way you do now.

> Collectors don't finish writing .done datasink from last .chukwa datasink when stopped using bin/stop-collectors
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: CHUKWA-4
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-4
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection
>         Environment: I am running on our local cluster. This is a Linux machine that I also run a Hadoop cluster from.
>            Reporter: Andy Konwinski
>            Priority: Minor
>
> When I use start-collectors, it creates the datasink as expected and writes to it as normal, i.e. it writes to the .chukwa file, and rollovers work fine when it renames the .chukwa file to .done. However, when I use bin/stop-collectors to shut down the running collector, it leaves a .chukwa file in the HDFS file system. I am not sure whether this is a valid sink, but I think the collector should gracefully clean up the datasink and rename it to .done before exiting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
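
For illustration only, the handler split suggested in the comment on CHUKWA-4 might look roughly like the Java sketch below. It is not taken from the Chukwa source or from any patch on the issue: `SinkFile`, `SinkRecovery`, `copyChunks` and `closeAndRename` are hypothetical names, and the local `ChecksumException` is a stand-in for Hadoop's `org.apache.hadoop.fs.ChecksumException` (which also extends `IOException`) so that the example compiles without Hadoop on the classpath.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Local stand-in for org.apache.hadoop.fs.ChecksumException (assumption: used only so this compiles alone). */
class ChecksumException extends IOException {
    ChecksumException(String message) {
        super(message);
    }
}

/** Hypothetical view of a leftover .chukwa sink file; these method names are not Chukwa APIs. */
interface SinkFile {
    /** Copies chunks into the new file; may fail partway through. */
    void copyChunks() throws IOException;

    /** The "close new file and rename things" step from the comment. */
    void closeAndRename();
}

class SinkRecovery {
    /** Processes one sink file and returns a small log describing which path was taken. */
    static List<String> process(SinkFile sink) {
        List<String> log = new ArrayList<>();
        try {
            sink.copyChunks();
            sink.closeAndRename();
            log.add("completed");
        } catch (ChecksumException e) {
            // Recoverable case: keep whatever was copied before the bad
            // chunk, then close and rename without locating that chunk.
            sink.closeAndRename();
            log.add("recovered: " + e.getMessage());
        } catch (IOException e) {
            // Everything else: only log, leaving the file untouched.
            log.add("error: " + e.getMessage());
        }
        return log;
    }
}
```

The more specific `catch (ChecksumException e)` clause has to come before `catch (IOException e)`, since Java rejects a subclass handler placed after its superclass handler. In this sketch, only a checksum failure triggers the close-and-rename step from inside the handler; any other I/O error is recorded and nothing is renamed.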