Date: Fri, 3 Aug 2012 17:45:39 +0900
From: Christopher Birchall <birchall@infoscience.co.jp>
To: user@flume.apache.org
Subject: Re: Writing reliably to HDFS

Juhani,

Thanks for the advice.

Just to clarify, when I talk about the agent "dying", I mean crashing or
being killed unexpectedly. I'm worried about how HDFS writes are handled
in these cases. When the agent is shut down cleanly, I can confirm that
all HDFS files are closed correctly and no .tmp files are left lying
around.

In the case where the agent dies suddenly and zero-byte .tmp files are
left over, I still haven't found a way to get Hadoop to fix those files
for me.

Chris.
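P.S. If there is a programmatic way to fix them up, I'm guessing it would
look something like the untested sketch below, assuming the HDFS 2.0.0-alpha
client exposes DistributedFileSystem#recoverLease (the class name, argument
handling and polling interval are just illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

// Untested sketch: ask the NameNode to recover the lease on an orphaned
// .tmp file so that it gets closed and its real length becomes visible.
// Assumes the default filesystem is HDFS; class name and the 4-second
// polling interval are made up for illustration.
public class RecoverTmpFile {
    public static void main(String[] args) throws Exception {
        Path tmpFile = new Path(args[0]);  // path to the leftover .tmp file
        FileSystem fs = FileSystem.get(new Configuration());
        DistributedFileSystem dfs = (DistributedFileSystem) fs;

        // recoverLease() returns true once the NameNode has closed the
        // file; keep retrying until it does.
        while (!dfs.recoverLease(tmpFile)) {
            Thread.sleep(4000);
        }
        System.out.println("Recovered: " + tmpFile);
    }
}

The idea is just to keep asking the NameNode to recover the lease until the
file is closed, at which point its real length should become visible.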
On 2012/08/02 12:45, Juhani Connolly wrote:
> Hi Chris,
>
> Answers inline
>
> On 08/02/2012 11:07 AM, Christopher Birchall wrote:
>> Hi,
>>
>> I'm trying to write events to HDFS using Flume 1.2.0 and I have a
>> couple of questions.
>>
>> Firstly, about the reliability semantics of the HdfsEventSink.
>>
>> My number one requirement is reliability, i.e. not losing any events.
>> Ideally, by the time the HdfsEventSink commits the transaction, all
>> events should be safely written to HDFS and visible to other clients,
>> so that no data is lost even if the agent dies after that point. But
>> what is actually happening in my tests is as follows:
>>
>> 1. The HDFS sink takes some events from the FileChannel and writes
>> them to a SequenceFile on HDFS.
>> 2. The sink commits the transaction, and the FileChannel updates its
>> checkpoint. As far as the FileChannel is concerned, the events have
>> been safely written to the sink.
>> 3. Kill the agent.
>>
>> Result: I'm left with a weird .tmp file on HDFS that is at once
>> zero-byte and not zero-byte. The SequenceFile has not yet been closed
>> and rolled over, so it is still a ".tmp" file. The data is actually in
>> the HDFS blocks, but because the file was never closed, the NameNode
>> thinks it has a length of 0 bytes. I'm not sure how to recover from
>> this.
>>
>> Is this the expected behaviour of the HDFS sink, or am I doing
>> something wrong? Do I need to explicitly enable HDFS append? (I am
>> using HDFS 2.0.0-alpha.)
>>
>> I guess the problem is that data is not "safely" written until file
>> rollover occurs, but the timing of file rollover (by time, event
>> count, file size, etc.) is unrelated to the timing of transactions. Is
>> there any way to keep these in sync with each other?
> Regarding reliability, I believe that while the file may not be closed,
> you're not actually at risk of losing data. I suspect that adding some
> code to the sink shutdown to close up any temp files would be a good
> idea. To deal with unexpected failures, it might even be worth scanning
> the destination path for any unclosed files on startup.
>
> I'm not really too familiar with the inner workings of the HDFS sink,
> so maybe someone else can add more detail. In our test setup we have
> yet to see any data loss from it.
>> Second question: could somebody please explain the reasoning behind
>> the default values of the HDFS sink configuration? If I use the
>> defaults, the sink generates zillions of tiny files (at most 10 events
>> per file), which as I understand it is not a recommended way to use
>> HDFS.
>>
>> Is it OK to change these settings to generate much larger files (MB or
>> GB scale)? Or should I write a script that periodically combines these
>> tiny files into larger ones?
>>
>> Thanks for any advice,
>>
>> Chris Birchall.
>>
> There's no harm in changing those defaults and I'd strongly recommend
> doing so. We have most of the rolls switched off (set to 0) and we just
> roll hourly (because that's how we want to separate our logs). You may
> also want to change hdfs.batchSize, which defaults to 1; that is going
> to cause a bottleneck if you have even a moderate amount of traffic.
> One thing to note is that with large batches it's possible for events
> to be duplicated: if a batch gets partially written and then hits an
> error, it will be rolled back at the channel and then rewritten.
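For reference, this is roughly the sink configuration I'm planning to try
based on the above. The agent/channel/sink names are placeholders and the
numbers are untested guesses, not values from anyone's production setup:

# Placeholder names (agent1, file-channel, hdfs-sink) and example values
agent1.sinks.hdfs-sink.type = hdfs
agent1.sinks.hdfs-sink.channel = file-channel
agent1.sinks.hdfs-sink.hdfs.path = hdfs://namenode/flume/events
agent1.sinks.hdfs-sink.hdfs.fileType = SequenceFile

# Switch off count- and size-based rolling, and roll once an hour instead,
# so we end up with a small number of large files.
agent1.sinks.hdfs-sink.hdfs.rollCount = 0
agent1.sinks.hdfs-sink.hdfs.rollSize = 0
agent1.sinks.hdfs-sink.hdfs.rollInterval = 3600

# Write many events per transaction instead of the default of 1.
agent1.sinks.hdfs-sink.hdfs.batchSize = 1000

In other words: count- and size-based rolls disabled, an hourly time-based
roll, and a much larger batch size than the default.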