Return-Path: X-Original-To: apmail-flume-user-archive@www.apache.org Delivered-To: apmail-flume-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id D2A54DB8D for ; Fri, 19 Oct 2012 15:15:53 +0000 (UTC) Received: (qmail 76357 invoked by uid 500); 19 Oct 2012 15:15:53 -0000 Delivered-To: apmail-flume-user-archive@flume.apache.org Received: (qmail 76168 invoked by uid 500); 19 Oct 2012 15:15:52 -0000 Mailing-List: contact user-help@flume.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@flume.apache.org Delivered-To: mailing list user@flume.apache.org Received: (qmail 76155 invoked by uid 99); 19 Oct 2012 15:15:52 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 19 Oct 2012 15:15:52 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of brock@cloudera.com designates 209.85.212.51 as permitted sender) Received: from [209.85.212.51] (HELO mail-vb0-f51.google.com) (209.85.212.51) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 19 Oct 2012 15:15:45 +0000 Received: by mail-vb0-f51.google.com with SMTP id fn1so547529vbb.38 for ; Fri, 19 Oct 2012 08:15:24 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :content-type:x-gm-message-state; bh=+WHoUhvgIjU+jzp5AJGEAF61b/eJzNV8j6+QzOedsFE=; b=a9xYsTxptJW2yKomL4It7VWZt1ma5qRMNj0mKQW4yJsTCMy6ZBSX2McRf177PeAwtr G6rdnK5xPZIfg7wm6iAE4j0G5jfIYwjHPxtN6LyFR7ZAy7/9Fw+UtygCveKE/DKjCBcR Dwt10aU8jflvSnygPzEjS+xcTzJdG73QldLsz5RdM2syORGS2zv60rZ6L/jvVxREqnMC wXrUadWYQOdDV0uxSIGg2R+cP7ViB0oXHWBdOacKSwFd4cKWb8mao+2YMvqZ2evVfETG Hq14cIrYlHbG8Orxt6KkvSVYR2VfDFOHPdxKul0Z4Yo9UYA7SOCizmCNpO8/5ob4Ddyg 5PMg== Received: by 10.52.179.231 with SMTP id dj7mr1543441vdc.108.1350659723927; Fri, 19 Oct 2012 08:15:23 -0700 (PDT) MIME-Version: 1.0 Received: by 10.58.179.43 with HTTP; Fri, 19 Oct 2012 08:15:03 -0700 (PDT) In-Reply-To: References: From: Brock Noland Date: Fri, 19 Oct 2012 10:15:03 -0500 Message-ID: Subject: Re: flume restart exception To: user@flume.apache.org Content-Type: multipart/alternative; boundary=bcaec5014c710e9fb004cc6af94b X-Gm-Message-State: ALoCoQnNVtUkYRs5tBwd9ce4q0r1T5qTPFu6mHBsQWmkaXBTCe0RDGwaxIIUzbHe9vcT9AUCdkRm X-Virus-Checked: Checked by ClamAV on apache.org --bcaec5014c710e9fb004cc6af94b Content-Type: text/plain; charset=ISO-8859-1 The problem is that 1.2 writes first the marker for the event and then the event. So it happens that sometimes the marker is written without the event. The patch is small and in that jira Hari mentioned. On Fri, Oct 19, 2012 at 4:19 AM, Raymond Ng wrote: > thanks for the replies > > I'm using Flume 1.2, and I'll look into getting 1.3 after assessing how > much rework is needed to port the local customised changes from 1.2 to 1.3 > > also does the problem with the trailing 7F always happen when flume is > killed ungracefully? > > Ray > > > On Thu, Oct 18, 2012 at 5:54 PM, Brock Noland wrote: > >> Hari is correct, but that won't fix that log file. To do that, you >> need to truncate the log file. If you do a hexdump, there should be a >> trailing 7F. That is a 7F with the rest of the file being 80. >> >> somedata....7F808080... >> >> That trailing 7F is the problem. Be sure to back up the log file >> before truncating it. >> >> Brock >> >> On Thu, Oct 18, 2012 at 11:48 AM, Hari Shreedharan >> wrote: >> > Raymond, >> > >> > This was an issue which we fixed (FLUME-1380) and will be part of >> > Flume-1.3.0. If you need it immediately, you could clone >> trunk/flume-1.3.0 >> > branch and build it. You should not see this error once you do that. Or >> you >> > can wait for the next release - Flume-1.3.0, which should be released >> in the >> > next few weeks or so. >> > >> > >> > Thanks >> > Hari >> > >> > -- >> > Hari Shreedharan >> > >> > On Thursday, October 18, 2012 at 9:32 AM, Raymond Ng wrote: >> > >> > Hi >> > >> > I'm getting the following exception when restarting flume after it was >> > killed manually, please advice >> > >> > >> > 2012-10-18 17:04:33,889 INFO [conf-file-poller-0] >> > DefaultLogicalNodeManager.java - Starting Channel fileChannel4 >> > 2012-10-18 17:04:33,914 INFO [lifecycleSupervisor-1-4] >> > DirectMemoryUtils.java - Unable to get maxDirectMemory from VM: >> > NoSuchMethodException: sun.misc.VM.maxDirectMemory(null) >> > 2012-10-18 17:04:33,916 INFO [lifecycleSupervisor-1-4] >> > DirectMemoryUtils.java - Direct Memory Allocation: Allocation = >> 1048576, >> > Allocated = 0, MaxDirectMemorySize = 1908932608, Remaining = 1908932608 >> > 2012-10-18 17:04:34,058 INFO [lifecycleSupervisor-1-0] >> > MonitoredCounterGroup.java - Component type: CHANNEL, name: fileChannel1 >> > started >> > 2012-10-18 17:04:34,058 INFO [lifecycleSupervisor-1-2] >> > MonitoredCounterGroup.java - Component type: CHANNEL, name: fileChannel2 >> > started >> > 2012-10-18 17:04:40,835 ERROR [lifecycleSupervisor-1-1] Log.java - >> Failed to >> > initialize Log >> > java.io.IOException: Header 80808080 not expected value: deadbeef >> > at >> > >> org.apache.flume.channel.file.TransactionEventRecord.fromDataInput(TransactionEventRecord.java:136) >> > at >> > >> org.apache.flume.channel.file.LogFile$SequentialReader.next(LogFile.java:378) >> > at >> > >> org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:88) >> > at org.apache.flume.channel.file.Log.replay(Log.java:251) >> > at >> org.apache.flume.channel.file.FileChannel.start(FileChannel.java:228) >> > at >> > >> org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:237) >> > at >> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) >> > at >> > >> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317) >> > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150) >> > at >> > >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98) >> > at >> > >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180) >> > at >> > >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204) >> > at >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) >> > at >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) >> > at java.lang.Thread.run(Thread.java:662) >> > 2012-10-18 17:04:40,836 ERROR [lifecycleSupervisor-1-1] >> FileChannel.java - >> > Failed to start the file channel >> > java.io.IOException: Header 80808080 not expected value: deadbeef >> > at >> > >> org.apache.flume.channel.file.TransactionEventRecord.fromDataInput(TransactionEventRecord.java:136) >> > at >> > >> org.apache.flume.channel.file.LogFile$SequentialReader.next(LogFile.java:378) >> > at >> > >> org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:88) >> > at org.apache.flume.channel.file.Log.replay(Log.java:251) >> > at >> org.apache.flume.channel.file.FileChannel.start(FileChannel.java:228) >> > at >> > >> org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:237) >> > at >> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) >> > at >> > >> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317) >> > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150) >> > at >> > >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98) >> > at >> > >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180) >> > at >> > >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204) >> > at >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) >> > at >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) >> > at java.lang.Thread.run(Thread.java:662) >> > 2012-10-18 17:04:53,740 INFO [lifecycleSupervisor-1-4] >> > MonitoredCounterGroup.java - Component type: CHANNEL, name: fileChannel4 >> > started >> > 2012-10-18 17:04:53,740 INFO [conf-file-poller-0] >> > DefaultLogicalNodeManager.java - Starting Sink hdfsSink4 >> > 2012-10-18 17:04:53,741 INFO [conf-file-poller-0] >> > DefaultLogicalNodeManager.java - Starting Sink hdfsSink2 >> > >> > ..... >> > ..... >> > 2012-10-18 17:10:36,789 ERROR >> > [SinkRunner-PollingRunner-DefaultSinkProcessor] SinkRunner.java - >> Unable to >> > deliver event. Exception follows. >> > java.lang.IllegalStateException: Channel closed [channel=fileChannel3] >> > at >> > com.google.common.base.Preconditions.checkState(Preconditions.java:145) >> > at >> > >> org.apache.flume.channel.file.FileChannel.createTransaction(FileChannel.java:266) >> > at >> > >> org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicChannelSemantics.java:118) >> > at com.test.CustomHDFSSink.process(CustomHDFSSink.java:428) >> > at >> > >> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68) >> > at >> org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147) >> > at java.lang.Thread.run(Thread.java:662) >> > 2012-10-18 17:10:41,790 ERROR >> > [SinkRunner-PollingRunner-DefaultSinkProcessor] SinkRunner.java - >> Unable to >> > deliver event. Exception follows. >> > java.lang.IllegalStateException: Channel closed [channel=fileChannel3] >> > at >> > com.google.common.base.Preconditions.checkState(Preconditions.java:145) >> > at >> > >> org.apache.flume.channel.file.FileChannel.createTransaction(FileChannel.java:266) >> > at >> > >> org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicChannelSemantics.java:118) >> > at com.test.CustomHDFSSink.process(CustomHDFSSink.java:428) >> > at >> > >> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68) >> > at >> org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147) >> > at java.lang.Thread.run(Thread.java:662) >> > >> > >> > >> > -- >> > Rgds >> > Ray >> > >> > >> >> >> >> -- >> Apache MRUnit - Unit testing MapReduce - >> http://incubator.apache.org/mrunit/ >> > > > > -- > Rgds > Ray > -- Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/ --bcaec5014c710e9fb004cc6af94b Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable The problem is that 1.2 writes first the marker for the event and then the = event. So it happens that sometimes the marker is written without the event= . The patch is small and in that jira Hari mentioned.

On Fri, Oct 19, 2012 at 4:19 AM, Raymond Ng <raymondair@gmail.com= > wrote:
thanks for the replies

I'm using Flume 1.2, and I'll look i= nto getting 1.3 after assessing how much rework is needed to port the local= customised changes from 1.2 to 1.3

also does the problem with the t= railing 7F always happen when flume is killed ungracefully?

Ray


On Thu, Oct 18, 2012 at 5:54 PM, Brock Noland <= brock@cloudera.com<= /a>> wrote:
Hari is correct, but that won't fix that log file. To do that, you
need to truncate the log file. If you do a hexdump, there should be a
trailing 7F. That is a 7F with the rest of the file being 80.

somedata....7F808080...

That trailing 7F is the problem. =A0Be sure to back up the log file
before truncating it.

Brock

On Thu, Oct 18, 2012 at 11:48 AM, Hari Shreedharan
<
hshreedh= aran@cloudera.com> wrote:
> Raymond,
>
> This was an issue which we fixed (FLUME-1380) and will be part of
> Flume-1.3.0. If you need it immediately, you could clone trunk/flume-1= .3.0
> branch and build it. You should not see this error once you do that. O= r you
> can wait for the next release - Flume-1.3.0, which should be released = in the
> next few weeks or so.
>
>
> Thanks
> Hari
>
> --
> Hari Shreedharan
>
> On Thursday, October 18, 2012 at 9:32 AM, Raymond Ng wrote:
>
> Hi
>
> I'm getting the following exception when restarting flume after it= was
> killed manually, please advice
>
>
> 2012-10-18 17:04:33,889 =A0INFO [conf-file-poller-0]
> DefaultLogicalNodeManager.java - Starting Channel fileChannel4
> 2012-10-18 17:04:33,914 =A0INFO [lifecycleSupervisor-1-4]
> DirectMemoryUtils.java - Unable to get maxDirectMemory from VM:
> NoSuchMethodException: sun.misc.VM.maxDirectMemory(null)
> 2012-10-18 17:04:33,916 =A0INFO [lifecycleSupervisor-1-4]
> DirectMemoryUtils.java - Direct Memory Allocation: =A0Allocation =3D 1= 048576,
> Allocated =3D 0, MaxDirectMemorySize =3D 1908932608, Remaining =3D 190= 8932608
> 2012-10-18 17:04:34,058 =A0INFO [lifecycleSupervisor-1-0]
> MonitoredCounterGroup.java - Component type: CHANNEL, name: fileChanne= l1
> started
> 2012-10-18 17:04:34,058 =A0INFO [lifecycleSupervisor-1-2]
> MonitoredCounterGroup.java - Component type: CHANNEL, name: fileChanne= l2
> started
> 2012-10-18 17:04:40,835 ERROR [lifecycleSupervisor-1-1] Log.java - Fai= led to
> initialize Log
> java.io.IOException: Header 80808080 not expected value: deadbeef
> =A0 =A0 at
> org.apache.flume.channel.file.TransactionEventRecord.fromDataInput(Tra= nsactionEventRecord.java:136)
> =A0 =A0 at
> org.apache.flume.channel.file.LogFile$SequentialReader.next(LogFile.ja= va:378)
> =A0 =A0 at
> org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.ja= va:88)
> =A0 =A0 at org.apache.flume.channel.file.Log.replay(Log.java:251)
> =A0 =A0 at org.apache.flume.channel.file.FileChannel.start(FileChannel= .java:228)
> =A0 =A0 at
> org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(Lif= ecycleSupervisor.java:237)
> =A0 =A0 at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441= )
> =A0 =A0 at
> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:= 317)
> =A0 =A0 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java= :150)
> =A0 =A0 at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.a= ccess$101(ScheduledThreadPoolExecutor.java:98)
> =A0 =A0 at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.r= unPeriodic(ScheduledThreadPoolExecutor.java:180)
> =A0 =A0 at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.r= un(ScheduledThreadPoolExecutor.java:204)
> =A0 =A0 at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecu= tor.java:886)
> =A0 =A0 at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.= java:908)
> =A0 =A0 at java.lang.Thread.run(Thread.java:662)
> 2012-10-18 17:04:40,836 ERROR [lifecycleSupervisor-1-1] FileChannel.ja= va -
> Failed to start the file channel
> java.io.IOException: Header 80808080 not expected value: deadbeef
> =A0 =A0 at
> org.apache.flume.channel.file.TransactionEventRecord.fromDataInput(Tra= nsactionEventRecord.java:136)
> =A0 =A0 at
> org.apache.flume.channel.file.LogFile$SequentialReader.next(LogFile.ja= va:378)
> =A0 =A0 at
> org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.ja= va:88)
> =A0 =A0 at org.apache.flume.channel.file.Log.replay(Log.java:251)
> =A0 =A0 at org.apache.flume.channel.file.FileChannel.start(FileChannel= .java:228)
> =A0 =A0 at
> org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(Lif= ecycleSupervisor.java:237)
> =A0 =A0 at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441= )
> =A0 =A0 at
> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:= 317)
> =A0 =A0 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java= :150)
> =A0 =A0 at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.a= ccess$101(ScheduledThreadPoolExecutor.java:98)
> =A0 =A0 at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.r= unPeriodic(ScheduledThreadPoolExecutor.java:180)
> =A0 =A0 at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.r= un(ScheduledThreadPoolExecutor.java:204)
> =A0 =A0 at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecu= tor.java:886)
> =A0 =A0 at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.= java:908)
> =A0 =A0 at java.lang.Thread.run(Thread.java:662)
> 2012-10-18 17:04:53,740 =A0INFO [lifecycleSupervisor-1-4]
> MonitoredCounterGroup.java - Component type: CHANNEL, name: fileChanne= l4
> started
> 2012-10-18 17:04:53,740 =A0INFO [conf-file-poller-0]
> DefaultLogicalNodeManager.java - Starting Sink hdfsSink4
> 2012-10-18 17:04:53,741 =A0INFO [conf-file-poller-0]
> DefaultLogicalNodeManager.java - Starting Sink hdfsSink2
>
> .....
> .....
> 2012-10-18 17:10:36,789 ERROR
> [SinkRunner-PollingRunner-DefaultSinkProcessor] SinkRunner.java - Unab= le to
> deliver event. Exception follows.
> java.lang.IllegalStateException: Channel closed [channel=3DfileChannel= 3]
> =A0 =A0 at
> com.google.common.base.Preconditions.checkState(Preconditions.java:145= )
> =A0 =A0 at
> org.apache.flume.channel.file.FileChannel.createTransaction(FileChanne= l.java:266)
> =A0 =A0 at
> org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicCha= nnelSemantics.java:118)
> =A0 =A0 at com.test.CustomHDFSSink.process(CustomHDFSSink.java:428) > =A0 =A0 at
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcesso= r.java:68)
> =A0 =A0 at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.ja= va:147)
> =A0 =A0 at java.lang.Thread.run(Thread.java:662)
> 2012-10-18 17:10:41,790 ERROR
> [SinkRunner-PollingRunner-DefaultSinkProcessor] SinkRunner.java - Unab= le to
> deliver event. Exception follows.
> java.lang.IllegalStateException: Channel closed [channel=3DfileChannel= 3]
> =A0 =A0 at
> com.google.common.base.Preconditions.checkState(Preconditions.java:145= )
> =A0 =A0 at
> org.apache.flume.channel.file.FileChannel.createTransaction(FileChanne= l.java:266)
> =A0 =A0 at
> org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicCha= nnelSemantics.java:118)
> =A0 =A0 at com.test.CustomHDFSSink.process(CustomHDFSSink.java:428) > =A0 =A0 at
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcesso= r.java:68)
> =A0 =A0 at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.ja= va:147)
> =A0 =A0 at java.lang.Thread.run(Thread.java:662)
>
>
>
> --
> Rgds
> Ray
>
>



--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/



--
Rgds
Ray



-- Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/ --bcaec5014c710e9fb004cc6af94b--