flume-user mailing list archives

From Hari Shreedharan <hshreedha...@cloudera.com>
Subject Re: Exception : Log File is null for id
Date Fri, 05 Oct 2012 18:28:37 GMT
Brock,  

This looks like FLUME-1417. The logs on the JIRA show the problem being hit during startup.
I actually managed to get the "LogFile is null for id" error at runtime while testing that
issue; it shows up if you switch to a small file size and checkpoint very often.
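
For anyone trying to reproduce this, a file channel set up along those lines might look roughly like the fragment below. This is only a sketch: the agent name (agent1), the channel name (ch1), the directories, and the specific values are assumptions for illustration, not settings taken from this thread or from the JIRA. The maxFileSize (bytes) and checkpointInterval (milliseconds) properties are the file channel settings I would expect to matter here, but check the user guide for your Flume version before relying on them.

```properties
# Hypothetical agent/channel names and paths, for illustration only
agent1.channels = ch1
agent1.channels.ch1.type = file
agent1.channels.ch1.checkpointDir = /tmp/flume-test/checkpoint
agent1.channels.ch1.dataDirs = /tmp/flume-test/data

# Small data files so log-N files roll over quickly (value in bytes, assumed)
agent1.channels.ch1.maxFileSize = 1048576

# Checkpoint very often (value in milliseconds, assumed)
agent1.channels.ch1.checkpointInterval = 1000
```

With data files rolling this quickly and checkpoints this frequent, old log files should become eligible for cleanup far more often than with the defaults, which is presumably what makes the problem easier to trigger.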

Thanks,
Hari

-- 
Hari Shreedharan


On Friday, October 5, 2012 at 11:19 AM, Brock Noland wrote:

> Hi,
> 
> Just curious if you got around this or figured out what was going on?
> Makes me a little nervous about a file channel bug.
> 
> Brock
> 
> On Tue, Oct 2, 2012 at 6:28 AM, Brock Noland <brock@cloudera.com> wrote:
> > Also, if you could send us your full log that would be great. The
> > email list doesn't take attachments, so either:
> > 
> > 1) post it on pastebin
> > or
> > 2) zip it and mail it to me directly
> > 
> > Brock
> > 
> > On Tue, Oct 2, 2012 at 6:06 AM, Brock Noland <brock@cloudera.com> wrote:
> > > Hi,
> > > 
> > > What version of flume? If trunk (1.3.0-SNAPSHOT) what is the last
> > > patch you have?
> > > 
> > > Can you show us an ls -la of your data and checkpoint directories?
> > > 
> > > Brock
> > > 
> > > On Tue, Oct 2, 2012 at 3:43 AM, Raymond Ng <raymondair@gmail.com> wrote:
> > > > Just to add more info to this: I've checked the file channel that the
> > > > "ChannelException: Cannot acquire capacity" is reported against, and can
> > > > see that file log-1 has a size of 0 while log-2 has over 300 MB of data.
> > > > By comparison, another file channel has files log-2 and log-3, both with
> > > > data in them, but no log-1.
> > > > 
> > > > It sounds like log-1 is the one causing the "NullPointerException: LogFile
> > > > is null for id 1" below. When I restarted Flume, I got the following
> > > > warnings. I can confirm there was no manual tampering in the file channel
> > > > directory.
> > > > 
> > > > 2012-10-02 09:38:10,231 INFO [conf-file-poller-0]
> > > > DefaultLogicalNodeManager.java - Starting Channel probeFileChannel1
> > > > 2012-10-02 09:38:10,239 INFO [conf-file-poller-0]
> > > > DefaultLogicalNodeManager.java - Starting Channel probeFileChannel3
> > > > 2012-10-02 09:38:10,313 WARN [lifecycleSupervisor-1-2] ReplayHandler.java -
> > > > Hit EOF on /home/user/flume-ng/filechannel3/data/log-1
> > > > 2012-10-02 09:38:10,314 INFO [lifecycleSupervisor-1-1]
> > > > DirectMemoryUtils.java - Unable to get maxDirectMemory from VM:
> > > > NoSuchMethodException: sun.misc.VM.maxDirectMemory(null)
> > > > 2012-10-02 09:38:10,317 INFO [lifecycleSupervisor-1-1]
> > > > DirectMemoryUtils.java - Direct Memory Allocation: Allocation = 1048576,
> > > > Allocated = 0, MaxDirectMemorySize = 954466304, Remaining = 954466304
> > > > 2012-10-02 09:38:10,381 WARN [lifecycleSupervisor-1-1] LogFile.java -
> > > > Checkpoint for file(/home/user/flume-ng/filechannel1/data/log-2) is:
> > > > 1349166469095, which is beyond the requested checkpoint time: 0.
> > > > 2012-10-02 09:38:10,381 WARN [lifecycleSupervisor-1-2] LogFile.java -
> > > > Checkpoint for file(/home/user/flume-ng/filechannel3/data/log-2) is:
> > > > 1349166991594, which is beyond the requested checkpoint time: 0.
> > > > 2012-10-02 09:41:52,144 ERROR [lifecycleSupervisor-1-2] ReplayHandler.java -
> > > > Pending takes 32103 exist after the end of replay. Duplicate messages will
> > > > exist in destination.
> > > > 2012-10-02 09:41:52,709 INFO [lifecycleSupervisor-1-2]
> > > > MonitoredCounterGroup.java - Component type: CHANNEL, name:
> > > > probeFileChannel3 started
> > > > 2012-10-02 09:42:31,413 WARN [lifecycleSupervisor-1-1] LogFile.java -
> > > > Checkpoint for file(/home/cluster_admin/flume-ng/filechannel1/data/log-3)
> > > > is: 1349166981020, which is beyond the requested checkpoint time: 0.
> > > > 2012-10-02 09:45:14,836 ERROR [lifecycleSupervisor-1-1] ReplayHandler.java -
> > > > Pending takes 8409 exist after the end of replay. Duplicate messages will
> > > > exist in destination.
> > > > 2012-10-02 09:45:15,453 INFO [lifecycleSupervisor-1-1]
> > > > MonitoredCounterGroup.java - Component type: CHANNEL, name:
> > > > probeFileChannel1 started
> > > > 
> > > > 
> > > > On Tue, Oct 2, 2012 at 9:19 AM, Raymond Ng <raymondair@gmail.com> wrote:
> > > > > 
> > > > > Hi
> > > > > 
> > > > > Could I have some advice on the following exception, please? Is this
> > > > > related to the "ChannelException: Cannot acquire capacity" which I
> > > > > experience from time to time?
> > > > > 
> > > > > 
> > > > > 2012-10-02 09:16:53,563 ERROR [Log-BackgroundWorker] Log.java - General
> > > > > error in checkpoint worker
> > > > > java.lang.NullPointerException
> > > > > at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:738)
> > > > > at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:692)
> > > > > at org.apache.flume.channel.file.Log.access$300(Log.java:57)
> > > > > at
> > > > > org.apache.flume.channel.file.Log$BackgroundWorker.run(Log.java:892)
> > > > > 2012-10-02 09:16:56,317 ERROR
> > > > > [SinkRunner-PollingRunner-DefaultSinkProcessor] HDFSEventSink.java -
> > > > > process failed
> > > > > java.lang.NullPointerException: LogFile is null for id 1
> > > > > at
> > > > > com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
> > > > > at org.apache.flume.channel.file.Log.get(Log.java:316)
> > > > > at
> > > > > org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:373)
> > > > > at
> > > > > org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
> > > > > at
> > > > > org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:91)
> > > > > at
> > > > > org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:383)
> > > > > at
> > > > > org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
> > > > > at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
> > > > > at java.lang.Thread.run(Thread.java:662)
> > > > > 2012-10-02 09:16:56,318 ERROR
> > > > > [SinkRunner-PollingRunner-DefaultSinkProcessor] SinkRunner.java -
> > > > > Unable to deliver event. Exception follows.
> > > > > org.apache.flume.EventDeliveryException: java.lang.NullPointerException:
> > > > > LogFile is null for id 1
> > > > > at
> > > > > org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:450)
> > > > > at
> > > > > org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
> > > > > at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
> > > > > at java.lang.Thread.run(Thread.java:662)
> > > > > Caused by: java.lang.NullPointerException: LogFile is null for id 1
> > > > > at
> > > > > com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
> > > > > at org.apache.flume.channel.file.Log.get(Log.java:316)
> > > > > at
> > > > > org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:373)
> > > > > at
> > > > > org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
> > > > > at
> > > > > org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:91)
> > > > > at
> > > > > org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:383)
> > > > > ... 3 more
> > > > > 2012-10-02 09:16:56,625 ERROR [Log-BackgroundWorker] Log.java - General
> > > > > error in checkpoint worker
> > > > > java.lang.NullPointerException
> > > > > at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:738)
> > > > > at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:692)
> > > > > at org.apache.flume.channel.file.Log.access$300(Log.java:57)
> > > > > at
> > > > > org.apache.flume.channel.file.Log$BackgroundWorker.run(Log.java:892)
> > > > > 2012-10-02 09:16:59,678 ERROR [Log-BackgroundWorker] Log.java - General
> > > > > error in checkpoint worker
> > > > > java.lang.NullPointerException
> > > > > at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:738)
> > > > > at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:692)
> > > > > at org.apache.flume.channel.file.Log.access$300(Log.java:57)
> > > > > at
> > > > > org.apache.flume.channel.file.Log$BackgroundWorker.run(Log.java:892)
> > > > > 2012-10-02 09:17:01,318 ERROR
> > > > > [SinkRunner-PollingRunner-DefaultSinkProcessor] HDFSEventSink.java -
> > > > > process failed
> > > > > java.lang.NullPointerException: LogFile is null for id 1
> > > > > at
> > > > > com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
> > > > > at org.apache.flume.channel.file.Log.get(Log.java:316)
> > > > > at
> > > > > org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:373)
> > > > > at
> > > > > org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
> > > > > at
> > > > > org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:91)
> > > > > at
> > > > > org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:383)
> > > > > at
> > > > > org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
> > > > > at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
> > > > > at java.lang.Thread.run(Thread.java:662)
> > > > > 2012-10-02 09:17:01,318 ERROR
> > > > > [SinkRunner-PollingRunner-DefaultSinkProcessor] SinkRunner.java -
> > > > > Unable to deliver event. Exception follows.
> > > > > org.apache.flume.EventDeliveryException: java.lang.NullPointerException:
> > > > > LogFile is null for id 1
> > > > > at
> > > > > org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:450)
> > > > > at
> > > > > org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
> > > > > at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
> > > > > at java.lang.Thread.run(Thread.java:662)
> > > > > Caused by: java.lang.NullPointerException: LogFile is null for id 1
> > > > > at
> > > > > com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
> > > > > at org.apache.flume.channel.file.Log.get(Log.java:316)
> > > > > at
> > > > > org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:373)
> > > > > at
> > > > > org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
> > > > > at
> > > > > org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:91)
> > > > > at
> > > > > org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:383)
> > > > > ... 3 more
> > > > > 
> > > > > 
> > > > > 
> > > > > --
> > > > > Rgds
> > > > > Ray
> > > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > --
> > > > Rgds
> > > > Ray
> > > > 
> > > 
> > > 
> > > 
> > > 
> > > --
> > > Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
> > > 


