flume-user mailing list archives

From Hari Shreedharan <hshreedha...@cloudera.com>
Subject Re: Flume startup takes ~ hour
Date Mon, 23 Sep 2013 17:50:06 GMT
How many events does the File Channel receive every 30 seconds, and how many are taken out?
This is one of the File Channel edge cases I have been working on ironing out. There is
a patch at https://issues.apache.org/jira/browse/FLUME-2155 (the FLUME-2155-initial.patch
file). If your channel takes an hour to start and you don't mind testing this patch
(it may be buggy and could cause data loss, hangs, etc., so testing in prod is not
recommended), apply it to trunk and see whether it improves the startup time.
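
To illustrate why the two counts matter, here is a rough sketch, not Flume's actual code. It assumes that each replayed take is located by scanning the queue from the front, which is what the FlumeEventQueue.remove -> FlumeEventQueue.get frames in the trace quoted below suggest. The class ReplayCostSketch and the method replayTakes are hypothetical names made up for this example; the queue is modelled as a plain list of event pointers.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplayCostSketch {

    // Hypothetical stand-in for replaying take commits against a queue of
    // event pointers. Each take is found by a linear scan from the front of
    // the queue (an assumption for this sketch, not Flume's real implementation).
    // Returns the total number of pointer comparisons performed.
    static long replayTakes(List<Long> queue, List<Long> takes) {
        long comparisons = 0;
        for (Long take : takes) {
            for (int i = 0; i < queue.size(); i++) {
                comparisons++;
                if (queue.get(i).equals(take)) {
                    queue.remove(i); // i is an int, so this removes by index
                    break;
                }
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 20_000; // arbitrary size chosen for the illustration
        List<Long> queue = new ArrayList<>();
        for (long p = 0; p < n; p++) {
            queue.add(p);
        }
        // Worst case for a front-to-back scan: each take targets the tail,
        // so every lookup walks the whole remaining queue.
        List<Long> takes = new ArrayList<>();
        for (long p = n - 1; p >= 0; p--) {
            takes.add(p);
        }
        long comparisons = replayTakes(queue, takes);
        System.out.println("takes replayed: " + n + ", comparisons: " + comparisons);
    }
}
```

Under that assumption the work grows roughly with (queue size) x (number of takes replayed), so a channel that holds many events and also sees many takes between checkpoints could spend a long time in replay.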


On Monday, September 23, 2013 at 9:16 AM, Anat Rozenzon wrote:

> Hi,
> I have a flume instance that is collecting logs from several flume agents using avro source and file channel.
> Recently, when I'm restarting the collector it takes about an hour to start listening on the avro port.
> PSB a jstack entry, any idea why the startup is slow?
> Thanks
> Anat
> "lifecycleSupervisor-1-0" prio=10 tid=0x00007f01505e4800 nid=0x4c78 runnable [0x00007f01441d6000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.flume.channel.file.FlumeEventQueue.get(FlumeEventQueue.java:225)
>         at org.apache.flume.channel.file.FlumeEventQueue.remove(FlumeEventQueue.java:195)
>         - locked <0x0000000689149c30> (a org.apache.flume.channel.file.FlumeEventQueue)
>         at org.apache.flume.channel.file.ReplayHandler.processCommit(ReplayHandler.java:405)
>         at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:328)
>         at org.apache.flume.channel.file.Log.doReplay(Log.java:503)
>         at org.apache.flume.channel.file.Log.replay(Log.java:430)
>         at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:302)
>         - locked <0x0000000689145ca8> (a org.apache.flume.channel.file.FileChannel)
>         at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>         - locked <0x0000000689145ca8> (a org.apache.flume.channel.file.FileChannel)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:724)
