flume-user mailing list archives

From Patrick Wendell <pwend...@gmail.com>
Subject Re: .SpoolingFileLineReader warning....
Date Mon, 19 Nov 2012 23:04:55 GMT
The spooling source gets a directory listing, then reads each file, then
renames it to X.COMPLETED. Is it possible some other process deleted that
file between when Flume listed the directory and when it tried to open the
file? Otherwise, I'm confused why the file would not be present in the
listing you give here.
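
To make the cycle above concrete, here is a minimal Python sketch of one pass of list, read, rename. It is not Flume's actual code: the function name spool_once and its return shape are hypothetical, and only the .COMPLETED suffix comes from the listing in this thread. It shows where a file can vanish or be unreadable between the listing and the open.

```python
import os

COMPLETED_SUFFIX = ".COMPLETED"  # suffix as seen in the listing in this thread


def spool_once(spool_dir):
    """One pass over spool_dir (hypothetical helper, not Flume's API).

    Returns (lines, skipped): the lines read, and the names that were in
    the listing but could not be opened or renamed afterwards.
    """
    lines, skipped = [], []
    # Step 1: snapshot the directory. Anything that happens to a file
    # after this call (delete, chmod, partial copy) is not reflected here.
    for name in sorted(os.listdir(spool_dir)):
        if name.endswith(COMPLETED_SUFFIX):
            continue  # already processed on an earlier pass
        path = os.path.join(spool_dir, name)
        try:
            # Step 2: open and read. This is the race window: the file may
            # have been removed or may not be readable yet.
            with open(path, "r") as handle:
                lines.extend(handle.read().splitlines())
            # Step 3: mark the file as done by renaming it.
            os.rename(path, path + COMPLETED_SUFFIX)
        except OSError:
            # Covers FileNotFoundError and PermissionError alike.
            skipped.append(name)
    return lines, skipped
```

Calling spool_once("/mnt/flume/clickstream") in a loop would approximate the behaviour described above; any name that ends up in skipped was present in the listing but unusable by the time it was opened.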


On Mon, Nov 19, 2012 at 6:03 PM, Patrick Wendell <pwendell@gmail.com> wrote:

> Hey Dan,
>
> You say that it seems like Flume has already processed the log... why do
> you think that?
>
> When you listed the directory contents I don't see the original or the
> COMPLETED version of the file that Flume is complaining about:
>
> /clickstream.log-2012-11-17-1353163623
>
> doesn't appear in the
>
> /mnt/flume/clickstream/
>
> directory listing anywhere.
>
>
> On Mon, Nov 19, 2012 at 2:33 PM, Dan Young <danoyoung@gmail.com> wrote:
>
>> Hello Brock,
>>
>> It seems like we get this message each time logrotate runs and is in
>> the process of copying the file to the SpoolingDirectory. Flume appears
>> to start reading the file as soon as it shows up in the
>> SpoolingDirectory. Maybe it's trying to read the file while it's still
>> being written to?
>>
>> 2012-11-19 19:27:27,924 (pool-12-thread-1) [WARN - org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:328)] Could not find file: /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239
>> java.io.FileNotFoundException: /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 (Permission denied)
>>     at java.io.FileInputStream.open(Native Method)
>>     at java.io.FileInputStream.<init>(FileInputStream.java:138)
>>     at java.io.FileReader.<init>(FileReader.java:72)
>>     at org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322)
>>     at org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172)
>>     at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>     at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>     at java.lang.Thread.run(Thread.java:722)
>>
>>
>>
>>
>> On Sat, Nov 17, 2012 at 9:15 AM, Brock Noland <brock@cloudera.com> wrote:
>>
>>> Ok, do you mind sharing your logrotate config so we can try to
>>> reproduce?
>>>
>>> --
>>> Brock Noland
>>> Sent with Sparrow <http://www.sparrowmailapp.com/?sig>
>>>
>>> On Saturday, November 17, 2012 at 10:01 AM, Dan Young wrote:
>>>
>>> Hey Brock,
>>>
>>> No I have not modified the conf while the agent was running.
>>>
>>> /mnt/flume is local. Note that this is running on an ec2 instance and
>>> the disk is the ephemeral drive, not EBS.
>>>
>>> Regards,
>>>
>>> Dano
>>> On Nov 17, 2012 8:58 AM, "Brock Noland" <brock@cloudera.com> wrote:
>>>
>>> Hi,
>>>
>>> I highly doubt it's related to
>>> (https://issues.apache.org/jira/browse/FLUME-1721) but have you
>>> modified the configuration file since starting the agent?  If so, can
>>> you restart the agent and see if the error continues?
>>>
>>> Also, is /mnt/flume local disk or NAS?
>>>
>>> Brock
>>>
>>> On Sat, Nov 17, 2012 at 9:02 AM, Dan Young <danoyoung@gmail.com> wrote:
>>> > First a bit of context, I'm using logrotate to monitor and copy (cp -p) log
>>> > files to a flume spooling directory source.  So every hour, logrotate checks
>>> > for and copies a file from the source to the flume destination. I see the
>>> > following warning message in the flume logs.
>>> >
>>> >
>>> > 17 Nov 2012 14:47:07,682 WARN  [pool-10-thread-1] (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) - Could not find file: /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623
>>> > java.io.FileNotFoundException: /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 (Permission denied)
>>> >     at java.io.FileInputStream.open(Native Method)
>>> >     at java.io.FileInputStream.<init>(FileInputStream.java:138)
>>> >     at java.io.FileReader.<init>(FileReader.java:72)
>>> >     at org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322)
>>> >     at org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172)
>>> >     at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135)
>>> >     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>> >     at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>>> >     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>>> >     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>> >     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>> >     at java.lang.Thread.run(Thread.java:722)
>>> >
>>> >
>>> > Although it appears that Flume processes the log, I'm curious why I'm
>>> > seeing this and whether I have any permissions set incorrectly.
>>> >
>>> >
>>> >
>>> > Here's the permissions:
>>> >
>>> > source log directory under /var/log:
>>> > drwxrwxr-x 2 ubuntu    ubuntu   4096 Nov 17 14:47 clickstream
>>> >
>>> > source files:
>>> > -rw-rw-r-- 1 ubuntu ubuntu   9055750 Nov 17 13:29 clickstream.log-2012-11-17-1353158953.gz
>>> > -rw-rw-r-- 1 ubuntu ubuntu  13583565 Nov 17 14:17 clickstream.log-2012-11-17-1353161821.gz
>>> > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47 clickstream.log-2012-11-17-1353163623
>>> > -rw-rw-r-- 1 ubuntu ubuntu  65648336 Nov 17 14:52 clickstream.log
>>> >
>>> > flume source directory under /mnt/flume:
>>> > drwxrwxr-x 2 ubuntu ubuntu 4096 Nov 17 14:48 clickstream
>>> >
>>> > flume source files:
>>> > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 13:29 clickstream.log-2012-11-17-1353158953.COMPLETED
>>> > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 17 14:17 clickstream.log-2012-11-17-1353161821.COMPLETED
>>> > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47 clickstream.log-2012-11-17-1353163623.COMPLETED
>>> >
>>> > Any insight would be appreciated.
>>> >
>>> > Regards,
>>> >
>>> > Dan
>>>
>>>
>>>
>>> --
>>> Apache MRUnit - Unit testing MapReduce -
>>> http://incubator.apache.org/mrunit/
>>>
>>>
>>>
>>
>
