Return-Path: X-Original-To: apmail-flume-user-archive@www.apache.org Delivered-To: apmail-flume-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 2A15EE43C for ; Tue, 20 Nov 2012 20:03:54 +0000 (UTC) Received: (qmail 66504 invoked by uid 500); 20 Nov 2012 20:03:53 -0000 Delivered-To: apmail-flume-user-archive@flume.apache.org Received: (qmail 66467 invoked by uid 500); 20 Nov 2012 20:03:53 -0000 Mailing-List: contact user-help@flume.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@flume.apache.org Delivered-To: mailing list user@flume.apache.org Received: (qmail 66458 invoked by uid 99); 20 Nov 2012 20:03:53 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 20 Nov 2012 20:03:53 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of danoyoung@gmail.com designates 209.85.210.179 as permitted sender) Received: from [209.85.210.179] (HELO mail-ia0-f179.google.com) (209.85.210.179) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 20 Nov 2012 20:03:46 +0000 Received: by mail-ia0-f179.google.com with SMTP id o25so4153430iad.38 for ; Tue, 20 Nov 2012 12:03:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=t3CVhd0AD2uEpFiYNHTpq0uLeRZ3ieFKtR66BjoG1xI=; b=SYe7TlWvwEdqbbKqgQ6+750IaCPMKjaIfRbNCPOutvW8bim8vlbV/FMeJLhDGYXV2z eDzU1cldUQmUFVzzme+5iCtvGqynmUjLqF/OMU/qPaRlAMpwNZ6Sz2uSZfIf1ORXiKxO of4rpJu9sJF2SeoF7uQ1ejCD3u8FxirLjWd6c4hdkvc7t0uz267WVNy8qli14UFvVsUX 6XQCO+ttC/5nwGzzfNE2cc/nRDJRcWr9m6kUdjNwLjGlTZYqkb/W1c0EK4OW0Vm2WThB Z1WdzP7/kLBWxgvZ/TUeTA8xlxPn7QgX1hFJfAtdm0adZEep+IBKjIxaOwadI25uFMG/ 74Yg== MIME-Version: 1.0 Received: by 10.50.7.234 with SMTP id m10mr11280445iga.43.1353441804646; Tue, 20 Nov 2012 12:03:24 -0800 (PST) Received: by 10.231.71.77 with HTTP; Tue, 20 Nov 2012 12:03:24 -0800 (PST) In-Reply-To: References: <06D4A63D32974ED88E73BF209A246D8D@cloudera.com> Date: Tue, 20 Nov 2012 13:03:24 -0700 Message-ID: Subject: Re: .SpoolingFileLineReader warning.... From: Dan Young To: user@flume.apache.org Content-Type: multipart/alternative; boundary=f46d04462dd4fd78fe04cef2b935 X-Virus-Checked: Checked by ClamAV on apache.org --f46d04462dd4fd78fe04cef2b935 Content-Type: text/plain; charset=ISO-8859-1 Just a quick followup on this....I moved all the logging, etc to the same drive/device and I'm no longer seeing those permission errors!. I'm going to keep testing this week. Thank you Brock. Regards, Dano On Tue, Nov 20, 2012 at 10:14 AM, Brock Noland wrote: > OK yeah, so basically it's doing a copy and then a delete. See "man 2 > rename" then ERRORS -> EXDEV but basically atomically renaming a file > only works on the same mount point. > > On Tue, Nov 20, 2012 at 11:10 AM, Dan Young wrote: > > This is running on an EC2 instance, and these are the ephemeral drive(s). > > > > Filesystem 1K-blocks Used Available Use% Mounted on > > /dev/xvda1 8256952 1618248 6219276 21% / > > udev 1912272 8 1912264 1% /dev > > tmpfs 768100 180 767920 1% /run > > none 5120 0 5120 0% /run/lock > > none 1920244 0 1920244 0% /run/shm > > /dev/xvdb 412814512 1127172 390717564 1% /mnt > > > > ubuntu@ip-foo:~$ cat /etc/fstab > > LABEL=cloudimg-rootfs / ext4 defaults 0 0 > > /dev/xvdb /mnt auto defaults,nobootwait,comment=cloudconfig 0 2 > > ubuntu@ip-foo:~$ > > > > ubuntu@ip-foo:~$ mount > > /dev/xvda1 on / type ext4 (rw) > > proc on /proc type proc (rw,noexec,nosuid,nodev) > > sysfs on /sys type sysfs (rw,noexec,nosuid,nodev) > > none on /sys/fs/fuse/connections type fusectl (rw) > > none on /sys/kernel/debug type debugfs (rw) > > none on /sys/kernel/security type securityfs (rw) > > udev on /dev type devtmpfs (rw,mode=0755) > > devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620) > > tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755) > > none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880) > > none on /run/shm type tmpfs (rw,nosuid,nodev) > > /dev/xvdb on /mnt type ext3 (rw) > > > > I do see that / is ext4 and /mnt is ext3.....hmm, not sure it matters. > > > > Dano > > > > > > > > On Tue, Nov 20, 2012 at 10:01 AM, Brock Noland > wrote: > >> > >> Are /var/log and /mnt/flume on the same file system? mv across file > >> systems is a cp and then delete. > >> > >> On Tue, Nov 20, 2012 at 10:59 AM, Dan Young > wrote: > >> > Coolio, thank you Brock. > >> > > >> > Did a quick test with a mv vs cp.....this is one test, repeated a few > >> > times....the smaller file (~60M) seemed to work fine w/ a mv, but a > >> > larger > >> > file we're seeing the same behavior.... > >> > > >> > > >> > Start with no logs in the SpoolingDirectory: > >> > > >> > ls -lrt /mnt/flume/clickstream > >> > > >> > /mnt/flume/clickstream: > >> > total 0 > >> > > >> > > >> > > >> > Review the logs that will be rotated via logrotate.d into the > respective > >> > directory; /mnt/flume/clickstream > >> > > >> > ls -lrt /var/log/clickstream > >> > /var/log/clickstream: > >> > total 64112 > >> > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 20 16:05 clickstream.log > >> > > >> > > >> > > >> > Review logrotate config in /etc/logrotate.d. Note here, I changed from > >> > cp -p > >> > to a mv..... > >> > > >> > /var/log/clickstream/clickstream.log > >> > { > >> > missingok > >> > rotate 3 > >> > compress > >> > delaycompress > >> > copytruncate > >> > notifempty > >> > size 50M > >> > dateext > >> > dateformat -%Y-%m-%d-%s > >> > create 666 ubuntu ubuntu > >> > postrotate > >> > mv $1 /mnt/flume/clickstream/ 2>&1 > >> > endscript > >> > } > >> > > >> > > >> > I run logrotate.d/clickstream.POST with the ~60MB file, and everything > >> > looked fine....now I try a ~190MB file.... > >> > > >> > > >> > > >> > ls -lrt /var/log/clickstream > >> > /var/log/clickstream: > >> > total 192336 > >> > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 20 16:42 clickstream.log > >> > > >> > > >> > Run logrotate.d/clickstream.POST, and we see the WARNING in the FLume > >> > log. > >> > > >> > .... > >> > .... > >> > 20 Nov 2012 16:45:07,117 WARN [pool-13-thread-1] > >> > (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) > - > >> > Could not find file: > >> > /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906 > >> > java.io.FileNotFoundException: > >> > /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906 > (Permission > >> > denied) > >> > at java.io.FileInputStream.open(Native Method) > >> > at java.io.FileInputStream.(FileInputStream.java:138) > >> > at java.io.FileReader.(FileReader.java:72) > >> > at > >> > > >> > > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) > >> > at > >> > > >> > > org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) > >> > at > >> > > >> > > org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) > >> > at > >> > > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > >> > at > >> > > >> > > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) > >> > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) > >> > at > >> > > >> > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > >> > at > >> > > >> > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > >> > at > >> > > >> > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > >> > at > >> > > >> > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > >> > at java.lang.Thread.run(Thread.java:722) > >> > 20 Nov 2012 16:45:13,174 INFO [hdfs-c1s3-call-runner-5] > >> > (org.apache.flume.sink.hdfs.BucketWriter.doOpen:205) - Creating > >> > > >> > s3n://X:Y@my-bucket > /clicks/2012/11/clicks-2012-11-20-16-40-10.145.184.200-.1353429913104.gz.tmp > >> > .... > >> > .... > >> > > >> > But it seems that FLume did process the log..... > >> > > >> > ls -lrt /mnt/flume/clickstream/ > >> > total 256704 > >> > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 20 16:25 > >> > clickstream.log-2012-11-20-1353428715.COMPLETED > >> > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 20 16:45 > >> > clickstream.log-2012-11-20-1353429906.COMPLETED > >> > > >> > > >> > Regards, > >> > > >> > Dano > >> > > >> > > >> > > >> > > >> > > >> > On Tue, Nov 20, 2012 at 9:21 AM, Brock Noland > >> > wrote: > >> >> > >> >> Yeah I think that makes sense, I have created a JIRA for this > >> >> > >> >> https://issues.apache.org/jira/browse/FLUME-1733 > >> >> > >> >> Brock > >> >> > >> >> On Tue, Nov 20, 2012 at 9:02 AM, Dan Young > wrote: > >> >> > Hey Brock, > >> >> > > >> >> > I can do some more testing on my side with smaller files as well as > >> >> > doing a > >> >> > mv vs a cp . I do believe that a slight delay would be helpful > since > >> >> > people > >> >> > will be moving/copying large files around. > >> >> > > >> >> > Regards , > >> >> > > >> >> > Dano > >> >> > > >> >> > On Nov 20, 2012 5:26 AM, "Brock Noland" > wrote: > >> >> >> > >> >> >> Thinking about this more, I think it's probably going to be quite > >> >> >> common for people to cp large files into the spooling directory. > >> >> >> Patrick, what do you think about waiting until the mtime is say 1 > >> >> >> second old? > >> >> >> > >> >> >> Brock > >> >> >> > >> >> >> On Mon, Nov 19, 2012 at 5:29 PM, Brock Noland > > >> >> >> wrote: > >> >> >> > My guess is that the file does not have the correct permissions > >> >> >> > while > >> >> >> > being copied. > >> >> >> > > >> >> >> > [noland@localhost cp-test]$ cp -p test-0 test-1 & sleep 0.1; ls > >> >> >> > -al > >> >> >> > test* > >> >> >> > [1] 18780 > >> >> >> > -rw-rw-r-- 1 noland noland 1048576000 Nov 19 17:25 test-0 > >> >> >> > -rw------- 1 noland noland 52334592 Nov 19 17:27 test-1 > >> >> >> > > >> >> >> > > >> >> >> > For large files, it probably makes sense to copy the file in as > >> >> >> > .file > >> >> >> > and then rename it to file. > >> >> >> > > >> >> >> > Brock > >> >> >> > > >> >> >> > On Mon, Nov 19, 2012 at 5:04 PM, Patrick Wendell > >> >> >> > > >> >> >> > wrote: > >> >> >> >> The spooling source gets a directory listing, then reads each > >> >> >> >> file, > >> >> >> >> then > >> >> >> >> renames it to X.COMPLETED. Is it possible some other process > >> >> >> >> deleted > >> >> >> >> that > >> >> >> >> file between when Flume listed the directory and when it tried > to > >> >> >> >> open > >> >> >> >> the > >> >> >> >> file? Otherwise, I'm confused why the file would not be present > >> >> >> >> in > >> >> >> >> the > >> >> >> >> listing you give here. > >> >> >> >> > >> >> >> >> > >> >> >> >> On Mon, Nov 19, 2012 at 6:03 PM, Patrick Wendell > >> >> >> >> > >> >> >> >> wrote: > >> >> >> >>> > >> >> >> >>> Hey Dan, > >> >> >> >>> > >> >> >> >>> You say that it seems like Flume has already processed the > >> >> >> >>> log... > >> >> >> >>> why > >> >> >> >>> do > >> >> >> >>> you think that? > >> >> >> >>> > >> >> >> >>> When you listed the directory contents I don't see the > original > >> >> >> >>> or > >> >> >> >>> the > >> >> >> >>> COMPLETED version of the file that Flume is complaining about: > >> >> >> >>> > >> >> >> >>> /clickstream.log-2012-11-17-1353163623 > >> >> >> >>> > >> >> >> >>> doesn't appear in the > >> >> >> >>> > >> >> >> >>> /mnt/flume/clickstream/ > >> >> >> >>> > >> >> >> >>> directory listing anywhere. > >> >> >> >>> > >> >> >> >>> > >> >> >> >>> On Mon, Nov 19, 2012 at 2:33 PM, Dan Young < > danoyoung@gmail.com> > >> >> >> >>> wrote: > >> >> >> >>>> > >> >> >> >>>> Hello Brock, > >> >> >> >>>> > >> >> >> >>>> It seems like we get this message each time that logrotate > runs > >> >> >> >>>> and > >> >> >> >>>> is in > >> >> >> >>>> the process of copying the file to the SpoolingDirectory. It > >> >> >> >>>> seems > >> >> >> >>>> that > >> >> >> >>>> Flume starts reading the file as soon as it shows up in the > >> >> >> >>>> SpoolingDirectory..... Maybe it's trying to read the file > >> >> >> >>>> while > >> >> >> >>>> it's > >> >> >> >>>> still > >> >> >> >>>> being written to???? > >> >> >> >>>> > >> >> >> >>>> 2012-11-19 19:27:27,924 (pool-12-thread-1) [WARN - > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:328)] > >> >> >> >>>> Could not find file: > >> >> >> >>>> > /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 > >> >> >> >>>> java.io.FileNotFoundException: > >> >> >> >>>> > /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 > >> >> >> >>>> (Permission > >> >> >> >>>> denied) > >> >> >> >>>> at java.io.FileInputStream.open(Native Method) > >> >> >> >>>> at java.io.FileInputStream.(FileInputStream.java:138) > >> >> >> >>>> at java.io.FileReader.(FileReader.java:72) > >> >> >> >>>> at > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) > >> >> >> >>>> at > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) > >> >> >> >>>> at > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) > >> >> >> >>>> at > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > >> >> >> >>>> at > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) > >> >> >> >>>> at > >> >> >> >>>> > >> >> >> >>>> > java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) > >> >> >> >>>> at > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > >> >> >> >>>> at > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > >> >> >> >>>> at > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > >> >> >> >>>> at > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > >> >> >> >>>> at java.lang.Thread.run(Thread.java:722) > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> > >> >> >> >>>> On Sat, Nov 17, 2012 at 9:15 AM, Brock Noland > >> >> >> >>>> > >> >> >> >>>> wrote: > >> >> >> >>>>> > >> >> >> >>>>> Ok, do you mind sharing your log rotate config to see if we > >> >> >> >>>>> can > >> >> >> >>>>> reproduce? > >> >> >> >>>>> > >> >> >> >>>>> -- > >> >> >> >>>>> Brock Noland > >> >> >> >>>>> Sent with Sparrow > >> >> >> >>>>> > >> >> >> >>>>> On Saturday, November 17, 2012 at 10:01 AM, Dan Young wrote: > >> >> >> >>>>> > >> >> >> >>>>> Hey Brock, > >> >> >> >>>>> > >> >> >> >>>>> No I have not modified the conf while the agent was running. > >> >> >> >>>>> > >> >> >> >>>>> /mnt/flume is local. Note that this is running on an ec2 > >> >> >> >>>>> instance > >> >> >> >>>>> and > >> >> >> >>>>> the disk is the ephemeral drive, not EBS. > >> >> >> >>>>> > >> >> >> >>>>> Regards , > >> >> >> >>>>> > >> >> >> >>>>> Dano > >> >> >> >>>>> > >> >> >> >>>>> On Nov 17, 2012 8:58 AM, "Brock Noland" > > >> >> >> >>>>> wrote: > >> >> >> >>>>> > >> >> >> >>>>> Hi, > >> >> >> >>>>> > >> >> >> >>>>> I highly doubt it's related to > >> >> >> >>>>> (https://issues.apache.org/jira/browse/FLUME-1721) but have > >> >> >> >>>>> you > >> >> >> >>>>> modified the configuration file since starting the agent? > If > >> >> >> >>>>> so, > >> >> >> >>>>> can > >> >> >> >>>>> you restart the agent and see if the error continues? > >> >> >> >>>>> > >> >> >> >>>>> Also, is /mnt/flume local disk or NAS? > >> >> >> >>>>> > >> >> >> >>>>> Brock > >> >> >> >>>>> > >> >> >> >>>>> On Sat, Nov 17, 2012 at 9:02 AM, Dan Young > >> >> >> >>>>> > >> >> >> >>>>> wrote: > >> >> >> >>>>> > First a bit of context, I'm using logrotate to monitor and > >> >> >> >>>>> > copy > >> >> >> >>>>> > (cp > >> >> >> >>>>> > -p) log > >> >> >> >>>>> > files to a flume spooling directory source. So every > hour, > >> >> >> >>>>> > logrotate > >> >> >> >>>>> > checks > >> >> >> >>>>> > for and copies a file from the source to the flume > >> >> >> >>>>> > destination. > >> >> >> >>>>> > I > >> >> >> >>>>> > see > >> >> >> >>>>> > the > >> >> >> >>>>> > following warning message in the flume logs. > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > 17 Nov 2012 14:47:07,682 WARN [pool-10-thread-1] > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) > >> >> >> >>>>> > - > >> >> >> >>>>> > Could not find file: > >> >> >> >>>>> > > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 > >> >> >> >>>>> > java.io.FileNotFoundException: > >> >> >> >>>>> > > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 > >> >> >> >>>>> > (Permission > >> >> >> >>>>> > denied) > >> >> >> >>>>> > at java.io.FileInputStream.open(Native Method) > >> >> >> >>>>> > at > java.io.FileInputStream.(FileInputStream.java:138) > >> >> >> >>>>> > at java.io.FileReader.(FileReader.java:72) > >> >> >> >>>>> > at > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) > >> >> >> >>>>> > at > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) > >> >> >> >>>>> > at > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) > >> >> >> >>>>> > at > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > >> >> >> >>>>> > at > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) > >> >> >> >>>>> > at > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) > >> >> >> >>>>> > at > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > >> >> >> >>>>> > at > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > >> >> >> >>>>> > at > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > >> >> >> >>>>> > at > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > >> >> >> >>>>> > at java.lang.Thread.run(Thread.java:722) > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > Although it appears that Flume processes the log, I'm > >> >> >> >>>>> > curious > >> >> >> >>>>> > why > >> >> >> >>>>> > I''m > >> >> >> >>>>> > seeing this and if I have anything with permissions > >> >> >> >>>>> > incorrect? > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > > >> >> >> >>>>> > Here's the permissions: > >> >> >> >>>>> > > >> >> >> >>>>> > source log directory under /var/log: > >> >> >> >>>>> > drwxrwxr-x 2 ubuntu ubuntu 4096 Nov 17 14:47 > >> >> >> >>>>> > clickstream > >> >> >> >>>>> > > >> >> >> >>>>> > source files: > >> >> >> >>>>> > -rw-rw-r-- 1 ubuntu ubuntu 9055750 Nov 17 13:29 > >> >> >> >>>>> > clickstream.log-2012-11-17-1353158953.gz > >> >> >> >>>>> > -rw-rw-r-- 1 ubuntu ubuntu 13583565 Nov 17 14:17 > >> >> >> >>>>> > clickstream.log-2012-11-17-1353161821.gz > >> >> >> >>>>> > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47 > >> >> >> >>>>> > clickstream.log-2012-11-17-1353163623 > >> >> >> >>>>> > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 17 14:52 > >> >> >> >>>>> > clickstream.log > >> >> >> >>>>> > > >> >> >> >>>>> > flume source directory under /mnt/flume: > >> >> >> >>>>> > drwxrwxr-x 2 ubuntu ubuntu 4096 Nov 17 14:48 clickstream > >> >> >> >>>>> > > >> >> >> >>>>> > flume source files: > >> >> >> >>>>> > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 13:29 > >> >> >> >>>>> > clickstream.log-2012-11-17-1353158953.COMPLETED > >> >> >> >>>>> > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 17 14:17 > >> >> >> >>>>> > clickstream.log-2012-11-17-1353161821.COMPLETED > >> >> >> >>>>> > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47 > >> >> >> >>>>> > clickstream.log-2012-11-17-1353163623.COMPLETED > >> >> >> >>>>> > > >> >> >> >>>>> > Any insight would be appreciated. > >> >> >> >>>>> > > >> >> >> >>>>> > Regards, > >> >> >> >>>>> > > >> >> >> >>>>> > Dan > >> >> >> >>>>> > >> >> >> >>>>> > >> >> >> >>>>> > >> >> >> >>>>> -- > >> >> >> >>>>> Apache MRUnit - Unit testing MapReduce - > >> >> >> >>>>> http://incubator.apache.org/mrunit/ > >> >> >> >>>>> > >> >> >> >>>>> > >> >> >> >>>> > >> >> >> >>> > >> >> >> >> > >> >> >> > > >> >> >> > > >> >> >> > > >> >> >> > -- > >> >> >> > Apache MRUnit - Unit testing MapReduce - > >> >> >> > http://incubator.apache.org/mrunit/ > >> >> >> > >> >> >> > >> >> >> > >> >> >> -- > >> >> >> Apache MRUnit - Unit testing MapReduce - > >> >> >> http://incubator.apache.org/mrunit/ > >> >> > >> >> > >> >> > >> >> -- > >> >> Apache MRUnit - Unit testing MapReduce - > >> >> http://incubator.apache.org/mrunit/ > >> > > >> > > >> > >> > >> > >> -- > >> Apache MRUnit - Unit testing MapReduce - > >> http://incubator.apache.org/mrunit/ > > > > > > > > -- > Apache MRUnit - Unit testing MapReduce - > http://incubator.apache.org/mrunit/ > --f46d04462dd4fd78fe04cef2b935 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Just a quick followup on this....I moved all the logging, etc to the same d= rive/device and I'm no longer seeing those permission errors!. =A0 I= 9;m going to keep testing this week.

Thank you Brock.

Regards,

Dano


On Tue, Nov 20, 2012 at 10:14 AM, B= rock Noland <brock@cloudera.com> wrote:
OK yeah, so basically it's doing a copy = and then a delete. See "man 2
rename" then ERRORS -> EXDEV but basically atomically renaming a fi= le
only works on the same mount point.

On Tue, Nov 20, 2012 at 11:10 AM, Dan Young <danoyoung@gmail.com> wrote:
> This is running on an EC2 instance, and these are the ephemeral drive(= s).
>
> Filesystem =A0 =A0 1K-blocks =A0 =A0Used Available Use% Mounted on
> /dev/xvda1 =A0 =A0 =A0 8256952 1618248 =A0 6219276 =A021% /
> udev =A0 =A0 =A0 =A0 =A0 =A0 1912272 =A0 =A0 =A0 8 =A0 1912264 =A0 1% = /dev
> tmpfs =A0 =A0 =A0 =A0 =A0 =A0 768100 =A0 =A0 180 =A0 =A0767920 =A0 1% = /run
> none =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A05120 =A0 =A0 =A0 0 =A0 =A0 =A05120= =A0 0% /run/lock
> none =A0 =A0 =A0 =A0 =A0 =A0 1920244 =A0 =A0 =A0 0 =A0 1920244 =A0 0% = /run/shm
> /dev/xvdb =A0 =A0 =A0412814512 1127172 390717564 =A0 1% /mnt
>
> ubuntu@ip-foo:~$ cat /etc/fstab
> LABEL=3Dcloudimg-rootfs / ext4 defaults 0 0
> /dev/xvdb /mnt auto defaults,nobootwait,comment=3Dcloudconfig 0 2
> ubuntu@ip-foo:~$
>
> ubuntu@ip-foo:~$ mount
> /dev/xvda1 on / type ext4 (rw)
> proc on /proc type proc (rw,noexec,nosuid,nodev)
> sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
> none on /sys/fs/fuse/connections type fusectl (rw)
> none on /sys/kernel/debug type debugfs (rw)
> none on /sys/kernel/security type securityfs (rw)
> udev on /dev type devtmpfs (rw,mode=3D0755)
> devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=3D5,mode=3D0620)<= br> > tmpfs on /run type tmpfs (rw,noexec,nosuid,size=3D10%,mode=3D0755)
> none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=3D5242880) > none on /run/shm type tmpfs (rw,nosuid,nodev)
> /dev/xvdb on /mnt type ext3 (rw)
>
> I do see that / is ext4 and /mnt is ext3.....hmm, not sure it matters.=
>
> Dano
>
>
>
> On Tue, Nov 20, 2012 at 10:01 AM, Brock Noland <brock@cloudera.com> wrote:
>>
>> Are /var/log and /mnt/flume on the same file system? mv across fil= e
>> systems is a cp and then delete.
>>
>> On Tue, Nov 20, 2012 at 10:59 AM, Dan Young <danoyoung@gmail.com> wrote:
>> > Coolio, thank you Brock.
>> >
>> > Did a quick test with a mv vs cp.....this is one test, repeat= ed a few
>> > times....the smaller file (~60M) seemed to work fine w/ a mv,= but a
>> > larger
>> > file we're seeing the same behavior....
>> >
>> >
>> > Start with no logs in the SpoolingDirectory:
>> >
>> > ls -lrt /mnt/flume/clickstream
>> >
>> > /mnt/flume/clickstream:
>> > total 0
>> >
>> >
>> >
>> > Review the logs that will be rotated via logrotate.d into the= respective
>> > directory; /mnt/flume/clickstream
>> >
>> > ls -lrt /var/log/clickstream
>> > /var/log/clickstream:
>> > total 64112
>> > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 20 16:05 clickstream.= log
>> >
>> >
>> >
>> > Review logrotate config in /etc/logrotate.d. Note here, I cha= nged from
>> > cp -p
>> > to a mv.....
>> >
>> > /var/log/clickstream/clickstream.log
>> > {
>> > =A0 missingok
>> > =A0 rotate 3
>> > =A0 compress
>> > =A0 delaycompress
>> > =A0 copytruncate
>> > =A0 notifempty
>> > =A0 size 50M
>> > =A0 dateext
>> > =A0 dateformat -%Y-%m-%d-%s
>> > =A0 create 666 ubuntu ubuntu
>> > =A0 postrotate
>> > =A0 mv $1 /mnt/flume/clickstream/ 2>&1
>> > =A0 endscript
>> > }
>> >
>> >
>> > I run logrotate.d/clickstream.POST with the ~60MB file, and e= verything
>> > looked fine....now I try a ~190MB file....
>> >
>> >
>> >
>> > ls -lrt /var/log/clickstream
>> > /var/log/clickstream:
>> > total 192336
>> > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 20 16:42 clickstream= .log
>> >
>> >
>> > Run logrotate.d/clickstream.POST, and we see the WARNING in t= he FLume
>> > log.
>> >
>> > ....
>> > ....
>> > 20 Nov 2012 16:45:07,117 WARN =A0[pool-13-thread-1]
>> > (org.apache.flume.client.avro.SpoolingFileLineReader.getNextF= ile:328) =A0-
>> > Could not find file:
>> > /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906<= br> >> > java.io.FileNotFoundException:
>> > /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906 = (Permission
>> > denied)
>> > at java.io.FileInputStream.open(Native Method)
>> > at java.io.FileInputStream.<init>(FileInputStream.java:= 138)
>> > at java.io.FileReader.<init>(FileReader.java:72)
>> > at
>> >
>> > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFi= le(SpoolingFileLineReader.java:322)
>> > at
>> >
>> > org.apache.flume.client.avro.SpoolingFileLineReader.readLines= (SpoolingFileLineReader.java:172)
>> > at
>> >
>> > org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRu= nnable.run(SpoolDirectorySource.java:135)
>> > at
>> > java.util.concurrent.Executors$RunnableAdapter.call(Executors= .java:471)
>> > at
>> >
>> > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureT= ask.java:351)
>> > at java.util.concurrent.FutureTask.runAndReset(FutureTask.jav= a:178)
>> > at
>> >
>> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFut= ureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>> > at
>> >
>> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFut= ureTask.run(ScheduledThreadPoolExecutor.java:293)
>> > at
>> >
>> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolE= xecutor.java:1110)
>> > at
>> >
>> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPool= Executor.java:603)
>> > at java.lang.Thread.run(Thread.java:722)
>> > 20 Nov 2012 16:45:13,174 INFO =A0[hdfs-c1s3-call-runner-5] >> > (org.apache.flume.sink.hdfs.BucketWriter.doOpen:205) =A0- Cre= ating
>> >
>> > s3n://X:Y@my-bucket/clicks/2012/11/clicks-2012-11-20-16-40-10= .145.184.200-.1353429913104.gz.tmp
>> > ....
>> > ....
>> >
>> > But it seems that FLume did process the log.....
>> >
>> > ls -lrt /mnt/flume/clickstream/
>> > total 256704
>> > -rw-rw-r-- 1 ubuntu ubuntu =A065648336 Nov 20 16:25
>> > clickstream.log-2012-11-20-1353428715.COMPLETED
>> > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 20 16:45
>> > clickstream.log-2012-11-20-1353429906.COMPLETED
>> >
>> >
>> > Regards,
>> >
>> > Dano
>> >
>> >
>> >
>> >
>> >
>> > On Tue, Nov 20, 2012 at 9:21 AM, Brock Noland <brock@cloudera.com>
>> > wrote:
>> >>
>> >> Yeah I think that makes sense, I have created a JIRA for = this
>> >>
>> >> https://issues.apache.org/jira/browse/FLUME-1733<= br> >> >>
>> >> Brock
>> >>
>> >> On Tue, Nov 20, 2012 at 9:02 AM, Dan Young <danoyoung@gmail.com> wrote:
>> >> > Hey Brock,
>> >> >
>> >> > I can do some more testing on my side with smaller f= iles as well as
>> >> > doing a
>> >> > mv vs a cp . I do believe that a slight delay would = be helpful since
>> >> > people
>> >> > will be moving/copying large files around.
>> >> >
>> >> > Regards ,
>> >> >
>> >> > Dano
>> >> >
>> >> > On Nov 20, 2012 5:26 AM, "Brock Noland" &l= t;brock@cloudera.com> wrote: >> >> >>
>> >> >> Thinking about this more, I think it's proba= bly going to be quite
>> >> >> common for people to cp large files into the spo= oling directory.
>> >> >> Patrick, what do you think about waiting until t= he mtime is say 1
>> >> >> second old?
>> >> >>
>> >> >> Brock
>> >> >>
>> >> >> On Mon, Nov 19, 2012 at 5:29 PM, Brock Noland &l= t;brock@cloudera.com>
>> >> >> wrote:
>> >> >> > My guess is that the file does not have the= correct permissions
>> >> >> > while
>> >> >> > being copied.
>> >> >> >
>> >> >> > [noland@localhost cp-test]$ cp -p test-0 te= st-1 & sleep 0.1; ls
>> >> >> > -al
>> >> >> > test*
>> >> >> > [1] 18780
>> >> >> > -rw-rw-r-- 1 noland noland 1048576000 Nov 1= 9 17:25 test-0
>> >> >> > -rw------- 1 noland noland =A0 52334592 Nov= 19 17:27 test-1
>> >> >> >
>> >> >> >
>> >> >> > For large files, it probably makes sense to= copy the file in as
>> >> >> > .file
>> >> >> > and then rename it to file.
>> >> >> >
>> >> >> > Brock
>> >> >> >
>> >> >> > On Mon, Nov 19, 2012 at 5:04 PM, Patrick We= ndell
>> >> >> > <p= wendell@gmail.com>
>> >> >> > wrote:
>> >> >> >> The spooling source gets a directory li= sting, then reads each
>> >> >> >> file,
>> >> >> >> then
>> >> >> >> renames it to X.COMPLETED. Is it possib= le some other process
>> >> >> >> deleted
>> >> >> >> that
>> >> >> >> file between when Flume listed the dire= ctory and when it tried to
>> >> >> >> open
>> >> >> >> the
>> >> >> >> file? Otherwise, I'm confused why t= he file would not be present
>> >> >> >> in
>> >> >> >> the
>> >> >> >> listing you give here.
>> >> >> >>
>> >> >> >>
>> >> >> >> On Mon, Nov 19, 2012 at 6:03 PM, Patric= k Wendell
>> >> >> >> <pwendell@gmail.com>
>> >> >> >> wrote:
>> >> >> >>>
>> >> >> >>> Hey Dan,
>> >> >> >>>
>> >> >> >>> You say that it seems like Flume ha= s already processed the
>> >> >> >>> log...
>> >> >> >>> why
>> >> >> >>> do
>> >> >> >>> you think that?
>> >> >> >>>
>> >> >> >>> When you listed the directory conte= nts I don't see the original
>> >> >> >>> or
>> >> >> >>> the
>> >> >> >>> COMPLETED version of the file that = Flume is complaining about:
>> >> >> >>>
>> >> >> >>> /clickstream.log-2012-11-17-1353163= 623
>> >> >> >>>
>> >> >> >>> doesn't appear in the
>> >> >> >>>
>> >> >> >>> /mnt/flume/clickstream/
>> >> >> >>>
>> >> >> >>> directory listing anywhere.
>> >> >> >>>
>> >> >> >>>
>> >> >> >>> On Mon, Nov 19, 2012 at 2:33 PM, Da= n Young <danoyoung@gmail.com&= gt;
>> >> >> >>> wrote:
>> >> >> >>>>
>> >> >> >>>> Hello Brock,
>> >> >> >>>>
>> >> >> >>>> It seems like we get this messa= ge each time that logrotate runs
>> >> >> >>>> and
>> >> >> >>>> is in
>> >> >> >>>> the process of copying the file= to the SpoolingDirectory. It
>> >> >> >>>> seems
>> >> >> >>>> that
>> >> >> >>>> Flume starts reading the file a= s soon as it shows up in the
>> >> >> >>>> SpoolingDirectory..... =A0Maybe= it's trying to read the file
>> >> >> >>>> while
>> >> >> >>>> it's
>> >> >> >>>> still
>> >> >> >>>> being written to????
>> >> >> >>>>
>> >> >> >>>> 2012-11-19 19:27:27,924 (pool-1= 2-thread-1) [WARN -
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>> org.apache.flume.client.avro.Sp= oolingFileLineReader.getNextFile(SpoolingFileLineReader.java:328)]
>> >> >> >>>> Could not find file:
>> >> >> >>>> /mnt/flume/clickstream2/clickst= ream2.log-2012-11-19-1353353239
>> >> >> >>>> java.io.FileNotFoundException:<= br> >> >> >> >>>> /mnt/flume/clickstream2/clickst= ream2.log-2012-11-19-1353353239
>> >> >> >>>> (Permission
>> >> >> >>>> denied)
>> >> >> >>>> at java.io.FileInputStream.open= (Native Method)
>> >> >> >>>> at java.io.FileInputStream.<= init>(FileInputStream.java:138)
>> >> >> >>>> at java.io.FileReader.<init&= gt;(FileReader.java:72)
>> >> >> >>>> at
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>> org.apache.flume.client.avro.Sp= oolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322)
>> >> >> >>>> at
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>> org.apache.flume.client.avro.Sp= oolingFileLineReader.readLines(SpoolingFileLineReader.java:172)
>> >> >> >>>> at
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>> org.apache.flume.source.SpoolDi= rectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135)
>> >> >> >>>> at
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>> java.util.concurrent.Executors$= RunnableAdapter.call(Executors.java:471)
>> >> >> >>>> at
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>> java.util.concurrent.FutureTask= $Sync.innerRunAndReset(FutureTask.java:351)
>> >> >> >>>> at
>> >> >> >>>>
>> >> >> >>>> java.util.concurrent.FutureTask= .runAndReset(FutureTask.java:178)
>> >> >> >>>> at
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>> java.util.concurrent.ScheduledT= hreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecuto= r.java:178)
>> >> >> >>>> at
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>> java.util.concurrent.ScheduledT= hreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:= 293)
>> >> >> >>>> at
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>> java.util.concurrent.ThreadPool= Executor.runWorker(ThreadPoolExecutor.java:1110)
>> >> >> >>>> at
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>> java.util.concurrent.ThreadPool= Executor$Worker.run(ThreadPoolExecutor.java:603)
>> >> >> >>>> at java.lang.Thread.run(Thread.= java:722)
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>> On Sat, Nov 17, 2012 at 9:15 AM= , Brock Noland
>> >> >> >>>> <brock@cloudera.com>
>> >> >> >>>> wrote:
>> >> >> >>>>>
>> >> >> >>>>> Ok, do you mind sharing you= r log rotate config to see if we
>> >> >> >>>>> can
>> >> >> >>>>> reproduce?
>> >> >> >>>>>
>> >> >> >>>>> --
>> >> >> >>>>> Brock Noland
>> >> >> >>>>> Sent with Sparrow
>> >> >> >>>>>
>> >> >> >>>>> On Saturday, November 17, 2= 012 at 10:01 AM, Dan Young wrote:
>> >> >> >>>>>
>> >> >> >>>>> Hey Brock,
>> >> >> >>>>>
>> >> >> >>>>> No I have not modified the = conf while the agent was running.
>> >> >> >>>>>
>> >> >> >>>>> /mnt/flume is local. Note t= hat this is running on an ec2
>> >> >> >>>>> instance
>> >> >> >>>>> and
>> >> >> >>>>> the disk is the ephemeral d= rive, not EBS.
>> >> >> >>>>>
>> >> >> >>>>> Regards ,
>> >> >> >>>>>
>> >> >> >>>>> Dano
>> >> >> >>>>>
>> >> >> >>>>> On Nov 17, 2012 8:58 AM, &q= uot;Brock Noland" <brock@clou= dera.com>
>> >> >> >>>>> wrote:
>> >> >> >>>>>
>> >> >> >>>>> Hi,
>> >> >> >>>>>
>> >> >> >>>>> I highly doubt it's rel= ated to
>> >> >> >>>>> (https://issues.apache.= org/jira/browse/FLUME-1721) but have
>> >> >> >>>>> you
>> >> >> >>>>> modified the configuration = file since starting the agent? =A0If
>> >> >> >>>>> so,
>> >> >> >>>>> can
>> >> >> >>>>> you restart the agent and s= ee if the error continues?
>> >> >> >>>>>
>> >> >> >>>>> Also, is /mnt/flume local d= isk or NAS?
>> >> >> >>>>>
>> >> >> >>>>> Brock
>> >> >> >>>>>
>> >> >> >>>>> On Sat, Nov 17, 2012 at 9:0= 2 AM, Dan Young
>> >> >> >>>>> <danoyoung@gmail.com>
>> >> >> >>>>> wrote:
>> >> >> >>>>> > First a bit of context= , I'm using logrotate to monitor and
>> >> >> >>>>> > copy
>> >> >> >>>>> > (cp
>> >> >> >>>>> > -p) log
>> >> >> >>>>> > files to a flume spool= ing directory source. =A0So every hour,
>> >> >> >>>>> > logrotate
>> >> >> >>>>> > checks
>> >> >> >>>>> > for and copies a file = from the source to the flume
>> >> >> >>>>> > destination.
>> >> >> >>>>> > I
>> >> >> >>>>> > see
>> >> >> >>>>> > the
>> >> >> >>>>> > following warning mess= age in the flume logs.
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > 17 Nov 2012 14:47:07,6= 82 WARN =A0[pool-10-thread-1]
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > (org.apache.flume.clie= nt.avro.SpoolingFileLineReader.getNextFile:328)
>> >> >> >>>>> > -
>> >> >> >>>>> > Could not find file: >> >> >> >>>>> > /mnt/flume/clickstream= /clickstream.log-2012-11-17-1353163623
>> >> >> >>>>> > java.io.FileNotFoundEx= ception:
>> >> >> >>>>> > /mnt/flume/clickstream= /clickstream.log-2012-11-17-1353163623
>> >> >> >>>>> > (Permission
>> >> >> >>>>> > denied)
>> >> >> >>>>> > at java.io.FileInputSt= ream.open(Native Method)
>> >> >> >>>>> > at java.io.FileInputSt= ream.<init>(FileInputStream.java:138)
>> >> >> >>>>> > at java.io.FileReader.= <init>(FileReader.java:72)
>> >> >> >>>>> > at
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > org.apache.flume.clien= t.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322)<= br> >> >> >> >>>>> > at
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > org.apache.flume.clien= t.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) >> >> >> >>>>> > at
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > org.apache.flume.sourc= e.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java= :135)
>> >> >> >>>>> > at
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > java.util.concurrent.E= xecutors$RunnableAdapter.call(Executors.java:471)
>> >> >> >>>>> > at
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > java.util.concurrent.F= utureTask$Sync.innerRunAndReset(FutureTask.java:351)
>> >> >> >>>>> > at
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > java.util.concurrent.F= utureTask.runAndReset(FutureTask.java:178)
>> >> >> >>>>> > at
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > java.util.concurrent.S= cheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPo= olExecutor.java:178)
>> >> >> >>>>> > at
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > java.util.concurrent.S= cheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecu= tor.java:293)
>> >> >> >>>>> > at
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > java.util.concurrent.T= hreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>> >> >> >>>>> > at
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > java.util.concurrent.T= hreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>> >> >> >>>>> > at java.lang.Thread.ru= n(Thread.java:722)
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > Although it appears th= at Flume processes the log, I'm
>> >> >> >>>>> > curious
>> >> >> >>>>> > why
>> >> >> >>>>> > I''m
>> >> >> >>>>> > seeing this and if I h= ave anything with permissions
>> >> >> >>>>> > incorrect?
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > Here's the permiss= ions:
>> >> >> >>>>> >
>> >> >> >>>>> > source log directory u= nder /var/log:
>> >> >> >>>>> > drwxrwxr-x 2 ubuntu = =A0 =A0ubuntu =A0 4096 Nov 17 14:47
>> >> >> >>>>> > clickstream
>> >> >> >>>>> >
>> >> >> >>>>> > source files:
>> >> >> >>>>> > -rw-rw-r-- 1 ubuntu ub= untu =A0 9055750 Nov 17 13:29
>> >> >> >>>>> > clickstream.log-2012-1= 1-17-1353158953.gz
>> >> >> >>>>> > -rw-rw-r-- 1 ubuntu ub= untu =A013583565 Nov 17 14:17
>> >> >> >>>>> > clickstream.log-2012-1= 1-17-1353161821.gz
>> >> >> >>>>> > -rw-rw-r-- 1 ubuntu ub= untu 131296672 Nov 17 14:47
>> >> >> >>>>> > clickstream.log-2012-1= 1-17-1353163623
>> >> >> >>>>> > -rw-rw-r-- 1 ubuntu ub= untu =A065648336 Nov 17 14:52
>> >> >> >>>>> > clickstream.log
>> >> >> >>>>> >
>> >> >> >>>>> > flume source directory= under /mnt/flume:
>> >> >> >>>>> > drwxrwxr-x 2 ubuntu ub= untu 4096 Nov 17 14:48 clickstream
>> >> >> >>>>> >
>> >> >> >>>>> > flume source files: >> >> >> >>>>> > -rw-rw-r-- 1 ubuntu ub= untu 131296672 Nov 17 13:29
>> >> >> >>>>> > clickstream.log-2012-1= 1-17-1353158953.COMPLETED
>> >> >> >>>>> > -rw-rw-r-- 1 ubuntu ub= untu 196945008 Nov 17 14:17
>> >> >> >>>>> > clickstream.log-2012-1= 1-17-1353161821.COMPLETED
>> >> >> >>>>> > -rw-rw-r-- 1 ubuntu ub= untu 131296672 Nov 17 14:47
>> >> >> >>>>> > clickstream.log-2012-1= 1-17-1353163623.COMPLETED
>> >> >> >>>>> >
>> >> >> >>>>> > Any insight would be a= ppreciated.
>> >> >> >>>>> >
>> >> >> >>>>> > Regards,
>> >> >> >>>>> >
>> >> >> >>>>> > Dan
>> >> >> >>>>>
>> >> >> >>>>>
>> >> >> >>>>>
>> >> >> >>>>> --
>> >> >> >>>>> Apache MRUnit - Unit testin= g MapReduce -
>> >> >> >>>>> http://incubator.apache.org/mrunit/<= /a>
>> >> >> >>>>>
>> >> >> >>>>>
>> >> >> >>>>
>> >> >> >>>
>> >> >> >>
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> > Apache MRUnit - Unit testing MapReduce - >> >> >> >
http://incubator.apache.org/mrunit/
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Apache MRUnit - Unit testing MapReduce -
>> >> >> http://incubator.apache.org/mrunit/
>> >>
>> >>
>> >>
>> >> --
>> >> Apache MRUnit - Unit testing MapReduce -
>> >> http://incubator.apache.org/mrunit/
>> >
>> >
>>
>>
>>
>> --
>> Apache MRUnit - Unit testing MapReduce -
>> = http://incubator.apache.org/mrunit/
>
>



--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/

--f46d04462dd4fd78fe04cef2b935--