flume-user mailing list archives

From Dan Young <danoyo...@gmail.com>
Subject Re: .SpoolingFileLineReader warning....
Date Sat, 17 Nov 2012 16:33:02 GMT
logrotate config, in /etc/logrotate.d, run from cron.hourly:

/var/log/clickstream/clickstream.log
{
  missingok
  rotate 3
  compress
  delaycompress
  copytruncate
  notifempty
  size 50M
  dateext
  dateformat -%Y-%m-%d-%s
  create 666 ubuntu ubuntu
  postrotate
  cp -p $1 /mnt/flume/clickstream/ 2>&1
  endscript
}
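
The `cp -p` in `postrotate` writes straight into the spooling directory, so Flume may list the file while the copy is still in progress. Whether that is what triggers the warning quoted further down is not established in this thread. The Flume user guide does, however, require files to be complete and unchanged once they appear in the spool directory. A minimal Python sketch of the usual copy-then-rename approach follows. The function name `stage_into_spool` and the staging directory are hypothetical, and the staging directory is assumed to be on the same filesystem as the spool directory so that the rename is atomic.

```python
import os
import shutil


def stage_into_spool(src_path, spool_dir, staging_dir):
    """Copy src_path into staging_dir, then rename it into spool_dir.

    Assumption: staging_dir and spool_dir are on the same filesystem,
    so os.rename is atomic and the file only becomes visible in
    spool_dir once it is fully written.
    """
    name = os.path.basename(src_path)
    staged = os.path.join(staging_dir, name)
    final = os.path.join(spool_dir, name)

    # Like `cp -p`: copies the data plus mode bits and timestamps.
    shutil.copy2(src_path, staged)

    # Publish the finished copy in a single step.
    os.rename(staged, final)
    return final
```

A script built around this could be called from `postrotate` in place of the direct `cp -p`, for example with a staging directory such as `/mnt/flume/.staging` (a hypothetical path outside the spool directory).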



flume config:

# Name the components on this agent
agent1.sources = c1
agent1.sinks = c1s3
agent1.channels = ch1

# Describe/configure
agent1.sources.c1.type = org.apache.flume.source.SpoolDirectorySource
agent1.sources.c1.spoolDir = /mnt/flume/clickstream
agent1.sources.c1.fileHeader = false
agent1.sources.c1.interceptors = a b
agent1.sources.c1.interceptors.a.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
agent1.sources.c1.interceptors.b.type = org.apache.flume.interceptor.HostInterceptor$Builder
agent1.sources.c1.interceptors.b.preserveExisting = false
agent1.sources.c1.interceptors.b.hostHeader = host


# Describe s3
agent1.sinks.c1s3.type = hdfs
agent1.sinks.c1s3.hdfs.path = s3n://<super_secret_stuff_here>@<my_bucket>/clicks/%Y/%m
agent1.sinks.c1s3.hdfs.rollInterval = 300
agent1.sinks.c1s3.hdfs.rollSize = 0
agent1.sinks.c1s3.hdfs.rollCount = 0
agent1.sinks.c1s3.hdfs.batchSize = 400000
agent1.sinks.c1s3.hdfs.codeC = gzip
agent1.sinks.c1s3.hdfs.fileType = CompressedStream
agent1.sinks.c1s3.hdfs.writeFormat = Text
agent1.sinks.c1s3.hdfs.filePrefix = clicks-%Y-%m-%d-%H-%M-%{host}-
agent1.sinks.c1s3.hdfs.round = true
agent1.sinks.c1s3.hdfs.roundValue = 10
agent1.sinks.c1s3.hdfs.roundUnit = minute

# Use a file channel, which buffers events on disk
agent1.channels.ch1.type = file
agent1.channels.ch1.transactionCapacity = 400000
agent1.channels.ch1.capacity = 2000000
agent1.channels.ch1.checkpointDir = /mnt/flume/.flume/file-ch1/checkpoint
agent1.channels.ch1.dataDirs = /mnt/flume/.flume/file-ch1/data
agent1.channels.ch1.checkpointInterval = 30000


# Bind the source and sink to the channel
agent1.sources.c1.channels = ch1
agent1.sinks.c1s3.channel = ch1
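
Mail clients tend to wrap long property lines, which can leave a key such as `agent1.sources.c1.interceptors.a.type =` with an empty value if the text is pasted back into a file. The sketch below is one way to catch that before starting the agent. The helper name `find_empty_values` is hypothetical, and the parsing is deliberately simplified: it only understands `key = value` lines and `#` comments, not the full Java properties syntax.

```python
def find_empty_values(properties_text):
    """Return the keys that have nothing after the '=' sign.

    Simplified on purpose: handles 'key = value' lines and '#'
    comments only, not every Java properties feature.
    """
    empty = []
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines with no '=' (for example
        # the orphaned second half of a wrapped value).
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if not value.strip():
            empty.append(key.strip())
    return empty
```

Any key it reports most likely needs to be rejoined with the line that follows it.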





On Sat, Nov 17, 2012 at 9:15 AM, Brock Noland <brock@cloudera.com> wrote:

>  Ok, do you mind sharing your log rotate config to see if we can
> reproduce?
>
> --
> Brock Noland
>
> On Saturday, November 17, 2012 at 10:01 AM, Dan Young wrote:
>
> Hey Brock,
>
> No I have not modified the conf while the agent was running.
>
> /mnt/flume is local. Note that this is running on an ec2 instance and the
> disk is the ephemeral drive, not EBS.
>
> Regards,
>
> Dano
> On Nov 17, 2012 8:58 AM, "Brock Noland" <brock@cloudera.com> wrote:
>
> Hi,
>
> I highly doubt it's related to
> (https://issues.apache.org/jira/browse/FLUME-1721) but have you
> modified the configuration file since starting the agent?  If so, can
> you restart the agent and see if the error continues?
>
> Also, is /mnt/flume local disk or NAS?
>
> Brock
>
> On Sat, Nov 17, 2012 at 9:02 AM, Dan Young <danoyoung@gmail.com> wrote:
> > First a bit of context, I'm using logrotate to monitor and copy (cp -p) log
> > files to a flume spooling directory source.  So every hour, logrotate checks
> > for and copies a file from the source to the flume destination. I see the
> > following warning message in the flume logs.
> >
> >
> > 17 Nov 2012 14:47:07,682 WARN  [pool-10-thread-1] (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328)  - Could not find file: /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623
> > java.io.FileNotFoundException: /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 (Permission denied)
> > at java.io.FileInputStream.open(Native Method)
> > at java.io.FileInputStream.<init>(FileInputStream.java:138)
> > at java.io.FileReader.<init>(FileReader.java:72)
> > at org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322)
> > at org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172)
> > at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135)
> > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> > at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
> > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> > at java.lang.Thread.run(Thread.java:722)
> >
> >
> > Although it appears that Flume processes the log, I'm curious why I'm
> > seeing this and whether I have any permissions set incorrectly.
> >
> >
> >
> > Here's the permissions:
> >
> > source log directory under /var/log:
> > drwxrwxr-x 2 ubuntu    ubuntu   4096 Nov 17 14:47 clickstream
> >
> > source files:
> > -rw-rw-r-- 1 ubuntu ubuntu   9055750 Nov 17 13:29 clickstream.log-2012-11-17-1353158953.gz
> > -rw-rw-r-- 1 ubuntu ubuntu  13583565 Nov 17 14:17 clickstream.log-2012-11-17-1353161821.gz
> > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47 clickstream.log-2012-11-17-1353163623
> > -rw-rw-r-- 1 ubuntu ubuntu  65648336 Nov 17 14:52 clickstream.log
> >
> > flume source directory under /mnt/flume:
> > drwxrwxr-x 2 ubuntu ubuntu 4096 Nov 17 14:48 clickstream
> >
> > flume source files:
> > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 13:29 clickstream.log-2012-11-17-1353158953.COMPLETED
> > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 17 14:17 clickstream.log-2012-11-17-1353161821.COMPLETED
> > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47 clickstream.log-2012-11-17-1353163623.COMPLETED
> >
> > Any insight would be appreciated.
> >
> > Regards,
> >
> > Dan
>
>
>
> --
> Apache MRUnit - Unit testing MapReduce -
> http://incubator.apache.org/mrunit/
>
>
>
