flume-user mailing list archives

From Bean Edwards <edwardsb...@gmail.com>
Subject Re: HDFS IO Error
Date Wed, 26 Mar 2014 03:20:10 GMT
We are seeing the same problem.


On Wed, Mar 26, 2014 at 2:43 AM, Abraham Fine <abe@brightroll.com> wrote:

> Hello-
>
> We have Flume agents running 1.4.0 that sink to HDFS (version
> 2.0.0-cdh4.2.1).
>
> Exceptions start occurring at the same time across our Flume agents when a
> datanode in HDFS goes down. We did not have this issue while running Flume
> 1.3.
>
> We noticed a similar issue posted on the mailing list here:
> http://mail-archives.apache.org/mod_mbox/flume-user/201307.mbox/%3CCAPZq-vkmDGptbOWEAF+rE-1neibUtQ36+EHqukn5B7FUM4QAyA@mail.gmail.com%3E
> and on JIRA: https://issues.apache.org/jira/browse/FLUME-2261
> but could not find a solution.
>
> We have noticed the following in the Flume logs:
>
> WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:418)  - HDFS IO error
> java.io.IOException: Callable timed out after 20000 ms on file <FILEPATH> :
>         at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:550)
>         at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:353)
>         at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:319)
>         at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:405)
>         at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.util.concurrent.TimeoutException
>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:91)
>         at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:543)
>         ... 6 more
>
> This is usually followed by:
>
> WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:418)  - HDFS IO error
> java.io.IOException: This bucket writer was closed due to idling and this handle is thus no longer valid
>         at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:380)
>         at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
>         at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:662)
>
> When these exceptions occur, the HDFS sink does not close files. We often
> end up with multi-gigabyte files in HDFS.
>
> Our sink configuration:
>
> agentX.sinks.hdfs-sinkX-1.channel = chX
> agentX.sinks.hdfs-sinkX-1.type = hdfs
> agentX.sinks.hdfs-sinkX-1.hdfs.path = <FILEPATH>
> agentX.sinks.hdfs-sinkX-1.hdfs.filePrefix = event
> agentX.sinks.hdfs-sinkX-1.hdfs.writeFormat = Text
> agentX.sinks.hdfs-sinkX-1.hdfs.rollInterval = 120
> agentX.sinks.hdfs-sinkX-1.hdfs.idleTimeout = 180
> agentX.sinks.hdfs-sinkX-1.hdfs.rollCount = 0
> agentX.sinks.hdfs-sinkX-1.hdfs.rollSize = 0
> agentX.sinks.hdfs-sinkX-1.hdfs.fileType = DataStream
> agentX.sinks.hdfs-sinkX-1.hdfs.batchSize = 24000
> agentX.sinks.hdfs-sinkX-1.hdfs.txnEventSize = 24000
> agentX.sinks.hdfs-sinkX-1.hdfs.callTimeout = 20000
> agentX.sinks.hdfs-sinkX-1.hdfs.threadsPoolSize = 1
>
>
> The file paths are unique to each sink.
>
> Thank you for your help.
>
> --
> Abraham Fine | Software Engineer
> BrightRoll, Inc. | Smart Video Advertising | www.brightroll.com
>
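
For readers following the first stack trace: the frames show FutureTask.get throwing a TimeoutException that is then wrapped in an IOException by BucketWriter.callWithTimeout. The sketch below is only an illustration of that general pattern in plain Java, not Flume's actual code. The class name TimeoutSketch, the helper signature, and the 50 ms / 500 ms values are all hypothetical and chosen just to make the timeout fire quickly.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutSketch {

    // Hypothetical helper (NOT Flume's implementation): run a task on a pool
    // and give up if it has not finished within timeoutMs milliseconds.
    static <T> T callWithTimeout(ExecutorService pool, Callable<T> task, long timeoutMs)
            throws IOException, InterruptedException {
        Future<T> future = pool.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The task may still be blocked (for example on a slow or dead
            // remote node), so request cancellation before reporting.
            future.cancel(true);
            throw new IOException("Callable timed out after " + timeoutMs + " ms", e);
        } catch (ExecutionException e) {
            // The task itself failed; surface its cause as an IOException.
            throw new IOException(e.getCause());
        }
    }

    public static void main(String[] args) throws Exception {
        // A single worker thread, mirroring threadsPoolSize = 1 in the quoted config.
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            // Simulated slow call: sleeps 500 ms against a 50 ms timeout.
            callWithTimeout(pool, () -> { Thread.sleep(500); return null; }, 50);
        } catch (IOException e) {
            System.out.println(e.getMessage()
                    + " (cause: " + e.getCause().getClass().getName() + ")");
        } finally {
            pool.shutdownNow();
        }
    }
}
```

In this sketch the caller only learns that the wait expired; whether the underlying operation eventually completes is a separate question, which may be relevant when a datanode is unresponsive for longer than hdfs.callTimeout.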
