flume-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hari Shreedharan <hshreedha...@cloudera.com>
Subject Re: HDFS Sink keeps .tmp files and closes with exception
Date Thu, 18 Oct 2012 23:00:18 GMT
Nishant, 

CDH4+ Flume is built against Hadoop-2, and may not work correctly against Hadoop-1.x, since
Hadoop's interfaces changed in the mean time. You could also use Apache Flume-1.2.0 or the
upcoming Apache Flume-1.3.0 directly against Hadoop-1.x without issues, as they are built
against Hadoop-1.x. 


Thanks,
Hari

-- 
Hari Shreedharan


On Thursday, October 18, 2012 at 1:18 PM, Nishant Neeraj wrote:

> I am working on a POC using 
> > flume-ng version Flume 1.2.0-cdh4.1.1
> > Hadoop 1.0.4
> > 
> The config looks like this
> 
> #Flume agent configuration
> agent1.sources = avroSource1
> agent1.sinks = fileSink1
> agent1.channels = memChannel1
> 
> agent1.sources.avroSource1.type = avro 
> agent1.sources.avroSource1.channels = memChannel1
> agent1.sources.avroSource1.bind = 0.0.0.0
> agent1.sources.avroSource1.port = 4545
> 
> agent1.sources.avroSource1.interceptors = b 
> agent1.sources.avroSource1.interceptors.b.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
> 
> agent1.sinks.fileSink1.type = hdfs
> agent1.sinks.fileSink1.channel = memChannel1
> agent1.sinks.fileSink1.hdfs.path = /flume/agg1/%y-%m-%d
> agent1.sinks.fileSink1.hdfs.filePrefix = agg
> agent1.sinks.fileSink1.hdfs.rollInterval = 0
> agent1.sinks.fileSink1.hdfs.rollSize = 0
> agent1.sinks.fileSink1.hdfs.rollCount = 0
> agent1.sinks.fileSink1.hdfs.fileType = DataStream
> agent1.sinks.fileSink1.hdfs.writeFormat = Text
> 
> 
> agent1.channels.memChannel1.type = memory 
> agent1.channels.memChannel1.capacity = 1000
> agent1.channels.memChannel1.transactionCapacity = 1000
> 
> 
> 
> Basically, I do not want to roll the file at all. I am just wanting to tail and watch
the show from Hadoop UI. The problem is it does not work. The console keeps saying, 
> 
> agg.1350590350462.tmp 0 KB    2012-10-18 19:59
> 
> Flume console shows events getting pushes. When I stop the flume,  I see the file gets
populated, but the '.tmp' is still in the file name. And I see this exception on close. 
> 
> 2012-10-18 20:06:49,315 (hdfs-fileSink1-call-runner-8) [DEBUG - org.apache.flume.sink.hdfs.BucketWriter.doClose(BucketWriter.java:254)]
Closing /flume/agg1/12-10-18/agg.1350590350462.tmp
> 2012-10-18 20:06:49,316 (hdfs-fileSink1-call-runner-8) [WARN - org.apache.flume.sink.hdfs.BucketWriter.doClose(BucketWriter.java:260)]
failed to close() HDFSWriter for file (/flume/agg1/12-10-18/agg.1350590350462.tmp). Exception
follows.
> java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:264)
> at org.apache.hadoop.hdfs.DFSClient.access$1100(DFSClient.java:74)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3667)
> at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
> at org.apache.flume.sink.hdfs.HDFSDataStream.close(HDFSDataStream.java:103)
> at org.apache.flume.sink.hdfs.BucketWriter.doClose(BucketWriter.java:257)
> at org.apache.flume.sink.hdfs.BucketWriter.access$400(BucketWriter.java:50)
> at org.apache.flume.sink.hdfs.BucketWriter$3.run(BucketWriter.java:243)
> at org.apache.flume.sink.hdfs.BucketWriter$3.run(BucketWriter.java:240)
> at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:127)
> at org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:240)
> at org.apache.flume.sink.hdfs.HDFSEventSink$3.call(HDFSEventSink.java:748)
> at org.apache.flume.sink.hdfs.HDFSEventSink$3.call(HDFSEventSink.java:745)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:679)
> 
> 
> 
> Thanks
> Nishant
> 
> 



Mime
View raw message