flume-user mailing list archives

From mahendran m <mahendra...@hotmail.com>
Subject Spooling source
Date Wed, 11 Feb 2015 04:17:13 GMT
Hi,
I am moving logs from a local machine to an HDFS server using Flume with a spooling directory
source. Each log file contains lakhs (hundreds of thousands) of lines.
My use case is as follows.
Log file names have the form foldername-filename-timestamp.suffix; an example file name is
LogFiles-Log1-1463238298.log.
My configuration is below:
a1.sinks = k1
a1.channels = c1

#the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = F:\\SpoolingDirectory
a1.sources.r1.deletePolicy = immediate
a1.sources.r1.fileHeader = true
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = com.company.CustomInterceptor.CustomInterceptor$Builder

#the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.fileSuffix = .txt
a1.sinks.k1.hdfs.path = hdfs://localhost:9000/spoolingdirectory/{foldername}

#Channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000

#Flow
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

In the custom interceptor we process the file header and extract the folder name, then add
it as the {foldername} header, which is used in the HDFS path. The problem we are facing is
that, for a single file with lakhs of lines, the interceptor extracts the same folder name
lakhs of times, once per line, which leads to severe performance degradation.
Is there any way to handle this case without processing the same file header lakhs of times?
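
For reference, the work done for every event is roughly the following. This is only a
simplified sketch, not our actual interceptor: the class and method names are hypothetical,
the Flume Interceptor interface is left out so the snippet stands alone, and it assumes the
spooling directory source puts the file path in the default "file" header (fileHeader = true).

```java
import java.util.Map;

// Hypothetical, simplified stand-in for the per-event logic of the custom interceptor.
class FolderNameSketch {

    // Extracts "foldername" from a path whose file name looks like
    // foldername-filename-timestamp.suffix, e.g. LogFiles-Log1-1463238298.log.
    static String folderNameFromPath(String filePath) {
        // Strip any directory part; handle both Windows and Unix separators.
        int lastSeparator = Math.max(filePath.lastIndexOf('/'), filePath.lastIndexOf('\\'));
        String fileName = filePath.substring(lastSeparator + 1);

        // The folder name is everything before the first '-'.
        int firstDash = fileName.indexOf('-');
        return firstDash < 0 ? fileName : fileName.substring(0, firstDash);
    }

    // Mimics what happens to each event's headers: read the "file" header
    // and add a "foldername" header derived from it.
    static Map<String, String> withFolderHeader(Map<String, String> headers) {
        String filePath = headers.get("file");
        if (filePath != null) {
            headers.put("foldername", folderNameFromPath(filePath));
        }
        return headers;
    }
}
```

Since every line of a file becomes one event, withFolderHeader runs once per line and
recomputes the same value each time for the same file.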
