flume-user mailing list archives

From SaravanaKumar TR <saran0081...@gmail.com>
Subject Re: Flume stops processing event after a while
Date Wed, 16 Jul 2014 08:46:59 GMT
Hi Natty,

While looking further, I could see that the memory channel stops if a line
larger than 2 MB comes in. Let me know which parameter lets us define a
maximum event size of about 3 MB.
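
To check how large the lines in a logfile actually get, one option is to scan the file and record the longest line in bytes. The following is a minimal Python sketch and not part of Flume; the function name `max_line_bytes` is hypothetical, and it assumes the file is small enough to scan in one pass and that lines end in a newline.

```python
def max_line_bytes(path):
    """Return the size in bytes of the longest line in the file.

    The trailing line terminator (\\n or \\r\\n) is not counted.
    """
    longest = 0
    # Read in binary mode so the result is a byte count, not a character count.
    with open(path, "rb") as f:
        for line in f:
            # Strip the trailing line terminator before measuring.
            size = len(line.rstrip(b"\r\n"))
            if size > longest:
                longest = size
    return longest
```

Calling `max_line_bytes("/data/logs/test_log")` on the logfile from the config quoted below would report its longest line, which can then be compared against the 2 MB figure.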


On 16 July 2014 12:46, SaravanaKumar TR <saran0081986@gmail.com> wrote:

> I am asking point 1 because in some cases I could see a line in the logfile
> of around 2 MB. So I need to know the maximum event size. How do I measure it?
>
>
>
>
> On 16 July 2014 10:18, SaravanaKumar TR <saran0081986@gmail.com> wrote:
>
>> Hi Natty,
>>
>> Please help me to get the answers for the below queries.
>>
>> 1. In the case of the exec source (tail -F <logfile>), is each line in the
>> file considered to be a single event?
>> If a line is considered to be an event, what is the maximum event size
>> supported by Flume? I mean, the maximum number of characters in a line?
>> 2. When events stop being processed, I do not see the "tail -F" command
>> running in the background.
>> I have used options like "a1.sources.r1.restart = true
>> a1.sources.r1.logStdErr = true".
>> Will this config not send any errors to flume.log if there are issues with
>> tail?
>> Will this config not try to restart "tail -F" if it is not running
>> in the background?
>>
>> 3. Does Flume support all data formats in the logfile, or does it have any
>> predefined data formats?
>>
>> Please help me with these so I can understand better.
>>
>>
>>
>> On 16 July 2014 00:56, Jonathan Natkins <natty@streamsets.com> wrote:
>>
>>> Saravana,
>>>
>>> Everything here looks pretty sane. Do you have a record of the events
>>> that came in leading up to the agent stopping collection? If you can
>>> provide the last file created by the agent, and ideally whatever events had
>>> come in, but not been written out to your HDFS sink, it might be possible
>>> for me to reproduce this issue. Would it be possible to get some sample
>>> data from you?
>>>
>>> Thanks,
>>> Natty
>>>
>>>
>>> On Tue, Jul 15, 2014 at 10:26 AM, SaravanaKumar TR <
>>> saran0081986@gmail.com> wrote:
>>>
>>>> Hi Natty ,
>>>>
>>>> Just to understand: at present my setting is
>>>> "flume.root.logger=INFO,LOGFILE"
>>>> in log4j.properties. Do you want me to change it to
>>>> "flume.root.logger=DEBUG,LOGFILE" and restart the agent?
>>>>
>>>> But when I start the agent, I already start it with the command below. I
>>>> guess I am already using DEBUG, not in the config file but on the command
>>>> line when starting the agent.
>>>>
>>>> ../bin/flume-ng agent -c /d0/flume/conf -f
>>>> /d0/flume/conf/flume-conf.properties -n a1 -Dflume.root.logger=DEBUG,LOGFILE
>>>>
>>>> If I make some changes to the config "flume-conf.properties" or restart the
>>>> agent, it works again and starts collecting the data.
>>>>
>>>> Currently all my logs go to flume.log, and I don't see any exceptions.
>>>>
>>>> cat flume.log | grep "Exception" doesn't show any.
>>>>
>>>>
>>>> On 15 July 2014 22:24, Jonathan Natkins <natty@streamsets.com> wrote:
>>>>
>>>>> Hi Saravana,
>>>>>
>>>>> Our best bet on figuring out what's going on here may be to turn on
>>>>> the debug logging. What I would recommend is stopping your agents, and
>>>>> modifying the log4j properties to turn on DEBUG logging for the root
>>>>> logger, and then restart the agents. Once the agent stops producing new
>>>>> events, send out the logs and I'll be happy to take a look over them.
>>>>>
>>>>> Does the system begin working again if you restart the agents? Have
>>>>> you noticed any other events correlated with the agent stopping collecting
>>>>> events? Maybe a spike in events or something like that? And for my own
>>>>> peace of mind, if you run `cat /var/log/flume-ng/* | grep "Exception"`,
>>>>> does it bring anything back?
>>>>>
>>>>> Thanks!
>>>>> Natty
>>>>>
>>>>>
>>>>> On Tue, Jul 15, 2014 at 2:55 AM, SaravanaKumar TR <
>>>>> saran0081986@gmail.com> wrote:
>>>>>
>>>>>> Hi Natty,
>>>>>>
>>>>>> This is my entire config file.
>>>>>>
>>>>>> # Name the components on this agent
>>>>>> a1.sources = r1
>>>>>> a1.sinks = k1
>>>>>> a1.channels = c1
>>>>>>
>>>>>> # Describe/configure the source
>>>>>> a1.sources.r1.type = exec
>>>>>> a1.sources.r1.command = tail -F /data/logs/test_log
>>>>>> a1.sources.r1.restart = true
>>>>>> a1.sources.r1.logStdErr = true
>>>>>>
>>>>>> #a1.sources.r1.batchSize = 2
>>>>>>
>>>>>> a1.sources.r1.interceptors = i1
>>>>>> a1.sources.r1.interceptors.i1.type = regex_filter
>>>>>> a1.sources.r1.interceptors.i1.regex = resuming normal operations|Received|Response
>>>>>>
>>>>>> #a1.sources.r1.interceptors = i2
>>>>>> #a1.sources.r1.interceptors.i2.type = timestamp
>>>>>> #a1.sources.r1.interceptors.i2.preserveExisting = true
>>>>>>
>>>>>> # Describe the sink
>>>>>> a1.sinks.k1.type = hdfs
>>>>>> a1.sinks.k1.hdfs.path = hdfs://testing.sck.com:9000/running/test.sck/date=%Y-%m-%d
>>>>>> a1.sinks.k1.hdfs.writeFormat = Text
>>>>>> a1.sinks.k1.hdfs.fileType = DataStream
>>>>>> a1.sinks.k1.hdfs.filePrefix = events-
>>>>>> a1.sinks.k1.hdfs.rollInterval = 600
>>>>>> ## need to run hive query randomly to check the long running process,
>>>>>> ## so we need to commit events in hdfs files regularly
>>>>>> a1.sinks.k1.hdfs.rollCount = 0
>>>>>> a1.sinks.k1.hdfs.batchSize = 10
>>>>>> a1.sinks.k1.hdfs.rollSize = 0
>>>>>> a1.sinks.k1.hdfs.useLocalTimeStamp = true
>>>>>>
>>>>>> # Use a channel which buffers events in memory
>>>>>> a1.channels.c1.type = memory
>>>>>> a1.channels.c1.capacity = 10000
>>>>>> a1.channels.c1.transactionCapacity = 10000
>>>>>>
>>>>>> # Bind the source and sink to the channel
>>>>>> a1.sources.r1.channels = c1
>>>>>> a1.sinks.k1.channel = c1
>>>>>>
>>>>>>
>>>>>> On 14 July 2014 22:54, Jonathan Natkins <natty@streamsets.com> wrote:
>>>>>>
>>>>>>> Hi Saravana,
>>>>>>>
>>>>>>> What does your sink configuration look like?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Natty
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jul 11, 2014 at 11:05 PM, SaravanaKumar TR <
>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>
>>>>>>>> Assuming each line in the logfile is considered an event for
>>>>>>>> Flume:
>>>>>>>>
>>>>>>>> 1. Is there a maximum event size defined for the memory/file
>>>>>>>> channel, such as a maximum number of characters in a line?
>>>>>>>> 2. Does Flume support all data formats to be processed as events,
>>>>>>>> or are there any limitations?
>>>>>>>>
>>>>>>>> I am still trying to understand why Flume stops
>>>>>>>> processing events after some time.
>>>>>>>>
>>>>>>>> Can someone please help me out here.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> saravana
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11 July 2014 17:49, SaravanaKumar TR <saran0081986@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi ,
>>>>>>>>>
>>>>>>>>> I am new to Flume and am using Apache Flume 1.5.0. A quick
>>>>>>>>> explanation of my setup follows.
>>>>>>>>>
>>>>>>>>> Source: exec, running a tail -F command on a logfile.
>>>>>>>>>
>>>>>>>>> Channel: tried with both memory and file channels
>>>>>>>>>
>>>>>>>>> Sink: HDFS
>>>>>>>>>
>>>>>>>>> When Flume starts, events are processed properly and
>>>>>>>>> moved to HDFS without any issues.
>>>>>>>>>
>>>>>>>>> But after some time Flume suddenly stops sending events to HDFS.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I am not seeing any errors in the logfile flume.log either. Please let
>>>>>>>>> me know if I am missing any configuration here.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Below is the channel configuration I defined; I left the
>>>>>>>>> remaining settings at their default values.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> a1.channels.c1.type = FILE
>>>>>>>>>
>>>>>>>>> a1.channels.c1.transactionCapacity = 100000
>>>>>>>>>
>>>>>>>>> a1.channels.c1.capacity = 10000000
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Saravana
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
