flume-user mailing list archives

From SaravanaKumar TR <saran0081...@gmail.com>
Subject Re: Flume stops processing event after a while
Date Wed, 16 Jul 2014 10:03:34 GMT
I guess I am using the default values. From the running Flume process I can see
this line: "/cv/jvendor/bin/java -Xmx20m
-Dflume.root.logger=DEBUG,LOGFILE......"

So I guess the Flume agent is running with a 20 MB heap.
My machine has 128 GB of RAM, so please suggest how much I can assign as heap
memory and where to define it.
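
For reference, a minimal sketch of one way to raise the heap, assuming the agent is started with "-c /d0/flume/conf" as in the command quoted further down this thread. The bin/flume-ng launcher falls back to -Xmx20m when nothing else sets JAVA_OPTS, and it sources flume-env.sh from the conf directory if that file exists. The heap sizes below are illustrative assumptions, not a recommendation for this workload.

```shell
# /d0/flume/conf/flume-env.sh
# (path assumed from the -c option used in this thread)
# Illustrative heap sizes only: tune them to the largest expected event
# and to the channel capacity that is configured.
export JAVA_OPTS="-Xms512m -Xmx2g"
```

After the agent is restarted, the java command line of the running process should show the new -Xmx value in place of -Xmx20m.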


On 16 July 2014 15:05, Jonathan Natkins <natty@streamsets.com> wrote:

> Hey Saravana,
>
> I'm attempting to reproduce this, but do you happen to know what the Java
> heap size is for your Flume agent? This information leads me to believe
> that you don't have enough memory allocated to the agent, which you may
> need to do with the -Xmx parameter when you start up your agent. That
> aside, you can set the byteCapacity parameter on the memory channel to
> specify how much memory it is allowed to use. It should default to 80% of
> the Java heap size, but if your heap is too small, this might be a cause of
> errors.
>
> Does anything get written to the log when you try to pass in an event of
> this size?
>
> Thanks,
> Natty
>
>
> On Wed, Jul 16, 2014 at 1:46 AM, SaravanaKumar TR <saran0081986@gmail.com>
> wrote:
>
>> Hi Natty,
>>
>> While looking further, I could see that the memory channel stops if a line
>> larger than 2 MB comes in. Let me know which parameter lets us define a
>> maximum event size of about 3 MB.
>>
>>
>> On 16 July 2014 12:46, SaravanaKumar TR <saran0081986@gmail.com> wrote:
>>
>>> I am asking point 1 because in some cases I could see a line in the
>>> logfile of around 2 MB. So I need to know what the maximum event size is.
>>> How do I measure it?
>>>
>>>
>>>
>>>
>>> On 16 July 2014 10:18, SaravanaKumar TR <saran0081986@gmail.com> wrote:
>>>
>>>> Hi Natty,
>>>>
>>>> Please help me to get the answers for the below queries.
>>>>
>>>> 1. In the case of an exec source (tail -F <logfile>), is each line in the
>>>> file considered to be a single event?
>>>> If a line is considered to be an event, what is the maximum
>>>> size of an event supported by Flume? I mean, what is the maximum number of
>>>> characters in a line that is supported?
>>>> 2. When events stop being processed, I do not see the "tail -F" command
>>>> running in the background.
>>>> I have used options like "a1.sources.r1.restart = true
>>>> a1.sources.r1.logStdErr = true".
>>>> Won't this config send errors to flume.log if there are any issues
>>>> with tail?
>>>> Won't this config try to restart "tail -F" if it is not running
>>>> in the background?
>>>>
>>>> 3. Does Flume support all formats of data in a logfile, or does it have any
>>>> predefined data formats?
>>>>
>>>> Please help me with these so I can understand better.
>>>>
>>>>
>>>>
>>>> On 16 July 2014 00:56, Jonathan Natkins <natty@streamsets.com> wrote:
>>>>
>>>>> Saravana,
>>>>>
>>>>> Everything here looks pretty sane. Do you have a record of the events
>>>>> that came in leading up to the agent stopping collection? If you can
>>>>> provide the last file created by the agent, and ideally whatever events had
>>>>> come in, but not been written out to your HDFS sink, it might be possible
>>>>> for me to reproduce this issue. Would it be possible to get some sample
>>>>> data from you?
>>>>>
>>>>> Thanks,
>>>>> Natty
>>>>>
>>>>>
>>>>> On Tue, Jul 15, 2014 at 10:26 AM, SaravanaKumar TR <
>>>>> saran0081986@gmail.com> wrote:
>>>>>
>>>>>> Hi Natty ,
>>>>>>
>>>>>> Just to understand: at present my setting is
>>>>>> "flume.root.logger=INFO,LOGFILE"
>>>>>> in log4j.properties. Do you want me to change it to
>>>>>> "flume.root.logger=DEBUG,LOGFILE" and restart the agent?
>>>>>>
>>>>>> But when I start the agent, I am already starting it with the command
>>>>>> below. I guess I am already using DEBUG, set on the command line when
>>>>>> starting the agent rather than in the config file.
>>>>>>
>>>>>> ../bin/flume-ng agent -c /d0/flume/conf -f
>>>>>> /d0/flume/conf/flume-conf.properties -n a1 -Dflume.root.logger=DEBUG,LOGFILE
>>>>>>
>>>>>> If I make some change in the config "flume-conf.properties" or restart the
>>>>>> agent, it works again and starts collecting the data.
>>>>>>
>>>>>> Currently all my logs go to flume.log, and I don't see any exceptions.
>>>>>>
>>>>>> cat flume.log | grep "Exception"  doesn't show any.
>>>>>>
>>>>>>
>>>>>> On 15 July 2014 22:24, Jonathan Natkins <natty@streamsets.com> wrote:
>>>>>>
>>>>>>> Hi Saravana,
>>>>>>>
>>>>>>> Our best bet on figuring out what's going on here may be to turn on
>>>>>>> the debug logging. What I would recommend is stopping your agents, and
>>>>>>> modifying the log4j properties to turn on DEBUG logging for the root
>>>>>>> logger, and then restart the agents. Once the agent stops producing new
>>>>>>> events, send out the logs and I'll be happy to take a look over them.
>>>>>>>
>>>>>>> Does the system begin working again if you restart the agents? Have
>>>>>>> you noticed any other events correlated with the agent stopping collecting
>>>>>>> events? Maybe a spike in events or something like that? And for my own
>>>>>>> peace of mind, if you run `cat /var/log/flume-ng/* | grep "Exception"`,
>>>>>>> does it bring anything back?
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Natty
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 15, 2014 at 2:55 AM, SaravanaKumar TR <
>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Natty,
>>>>>>>>
>>>>>>>> This is my entire config file.
>>>>>>>>
>>>>>>>> # Name the components on this agent
>>>>>>>> a1.sources = r1
>>>>>>>> a1.sinks = k1
>>>>>>>> a1.channels = c1
>>>>>>>>
>>>>>>>> # Describe/configure the source
>>>>>>>> a1.sources.r1.type = exec
>>>>>>>> a1.sources.r1.command = tail -F /data/logs/test_log
>>>>>>>> a1.sources.r1.restart = true
>>>>>>>> a1.sources.r1.logStdErr = true
>>>>>>>>
>>>>>>>> #a1.sources.r1.batchSize = 2
>>>>>>>>
>>>>>>>> a1.sources.r1.interceptors = i1
>>>>>>>> a1.sources.r1.interceptors.i1.type = regex_filter
>>>>>>>> a1.sources.r1.interceptors.i1.regex = resuming normal operations|Received|Response
>>>>>>>>
>>>>>>>> #a1.sources.r1.interceptors = i2
>>>>>>>> #a1.sources.r1.interceptors.i2.type = timestamp
>>>>>>>> #a1.sources.r1.interceptors.i2.preserveExisting = true
>>>>>>>>
>>>>>>>> # Describe the sink
>>>>>>>> a1.sinks.k1.type = hdfs
>>>>>>>> a1.sinks.k1.hdfs.path = hdfs://testing.sck.com:9000/running/test.sck/date=%Y-%m-%d
>>>>>>>> a1.sinks.k1.hdfs.writeFormat = Text
>>>>>>>> a1.sinks.k1.hdfs.fileType = DataStream
>>>>>>>> a1.sinks.k1.hdfs.filePrefix = events-
>>>>>>>> a1.sinks.k1.hdfs.rollInterval = 600
>>>>>>>> ## need to run hive query randomly to check the long running process,
>>>>>>>> ## so we need to commit events in hdfs files regularly
>>>>>>>> a1.sinks.k1.hdfs.rollCount = 0
>>>>>>>> a1.sinks.k1.hdfs.batchSize = 10
>>>>>>>> a1.sinks.k1.hdfs.rollSize = 0
>>>>>>>> a1.sinks.k1.hdfs.useLocalTimeStamp = true
>>>>>>>>
>>>>>>>> # Use a channel which buffers events in memory
>>>>>>>> a1.channels.c1.type = memory
>>>>>>>> a1.channels.c1.capacity = 10000
>>>>>>>> a1.channels.c1.transactionCapacity = 10000
>>>>>>>>
>>>>>>>> # Bind the source and sink to the channel
>>>>>>>> a1.sources.r1.channels = c1
>>>>>>>> a1.sinks.k1.channel = c1
>>>>>>>>
>>>>>>>>
>>>>>>>> On 14 July 2014 22:54, Jonathan Natkins <natty@streamsets.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Saravana,
>>>>>>>>>
>>>>>>>>> What does your sink configuration look like?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Natty
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Jul 11, 2014 at 11:05 PM, SaravanaKumar TR <
>>>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Assuming each line in the logfile is considered an event for
>>>>>>>>>> Flume:
>>>>>>>>>>
>>>>>>>>>> 1. Do we have any maximum event size defined for the memory/file
>>>>>>>>>> channel, like a maximum number of characters in a line?
>>>>>>>>>> 2. Does Flume support all formats of data to be processed as
>>>>>>>>>> events, or do we have any limitations?
>>>>>>>>>>
>>>>>>>>>> I am still trying to understand why Flume stops
>>>>>>>>>> processing events after some time.
>>>>>>>>>>
>>>>>>>>>> Can someone please help me out here.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> saravana
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 11 July 2014 17:49, SaravanaKumar TR <saran0081986@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi ,
>>>>>>>>>>>
>>>>>>>>>>> I am new to Flume and am using Apache Flume 1.5.0. Quick setup
>>>>>>>>>>> explanation here.
>>>>>>>>>>>
>>>>>>>>>>> Source: exec, tail -F command for a logfile.
>>>>>>>>>>>
>>>>>>>>>>> Channel: tried with both memory and file channel
>>>>>>>>>>>
>>>>>>>>>>> Sink: HDFS
>>>>>>>>>>>
>>>>>>>>>>> When Flume starts, events are processed properly and
>>>>>>>>>>> moved to HDFS without any issues.
>>>>>>>>>>>
>>>>>>>>>>> But after some time Flume suddenly stops sending events to HDFS.
>>>>>>>>>>>
>>>>>>>>>>> I am not seeing any errors in the logfile flume.log either. Please
>>>>>>>>>>> let me know if I am missing any configuration here.
>>>>>>>>>>>
>>>>>>>>>>> Below is the channel configuration defined; I left the
>>>>>>>>>>> remaining settings at their default values.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> a1.channels.c1.type = FILE
>>>>>>>>>>>
>>>>>>>>>>> a1.channels.c1.transactionCapacity = 100000
>>>>>>>>>>>
>>>>>>>>>>> a1.channels.c1.capacity = 10000000
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Saravana
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
