flume-user mailing list archives

From SaravanaKumar TR <saran0081...@gmail.com>
Subject Re: Flume stops processing event after a while
Date Thu, 17 Jul 2014 05:51:47 GMT
Okay, thanks. So with 128 GB of RAM, I will allocate 1 GB of heap memory to
the Flume agent.

But I am surprised that no error was logged for this memory issue in the
log file (flume.log).

Do I need to check any other logs?
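
A note on the log search used earlier in this thread: a JVM that runs out of heap raises java.lang.OutOfMemoryError, which is an Error rather than an Exception, so a grep for "Exception" alone may not match it. The sketch below shows the difference on a made-up two-line log; the sample lines are hypothetical and are not taken from the poster's flume.log. Whether the error reaches flume.log at all, or only the console output of the agent, is not established in this thread.

```shell
# Hypothetical sample log, for illustration only; real Flume output will differ.
log=$(mktemp)
printf '%s\n' \
  'INFO  sample line without problems' \
  'ERROR java.lang.OutOfMemoryError: Java heap space' > "$log"

# The narrow search from the thread; it counts no matching lines in this sample.
narrow=$(grep -c 'Exception' "$log" || true)

# A broader search that also catches JVM Errors such as OutOfMemoryError.
broad=$(grep -c -E 'Exception|Error' "$log" || true)

echo "narrow=$narrow broad=$broad"
rm -f "$log"
```

Against a real agent, the same broader pattern could be run on flume.log and on wherever the agent's stdout/stderr is captured.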


On 16 July 2014 21:55, Jonathan Natkins <natty@streamsets.com> wrote:

> That's definitely your problem. 20MB is way too low for this. Depending on
> the other processes you're running with your system, the amount of memory
> you'll need will vary, but I'd recommend at least 1GB. You should define it
> exactly where it's defined right now, so instead of the current command,
> you can run:
>
> "/cv/jvendor/bin/java -Xmx1g -Dflume.root.logger=DEBUG,LOGFILE......"
>
>
> On Wed, Jul 16, 2014 at 3:03 AM, SaravanaKumar TR <saran0081986@gmail.com>
> wrote:
>
>> I guess I am using default values. From the running Flume process I could
>> see this line: "/cv/jvendor/bin/java -Xmx20m
>> -Dflume.root.logger=DEBUG,LOGFILE......"
>>
>> So I guess it takes 20 MB as the Flume agent's memory.
>> My RAM is 128 GB. So please suggest how much I can assign as heap memory
>> and where to define it.
>>
>>
>> On 16 July 2014 15:05, Jonathan Natkins <natty@streamsets.com> wrote:
>>
>>> Hey Saravana,
>>>
>>> I'm attempting to reproduce this, but do you happen to know what the
>>> Java heap size is for your Flume agent? This information leads me to
>>> believe that you don't have enough memory allocated to the agent, which you
>>> may need to do with the -Xmx parameter when you start up your agent. That
>>> aside, you can set the byteCapacity parameter on the memory channel to
>>> specify how much memory it is allowed to use. It should default to 80% of
>>> the Java heap size, but if your heap is too small, this might be a cause of
>>> errors.
>>>
>>> Does anything get written to the log when you try to pass in an event of
>>> this size?
>>>
>>> Thanks,
>>> Natty
>>>
>>>
>>> On Wed, Jul 16, 2014 at 1:46 AM, SaravanaKumar TR <
>>> saran0081986@gmail.com> wrote:
>>>
>>>> Hi Natty,
>>>>
>>>> While looking further, I could see that the memory channel stops if a line
>>>> larger than 2 MB comes in. Let me know which parameter lets us define a
>>>> maximum event size of about 3 MB.
>>>>
>>>>
>>>> On 16 July 2014 12:46, SaravanaKumar TR <saran0081986@gmail.com> wrote:
>>>>
>>>>> I am asking point 1 because in some cases I could see a line of around
>>>>> 2 MB in the logfile. So I need to know the maximum event size. How do I
>>>>> measure it?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 16 July 2014 10:18, SaravanaKumar TR <saran0081986@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Natty,
>>>>>>
>>>>>> Please help me to get the answers for the below queries.
>>>>>>
>>>>>> 1. In the case of an exec source (tail -F <logfile>), is each line in
>>>>>> the file considered to be a single event?
>>>>>> If a line is considered to be an event, what is the maximum
>>>>>> size of an event supported by Flume? I mean, the maximum number of
>>>>>> characters in a line?
>>>>>> 2. When events stop processing, I am not seeing the "tail -F" command
>>>>>> running in the background.
>>>>>> I have used options like "a1.sources.r1.restart = true
>>>>>> a1.sources.r1.logStdErr = true".
>>>>>> Won't this config send errors to flume.log if there are any issues
>>>>>> with tail?
>>>>>> Won't this config try to restart "tail -F" if it is not
>>>>>> running in the background?
>>>>>>
>>>>>> 3. Does Flume support all data formats in the logfile, or does it have
>>>>>> predefined data formats?
>>>>>>
>>>>>> Please help me with these so I can understand better.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 16 July 2014 00:56, Jonathan Natkins <natty@streamsets.com> wrote:
>>>>>>
>>>>>>> Saravana,
>>>>>>>
>>>>>>> Everything here looks pretty sane. Do you have a record of the
>>>>>>> events that came in leading up to the agent stopping collection? If you can
>>>>>>> provide the last file created by the agent, and ideally whatever events had
>>>>>>> come in, but not been written out to your HDFS sink, it might be possible
>>>>>>> for me to reproduce this issue. Would it be possible to get some sample
>>>>>>> data from you?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Natty
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 15, 2014 at 10:26 AM, SaravanaKumar TR <
>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Natty ,
>>>>>>>>
>>>>>>>> Just to understand: at present my setting is
>>>>>>>> "flume.root.logger=INFO,LOGFILE"
>>>>>>>> in log4j.properties. Do you want me to change it to
>>>>>>>> "flume.root.logger=DEBUG,LOGFILE" and restart the agent?
>>>>>>>>
>>>>>>>> But when I start the agent, I already start it with the command below. I
>>>>>>>> guess I am using DEBUG already, not in the config file but when starting
>>>>>>>> the agent.
>>>>>>>>
>>>>>>>> ../bin/flume-ng agent -c /d0/flume/conf -f
>>>>>>>> /d0/flume/conf/flume-conf.properties -n a1 -Dflume.root.logger=DEBUG,LOGFILE
>>>>>>>>
>>>>>>>> If I make some change in the config "flume-conf.properties" or restart
>>>>>>>> the agent, it works again and starts collecting data.
>>>>>>>>
>>>>>>>> Currently all my logs go to flume.log, and I don't see any exception.
>>>>>>>>
>>>>>>>> cat flume.log | grep "Exception"  doesn't show any.
>>>>>>>>
>>>>>>>>
>>>>>>>> On 15 July 2014 22:24, Jonathan Natkins <natty@streamsets.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Saravana,
>>>>>>>>>
>>>>>>>>> Our best bet on figuring out what's going on here may be to turn
>>>>>>>>> on the debug logging. What I would recommend is stopping your agents, and
>>>>>>>>> modifying the log4j properties to turn on DEBUG logging for the root
>>>>>>>>> logger, and then restart the agents. Once the agent stops producing new
>>>>>>>>> events, send out the logs and I'll be happy to take a look over them.
>>>>>>>>>
>>>>>>>>> Does the system begin working again if you restart the agents?
>>>>>>>>> Have you noticed any other events correlated with the agent stopping
>>>>>>>>> collecting events? Maybe a spike in events or something like that? And for
>>>>>>>>> my own peace of mind, if you run `cat /var/log/flume-ng/* | grep
>>>>>>>>> "Exception"`, does it bring anything back?
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>> Natty
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jul 15, 2014 at 2:55 AM, SaravanaKumar TR <
>>>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Natty,
>>>>>>>>>>
>>>>>>>>>> This is my entire config file.
>>>>>>>>>>
>>>>>>>>>> # Name the components on this agent
>>>>>>>>>> a1.sources = r1
>>>>>>>>>> a1.sinks = k1
>>>>>>>>>> a1.channels = c1
>>>>>>>>>>
>>>>>>>>>> # Describe/configure the source
>>>>>>>>>> a1.sources.r1.type = exec
>>>>>>>>>> a1.sources.r1.command = tail -F /data/logs/test_log
>>>>>>>>>> a1.sources.r1.restart = true
>>>>>>>>>> a1.sources.r1.logStdErr = true
>>>>>>>>>>
>>>>>>>>>> #a1.sources.r1.batchSize = 2
>>>>>>>>>>
>>>>>>>>>> a1.sources.r1.interceptors = i1
>>>>>>>>>> a1.sources.r1.interceptors.i1.type = regex_filter
>>>>>>>>>> a1.sources.r1.interceptors.i1.regex = resuming normal operations|Received|Response
>>>>>>>>>>
>>>>>>>>>> #a1.sources.r1.interceptors = i2
>>>>>>>>>> #a1.sources.r1.interceptors.i2.type = timestamp
>>>>>>>>>> #a1.sources.r1.interceptors.i2.preserveExisting = true
>>>>>>>>>>
>>>>>>>>>> # Describe the sink
>>>>>>>>>> a1.sinks.k1.type = hdfs
>>>>>>>>>> a1.sinks.k1.hdfs.path = hdfs://testing.sck.com:9000/running/test.sck/date=%Y-%m-%d
>>>>>>>>>> a1.sinks.k1.hdfs.writeFormat = Text
>>>>>>>>>> a1.sinks.k1.hdfs.fileType = DataStream
>>>>>>>>>> a1.sinks.k1.hdfs.filePrefix = events-
>>>>>>>>>> a1.sinks.k1.hdfs.rollInterval = 600
>>>>>>>>>> ## need to run hive queries at random times to check the long-running
>>>>>>>>>> ## process, so we need to commit events to hdfs files regularly
>>>>>>>>>> a1.sinks.k1.hdfs.rollCount = 0
>>>>>>>>>> a1.sinks.k1.hdfs.batchSize = 10
>>>>>>>>>> a1.sinks.k1.hdfs.rollSize = 0
>>>>>>>>>> a1.sinks.k1.hdfs.useLocalTimeStamp = true
>>>>>>>>>>
>>>>>>>>>> # Use a channel which buffers events in memory
>>>>>>>>>> a1.channels.c1.type = memory
>>>>>>>>>> a1.channels.c1.capacity = 10000
>>>>>>>>>> a1.channels.c1.transactionCapacity = 10000
>>>>>>>>>>
>>>>>>>>>> # Bind the source and sink to the channel
>>>>>>>>>> a1.sources.r1.channels = c1
>>>>>>>>>> a1.sinks.k1.channel = c1
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 14 July 2014 22:54, Jonathan Natkins <natty@streamsets.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Saravana,
>>>>>>>>>>>
>>>>>>>>>>> What does your sink configuration look like?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Natty
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jul 11, 2014 at 11:05 PM, SaravanaKumar TR <
>>>>>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Assuming each line in the logfile is considered an event by
>>>>>>>>>>>> Flume:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. Is there any maximum event size defined for the memory/file
>>>>>>>>>>>> channel, such as a maximum number of characters in a line?
>>>>>>>>>>>> 2. Does Flume support all data formats to be processed as
>>>>>>>>>>>> events, or are there any limitations?
>>>>>>>>>>>>
>>>>>>>>>>>> I am still trying to understand why Flume stops
>>>>>>>>>>>> processing events after some time.
>>>>>>>>>>>>
>>>>>>>>>>>> Can someone please help me out here?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> saravana
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 11 July 2014 17:49, SaravanaKumar TR <saran0081986@gmail.com
>>>>>>>>>>>> > wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am new to Flume and using Apache Flume 1.5.0. Quick setup
>>>>>>>>>>>>> explanation here.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Source: exec, tail -F command for a logfile.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Channel: tried with both memory & file channel
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sink: HDFS
>>>>>>>>>>>>>
>>>>>>>>>>>>> When Flume starts, events are processed properly and
>>>>>>>>>>>>> moved to HDFS without any issues.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But after some time Flume suddenly stops sending events to
>>>>>>>>>>>>> HDFS.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am not seeing any errors in the logfile flume.log either. Please
>>>>>>>>>>>>> let me know if I am missing any configuration here.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Below is the channel configuration I defined; I left the
>>>>>>>>>>>>> remaining settings at their default values.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> a1.channels.c1.type = FILE
>>>>>>>>>>>>>
>>>>>>>>>>>>> a1.channels.c1.transactionCapacity = 100000
>>>>>>>>>>>>>
>>>>>>>>>>>>> a1.channels.c1.capacity = 10000000
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Saravana
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
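
For readers landing on this thread: Natty's advice comes down to raising -Xmx and remembering that the memory channel's byteCapacity defaults to 80% of the heap. A rough sketch of both follows. It assumes the agent is started through bin/flume-ng with "-c /d0/flume/conf" as shown above, so that a conf/flume-env.sh in that directory is picked up and its JAVA_OPTS replaces the -Xmx20m default. The channel_budget helper is hypothetical (not part of Flume), and the further 20% reduction it applies assumes the memory channel's byteCapacityBufferPercentage is left at its documented default. The JVM's reported maximum heap is usually a little below -Xmx, so treat the numbers as estimates only.

```shell
# Hypothetical helper (not part of Flume): approximate bytes available to
# event bodies in a memory channel that leaves byteCapacity at its default.
channel_budget() {
  heap_bytes=$(( $1 * 1024 * 1024 ))            # heap size given in MB
  byte_capacity=$(( heap_bytes * 80 / 100 ))    # default byteCapacity: 80% of max heap
  echo $(( byte_capacity * (100 - 20) / 100 ))  # minus the assumed 20% header buffer
}

HEAP_MB=1024   # 1 GB, the value suggested in the thread

# This export is the only line that would go into conf/flume-env.sh.
export JAVA_OPTS="-Xmx${HEAP_MB}m"

echo "20 MB heap: $(channel_budget 20) bytes for event bodies"
echo "${HEAP_MB} MB heap: $(channel_budget "$HEAP_MB") bytes for event bodies"
```

With the 20 MB default this works out to roughly 12.8 MB for event bodies across the whole channel, and that leaves out the rest of the agent's memory use, so a few multi-megabyte lines could plausibly exhaust the heap.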
