incubator-chukwa-user mailing list archives

From 陈镇海 <higher1...@gmail.com>
Subject Re: UDPAdaptor
Date Fri, 30 Dec 2011 07:35:45 GMT
Thanks for your help! ^_^

On December 30, 2011 at 3:16 PM, Eric Yang <eric818@gmail.com> wrote:
> Data is discarded if the in-memory queue is full.  The current implementation is designed to
preserve the system rather than the data.  If you want full reliability, I recommend
writing to a file and using the utf8filetailing adaptor so that the transport of every entry is
tracked.  In production there are usually many collectors, for both high availability
and throughput, so agents are unlikely to fill up the in-memory queue.  However, there is
still room for improvement, e.g. adding a policy that chooses whether to discard the most recent or the oldest data.
Patches are welcome. :)
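
The improvement mentioned here, choosing whether a full queue discards the newest or the oldest data, could look roughly like the following. This is a minimal sketch only: `BoundedChunkBuffer` and `DropPolicy` are hypothetical names and are not part of Chukwa, and entries are plain strings rather than Chukwa chunks.

```java
import java.util.ArrayDeque;

/** Hypothetical bounded buffer with a configurable drop policy; not a Chukwa class. */
final class BoundedChunkBuffer {
    public enum DropPolicy { DROP_NEWEST, DROP_OLDEST }

    private final ArrayDeque<String> queue = new ArrayDeque<>();
    private final int capacity;
    private final DropPolicy policy;
    private long dropped = 0;

    BoundedChunkBuffer(int capacity, DropPolicy policy) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.policy = policy;
    }

    /** Returns true if the entry was stored, false if it was discarded. */
    synchronized boolean offer(String entry) {
        if (queue.size() < capacity) {
            queue.addLast(entry);
            return true;
        }
        // Queue is full: exactly one entry is lost, either the incoming or the oldest one.
        dropped++;
        if (policy == DropPolicy.DROP_OLDEST) {
            queue.pollFirst();
            queue.addLast(entry);
            return true;
        }
        return false;
    }

    /** Removes and returns the oldest stored entry, or null if the buffer is empty. */
    synchronized String poll() {
        return queue.pollFirst();
    }

    synchronized long droppedCount() {
        return dropped;
    }

    synchronized int size() {
        return queue.size();
    }
}
```

Either policy keeps memory use bounded, which matches the goal of preserving the system; the counter only makes the loss visible so it can be reported.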
>
> Sent from my iPhone
>
> On Dec 29, 2011, at 10:38 PM, 陈镇海 <higher1128@gmail.com> wrote:
>
>> Hi Eric,
>> When no collector is available, data is stored in the in-memory queue. In this
>> case, if the amount of data is large and memory is limited,
>> will the agent run out of memory, and will the data be lost?
>>
>> 2011/12/30 Eric Yang <eric818@gmail.com>:
>>> Hi,
>>>
>>> Data is stored in the Agent's in-memory queue.  The Agent queues messages if no
>>> collector is available.  The data is out of order in
>>> chukwa/repos because it does not contain a time stamp.  The demux
>>> parser does not know how to sort the given data, hence the data is
>>> stored in random order.  You might be able to improve the order of the
>>> data by modifying the demux parser to use SeqID for ordering, to recover the
>>> original order.  Hope this helps.
>>>
>>> regards,
>>> Eric
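
The SeqID suggestion can be illustrated with a small sketch. It assumes that SeqID increases with the order in which a single adaptor received the data; `Entry` and `SeqIdOrdering` are hypothetical stand-ins, not Chukwa's demux classes, and a real demux parser change would have to work with Chukwa's own record types.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SeqIdOrdering {
    /** Hypothetical stand-in for a parsed entry; not a Chukwa class. */
    static final class Entry {
        final long seqId;
        final String body;

        Entry(long seqId, String body) {
            this.seqId = seqId;
            this.body = body;
        }
    }

    /** Returns a copy of the entries sorted by ascending SeqID; the input list is not modified. */
    static List<Entry> inOriginalOrder(List<Entry> entries) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingLong((Entry e) -> e.seqId));
        return sorted;
    }
}
```

Sorting on SeqID alone is only meaningful within one stream; data from several adaptors or agents would need an additional key, such as the source, to be ordered sensibly.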
>>>
>>> On Thu, Dec 29, 2011 at 6:22 PM, 陈镇海 <higher1128@gmail.com> wrote:
>>>> Hello,
>>>> I'm using chukwa-0.4.0. The agent and collector are on the same
>>>> machine. When I use UDPAdaptor, I run into a problem.
>>>> The initial_adaptor entry is "add UDPAdaptor Packets 1234 0". After
>>>> starting the agent, the collector, and start_data_processor, I used "nc" to send
>>>> some data to this UDP port as follows:
>>>> echo "hello" | nc -u 127.0.0.1 1234
>>>> echo "world" | nc -u 127.0.0.1 1234
>>>> echo "this is a test" | nc -u 127.0.0.1 1234
>>>> echo "good job" | nc -u 127.0.0.1 1234
>>>> echo "OK" | nc -u 127.0.0.1 1234
>>>> After it had run for a while, I found that something had been written to HDFS. In
>>>> the directory "/chukwa/dataSinkArchives", the data was written
>>>> in the correct order. But in the directory "/chukwa/repos", the
>>>> data was written in the wrong order, as follows:
>>>> ............body this is  a test
>>>> ............body OK
>>>> ............body good job
>>>> ............body hello
>>>> ............body world
>>>> How did this happen?
>>>> Another question: when I keep the agent running, stop the collector,
>>>> and continue to send data to the UDP port, then start the
>>>> collector again after a while, I find that the data was not lost. I would like to know how and where
>>>> the data is stored.
>>>> Thanks a lot.
