incubator-chukwa-user mailing list archives

From Jerome Boulon <>
Subject Re: SocketTeeWriter
Date Tue, 11 May 2010 00:18:12 GMT
I may soon start working on a chunk forwarder that runs at the
collector level, specifically to send monitoring metrics to a real-time system.
The requirements are the same as the ones I have for my client library.
It should be able to:
- filter the data that will be sent over
- buffer some chunks in memory
- retry/fail over automatically
- drop data on the floor if it cannot be sent over
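
A minimal sketch of how a forwarder could meet the four requirements above, in Java. Every name in it (ChunkForwarderSketch, Sink, offer, flush, droppedCount) is hypothetical and is not taken from Chukwa or Honu. Chunks are modelled as plain strings to keep it short, and a sink is assumed to report failure by returning false or throwing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Predicate;

// Hypothetical sketch only; not the Chukwa or Honu implementation.
class ChunkForwarderSketch {

    /** Hypothetical downstream endpoint; returns false if delivery failed. */
    interface Sink {
        boolean send(String chunk);
    }

    private final Predicate<String> filter;     // selects the chunks worth forwarding
    private final BlockingQueue<String> buffer; // bounded in-memory buffer
    private final List<Sink> sinks;             // primary first, then fail-over targets
    private final int maxAttemptsPerSink;       // retries before moving to the next sink
    private final AtomicLong dropped = new AtomicLong();

    ChunkForwarderSketch(Predicate<String> filter, int capacity,
                         List<Sink> sinks, int maxAttemptsPerSink) {
        this.filter = filter;
        this.buffer = new ArrayBlockingQueue<>(capacity);
        this.sinks = new ArrayList<>(sinks);
        this.maxAttemptsPerSink = maxAttemptsPerSink;
    }

    /** Called on the write path; never blocks the caller. */
    void offer(String chunk) {
        if (!filter.test(chunk)) {
            return;                      // filtered out, not counted as dropped
        }
        if (!buffer.offer(chunk)) {
            dropped.incrementAndGet();   // buffer full: drop on the floor
        }
    }

    /** Drains the buffer and returns how many chunks were delivered. */
    int flush() {
        int delivered = 0;
        String chunk;
        while ((chunk = buffer.poll()) != null) {
            if (deliver(chunk)) {
                delivered++;
            } else {
                dropped.incrementAndGet(); // every sink failed: drop on the floor
            }
        }
        return delivered;
    }

    /** Tries each sink in order, retrying before failing over to the next one. */
    private boolean deliver(String chunk) {
        for (Sink sink : sinks) {
            for (int attempt = 0; attempt < maxAttemptsPerSink; attempt++) {
                try {
                    if (sink.send(chunk)) {
                        return true;
                    }
                } catch (RuntimeException e) {
                    // treated as a failed attempt; fall through to retry
                }
            }
        }
        return false;
    }

    long droppedCount() {
        return dropped.get();
    }

    public static void main(String[] args) {
        List<String> received = new ArrayList<>();
        Sink primaryDown = chunk -> false;             // simulates an unreachable primary
        Sink backup = chunk -> received.add(chunk);    // simulates a healthy backup
        ChunkForwarderSketch forwarder = new ChunkForwarderSketch(
                chunk -> chunk.startsWith("metric"), 2,
                Arrays.asList(primaryDown, backup), 2);
        forwarder.offer("metric cpu=0.7");
        forwarder.offer("log some unrelated line");
        forwarder.flush();
        System.out.println("received=" + received + " dropped=" + forwarder.droppedCount());
    }
}
```

The buffer is bounded and offer() never blocks, so a slow or unavailable real-time system cannot stall the collector's normal write path; the cost is that data is lost once the buffer fills or all sinks fail.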

I was planning to reuse most of the code that I have built for Honu here at

I've reached my 2nd milestone on Honu (sending 700M rows per day), so this
should be the next thing on my to-do list.

- So what is your time frame for getting something?
- Is the SocketTee production ready (I mean, able to
support more than 300M events/day)?


On 5/10/10 4:37 PM, "Ariel Rabkin" <> wrote:

> That's how we use it at Berkeley, to process metrics from hundreds of
> machines; total data rate less than a megabyte per second, though.
> What scale of data are you looking at?
> SocketTee is intended for cases where you need some subset of the data now,
> while write-to-HDFS-and-process-with-Hadoop is still the default path.
> What sort of low-latency processing do you need?
> --Ari
> On Mon, May 10, 2010 at 4:28 PM, Corbin Hoenes <> wrote:
>> Has anyone used the "Tee" in a larger-scale deployment to try to get
>> real-time/low-latency data?  Interested in how feasible it would be to use it
>> to pipe data into another system to handle these low-latency requests and
>> leave the long-term analysis to Hadoop.
