chukwa-user mailing list archives

From Eric Yang <>
Subject Re: ZIP format/ S3 storage system support
Date Wed, 10 Jan 2018 15:38:00 GMT
Chukwa is designed to tail log files and stream the delta to HDFS as soon
as the log is written.  An artificial delay of a few milliseconds is put in
place to throttle CPU usage.  We haven't done much performance analysis,
but there is probably a lot of room for optimizations to make Chukwa
faster.  It would be interesting to see support for compressed logs, and
patches are welcome.
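
To illustrate the tail-and-stream idea described above, here is a minimal
sketch in Python.  It is not Chukwa's actual code (Chukwa is written in
Java), and the names read_delta, tail, delay_s and max_polls are
hypothetical, chosen only for this example.  The sink callback stands in
for whatever would write the bytes to HDFS.

```python
import time


def read_delta(path, offset):
    """Return (new_bytes, new_offset) for data appended to path since offset."""
    with open(path, "rb") as f:
        f.seek(0, 2)              # jump to end of file to learn its size
        size = f.tell()
        if size < offset:
            offset = 0            # assumption: a smaller file means truncation/rotation
        f.seek(offset)
        data = f.read()
    return data, offset + len(data)


def tail(path, sink, delay_s=0.005, max_polls=None):
    """Poll path and pass each newly appended chunk to sink.

    delay_s is the artificial pause between polls that keeps the loop
    from spinning the CPU; max_polls bounds the loop (None = run forever).
    """
    offset = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        data, offset = read_delta(path, offset)
        if data:
            sink(data)            # hypothetical stand-in for the HDFS writer
        time.sleep(delay_s)       # throttle CPU usage
        polls += 1
    return offset
```

For example, tail("app.log", print, max_polls=100) would print whatever is
appended to app.log over roughly half a second.  A compressed log cannot be
tailed this way byte for byte, which is probably part of why compressed
input would need its own support.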


On Wed, Jan 10, 2018 at 5:18 AM, Popa Nicolae <> wrote:

> Thank you for your response.
> My target is to create a tool capable of grepping through large data sets
> of logs (e.g. sets ranging from 1 TB upwards) and answering queries in a
> reasonable amount of time (e.g. from seconds to several minutes, at most
> 1 hour). The logs are placed in S3 (e.g. they are produced by EMR jobs)
> in a compressed format (e.g. gzip or LZO). I expect some performance
> tuning will be needed in the end in order to accomplish my performance
> targets.
> I don't know your current roadmap, but I would like to contribute to Chukwa
> by providing support for reading/storing compressed logs in different
> formats (e.g. gzip, bzip2, LZO, Snappy, etc.). Moreover, I will test Chukwa
> with S3 as an input source, see if it works, and contribute here too if
> necessary. Are you interested in this kind of contribution? Does your
> roadmap include any performance tuning tasks?
> Nicolae
> On 9 January 2018 at 18:33, Popa Nicolae <> wrote:
>> Hello guys,
>> I am new to Apache Chukwa and I am exploring the possibility of using it
>> for one of my use cases. While reading the documentation I didn't
>> find any mention of ZIP format support or the S3 storage system.
>> 1. Does Chukwa support reading and storing ZIP archives?
>> 2. Besides the HDFS file system, does Chukwa support reading from and
>> writing to Amazon S3 storage?
>> Thank you,
>> Flavian
