chukwa-user mailing list archives

From Popa Nicolae <pnflav...@gmail.com>
Subject Re: ZIP format/ S3 storage system support
Date Wed, 10 Jan 2018 13:18:17 GMT
Thank you for your response.

My goal is to build a tool that can grep through large sets of logs (1 TB
and up) and answer queries in a reasonable amount of time (from seconds to
several minutes, at most 1 hour). The logs are stored in S3 (they are
produced by EMR jobs) in a compressed format (e.g. gzip or LZO). I expect
that some performance tuning will be needed in the end to reach these
targets.
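
To make the use case concrete, here is a minimal sketch of the kind of query
I have in mind. It assumes the gzip logs have already been copied from S3 to
local disk, and it is plain Python rather than Chukwa code. The function name
grep_gzip_logs and the file name in the usage note are hypothetical.

```python
import gzip
import re
from typing import Iterable, Iterator, Tuple


def grep_gzip_logs(paths: Iterable[str], pattern: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, line_number, line) for every line that matches pattern.

    Hypothetical helper for illustration only; it is not part of Chukwa.
    """
    regex = re.compile(pattern)
    for path in paths:
        # Text-mode gzip.open decompresses as the file is read, so a large
        # log is never held in memory all at once.
        with gzip.open(path, mode="rt", encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path, line_number, line.rstrip("\n")
```

For example, list(grep_gzip_logs(["job.log.gz"], r"ERROR")) would collect the
matching lines of one file. A single process like this will not meet the
targets above at 1 TB; the scan would have to be distributed.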

I don't know your current roadmap, but I would like to contribute to Chukwa
by adding support for reading and storing compressed logs in different
formats (e.g. gzip, bzip2, LZO, Snappy). I will also test Chukwa with S3 as
an input source to see whether it works, and contribute there too if
necessary. Are you interested in these kinds of contributions? Does your
roadmap include any performance tuning tasks?

Nicolae

On 9 January 2018 at 18:33, Popa Nicolae <pnflavian@gmail.com> wrote:

> Hello guys,
>
> I am new to Apache Chukwa and I was exploring the possibility to use it
> for one of my use cases. While I was reading the documentation I didn't
> find any mention about zip format support or S3 storage system.
>
> 1. Does Chukwa support reading and storing ZIP archives?
>
> 2. Besides HDFS file system, does Chukwa support reading/writing to Amazon
> S3 storage?
>
> Thank you,
> Flavian
>
>
