flume-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: Using Flume to process data
Date Wed, 03 Sep 2014 21:21:41 GMT
You should be able to accomplish this with the Morphline
interceptor[1]. It lets you build a configuration file that
converts from JSON to CSV. There's a similar example in the Kite
project[2], though its target is Avro rather than CSV. The full docs
for Morphlines will also be helpful[3].
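
To make the per-record transformation concrete, here is a rough sketch in
plain Python of what the conversion step would need to do: parse the JSON,
look up an id, and emit one CSV line. This is not Morphline configuration
and not a Flume API. The field names (user, action) and the in-memory
ID_LOOKUP table are assumptions standing in for your real schema and
database lookup.

```python
import csv
import io
import json

# Hypothetical stand-in for the database lookup mentioned in the question.
# A real deployment would query the database (ideally with caching).
ID_LOOKUP = {"alice": 101, "bob": 102}


def json_to_csv_line(raw, fields=("id", "user", "action")):
    """Convert one JSON document (a string) into a single CSV line."""
    record = json.loads(raw)
    # Enrich the record with the looked-up id; empty string if unknown.
    record["id"] = ID_LOOKUP.get(record.get("user"), "")
    buf = io.StringIO()
    # csv.writer takes care of quoting values that contain commas or quotes.
    writer = csv.writer(buf, lineterminator="")
    writer.writerow([record.get(f, "") for f in fields])
    return buf.getvalue()


if __name__ == "__main__":
    print(json_to_csv_line('{"user": "alice", "action": "login"}'))
```

In a Morphline, the same three steps (parse, enrich, format) would be
expressed as a chain of commands in the configuration file rather than as
a function.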


[1] http://flume.apache.org/FlumeUserGuide.html#morphline-interceptor
[2] https://github.com/kite-sdk/kite-examples/tree/master/json
[3] http://kitesdk.org/docs/current/kite-morphlines/index.html

On Wed, Sep 3, 2014 at 4:26 PM, Sid Ray <sid@fractalsciences.com> wrote:
> Can you guys please let me know if the following scenario is supported:
> I have a system in which there are Tomcat machines which have small JSON
> files of 2K size each. The goal is to take those files, convert them to CSV
> format and upload them to S3. Then from S3 they are loaded in parallel to
> Redshift.
> My idea of the architecture was that:
> TomcatServer1 ------+
>                     |
> TomcatServer2 ------+----> Flume ----> S3
> Is it possible in Flume to do the conversion from JSON files to CSV
> files? The idea is that we need to take the contents of the JSON file, do
> some database lookup, fetch the id and then create the CSV file out of that.
> Is it possible to do this processing in Flume?
> Also, what will the HA architecture of Flume look like? Any links etc.
> Thanks,
> Sid

Joey Echeverria
