flume-user mailing list archives

From zzz <squiggly...@gmail.com>
Subject Re: getting Avro into Flume
Date Thu, 18 Sep 2014 00:25:58 GMT
Thanks for the quick reply Hari.

When you say send data to Flume using the RPC Client API, do you mean send
it to the Avro Source? If not, which source? Because that is currently what
I am trying to do. I wasn't sure whether encoding Avro data as byte[] and
sending it to the Avro Source was a valid approach, but from what you are
saying there is a way for sinks (at least the HDFS sink) to recognize
the encoded Avro data. I hope the Solr sink can be made to work similarly.

Would encoding the Avro data as byte[] and sending it to Flume via the HTTP
Source also work?

I was actually having trouble converting an Avro object to a byte[]
to start with, but I will try that again.
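
For reference, here is a minimal sketch of the approach as I understand it:
binary-encode an Avro record into a byte[] and hand it to the Flume RPC
client as the event body. The schema, the class name AvroToFlume, the host
and the port 41414 are all placeholders I made up for illustration, and the
code assumes the avro and flume-ng-sdk jars are on the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.api.RpcClient;
import org.apache.flume.api.RpcClientFactory;
import org.apache.flume.event.EventBuilder;

public class AvroToFlume {
    // Hypothetical schema, for illustration only.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Doc\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"long\"},"
        + "{\"name\":\"text\",\"type\":\"string\"}]}");

    // Builds a small sample record against the placeholder schema.
    static GenericRecord sampleRecord() {
        GenericRecord record = new GenericData.Record(SCHEMA);
        record.put("id", 1L);
        record.put("text", "hi");
        return record;
    }

    // Serializes one record with Avro's raw binary encoding (no schema header).
    static byte[] toBytes(GenericRecord record) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(record.getSchema()).write(record, encoder);
        encoder.flush(); // the encoder buffers, so flush before reading the bytes
        return out.toByteArray();
    }

    // Sends the bytes as the body of a single Flume event to an Avro Source.
    static void send(String host, int port, byte[] body) throws EventDeliveryException {
        RpcClient client = RpcClientFactory.getDefaultInstance(host, port);
        try {
            Event event = EventBuilder.withBody(body);
            client.append(event);
        } finally {
            client.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumes an agent with an Avro Source listening on localhost:41414.
        send("localhost", 41414, toBytes(sampleRecord()));
    }
}
```

Note that the raw binary encoding does not carry the schema with it, so
whatever reads the event body downstream has to know the schema already.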

On Thu, Sep 18, 2014 at 10:16 AM, Hari Shreedharan <
hshreedharan@cloudera.com> wrote:

> No, the Avro Source is an RPC source. To send data to Flume use the RPC
> client API (https://flume.apache.org/FlumeDeveloperGuide.html#client).
> Just encode your Avro data as byte[] and use the AVRO_EVENT serializer
> while writing to HDFS.
> Thanks,
> Hari
> On Wed, Sep 17, 2014 at 5:13 PM, zzz <squiggly101@gmail.com> wrote:
>> I am using Cloudera CDH 5.1 and running a Flume agent configured by
>> Cloudera Manager.
>> I would like to send Avro data to Flume, and I was assuming the Avro
>> Source would be the appropriate method to send data in this way.
>> However, the examples of Java clients that send data via the Avro Source
>> send simple strings, not Avro objects to be serialized, e.g. the example
>> here: https://flume.apache.org/FlumeDeveloperGuide.html
>> And the examples of Avro serialization all seem to be about serializing to
>> disk.
>> In my use case, I am basically receiving a real-time stream of JSON
>> documents, which I am able to convert to Avro objects, and would like to
>> put them into Flume. I would then like to be able to index this Avro data
>> in Solr via the Solr sink, and convert it to Parquet format in HDFS using
>> the HDFS sink.
>> Is this possible or am I going about this the wrong way?
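
The setup discussed in this thread (an Avro Source receiving RPC events, and
an HDFS sink writing them with the AVRO_EVENT serializer) might be configured
roughly as below. This is only a sketch: the agent and component names (a1,
r1, c1, k1), the port and the HDFS path are placeholders, not values taken
from the thread, and a Cloudera Manager deployment may name things
differently.

```properties
# Hypothetical agent/component names, port and path, for illustration only.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Avro Source: the RPC endpoint that the Flume client API talks to.
a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 41414
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

# HDFS sink writing events with the avro_event serializer.
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.serializer = avro_event
```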
