avro-user mailing list archives

From Arvind Kalyan <bas...@gmail.com>
Subject Re: GenericData.Record vs specific generated avro object
Date Thu, 20 Feb 2014 16:32:58 GMT
My guess is that you are not deserializing it properly (if at all).

Can you share the relevant code that's within your mapper?
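
To illustrate what I mean: in the Flume schema you quoted, "body" has type "bytes", so with the generic API key.datum().get("body") should come back as a java.nio.ByteBuffer holding the JSON text. "row", "data", and "timestamp" live inside that JSON payload rather than being Avro fields of the record. Below is a minimal sketch of the decoding step, assuming the body is UTF-8 encoded JSON. BodyDecoder and decodeBody are hypothetical names, and parsing the resulting string into fields would need a JSON library of your choice (not shown).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Avro or Flume.
final class BodyDecoder {

    // Decodes the remaining bytes of the buffer as UTF-8 text (assumed encoding).
    // duplicate() leaves the caller's buffer position untouched.
    static String decodeBody(ByteBuffer body) {
        return StandardCharsets.UTF_8.decode(body.duplicate()).toString();
    }

    public static void main(String[] args) {
        // Stand-in for (ByteBuffer) key.datum().get("body") inside the mapper.
        ByteBuffer body = ByteBuffer.wrap(
            "{\"row\":\"000372d8\",\"timestamp\":1392380848474}"
                .getBytes(StandardCharsets.UTF_8));
        System.out.println(decodeBody(body));
    }
}
```

In the mapper, the call would look something like decodeBody((ByteBuffer) key.datum().get("body")). As far as I can tell, a class generated from your second schema would only help if the files had actually been written with that schema.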
 On Feb 19, 2014 9:53 PM, "AnilKumar B" <akumarb2010@gmail.com> wrote:

> Hi,
>
> I am trying to process Avro data using MapReduce. The data I receive is
> generated by Flume in Avro format, with the schema below.
>
>
> {"type":"record","name":"Event","fields":[{"name":"headers","type":{"type":"map","values":"string"}},{"name":"body","type":"bytes"}]}
>
>
> And data sample is as below:
>
> {"headers": {"timestamp": "1392825607332", "parentnode": "2014021909\/1392825638009"},
> "body": {"bytes":
> "{"row":"000372d8","data":{"x1":"v1","x2":"v2","x3":"v3"},"timestamp":1392380848474}"}}
>
> To use this data in MapReduce, I am reading it in the mapper as
> AvroKey<GenericData.Record>, NullWritable. I can see the whole message when
> I look at key.datum(), but I am unable to access fields such as "row",
> "data", and "timestamp".
>
>
> So how can I resolve this? Do I need to generate a specific Avro Java class
> for the schema below and use the generated class for processing in
> MapReduce, or should I use GenericData.Record itself?
>
>
> {
>   "namespace": "com.test.avro",
>   "type": "record",
>   "name": "Event",
>   "fields": [
>     {
>       "name": "row",
>       "type": "string"
>     },
>     {
>       "name": "data",
>       "type": {
>         "type": "map",
>         "values": "string"
>       }
>     },
>     {
>       "name": "timestamp",
>       "type": "string"
>     }
>   ]
> }
>
>
> Thanks & Regards,
> B Anil Kumar.
>
