avro-user mailing list archives

From William Briggs <wrbri...@gmail.com>
Subject Re: Converting Protobuf object to Avro
Date Mon, 24 Aug 2015 23:47:21 GMT
Have you looked at the ProtobufData utility class? Its getSchema method
might do the trick for you:
http://avro.apache.org/docs/1.6.1/api/java/org/apache/avro/protobuf/ProtobufData.html#getSchema(java.lang.Class)
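
A minimal, untested sketch of how that could fit together, assuming the
avro-protobuf and protobuf-java jars are on the classpath. The class name
ProtoToAvroFile and the method writeContainer are made up for illustration;
ProtobufData, ProtobufDatumWriter, and DataFileWriter are the Avro classes
discussed in this thread.

```java
import com.google.protobuf.Message;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.protobuf.ProtobufData;
import org.apache.avro.protobuf.ProtobufDatumWriter;

import java.io.File;
import java.io.IOException;

// Hypothetical helper, not part of Avro.
public class ProtoToAvroFile {

    // Writes the given protobuf messages to an Avro container file.
    public static <T extends Message> void writeContainer(
            Class<T> protoClass, Iterable<T> messages, File out) throws IOException {
        // Derive the Avro schema from the generated protobuf class.
        Schema schema = ProtobufData.get().getSchema(protoClass);

        ProtobufDatumWriter<T> datumWriter = new ProtobufDatumWriter<>(protoClass);

        // DataFileWriter adds the container header (magic bytes, schema, sync marker).
        try (DataFileWriter<T> fileWriter = new DataFileWriter<>(datumWriter)) {
            fileWriter.create(schema, out);
            for (T message : messages) {
                fileWriter.append(message);
            }
        }
    }
}
```

With a generated class such as the MyProto from the original post, the call
would look like ProtoToAvroFile.writeContainer(MyProto.class, messages, file).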

On Mon, Aug 24, 2015, 4:56 PM Lan Jiang <ljiang2@gmail.com> wrote:

> Sean,
>
> Thanks for the reply.
>
> Your suggestion makes sense. The default example wraps a
> GenericDatumWriter in a DataFileWriter and then calls the create, append,
> and close methods on the DataFileWriter in sequence to write out the
> container file.
>
> My problem with using ProtobufDatumWriter in a similar fashion is that I
> do not have an Avro schema object to pass to
> dataFileWriter.create(schema, file). As I understand it, avro-protobuf
> should have a way to convert the protobuf schema to an Avro schema
> automatically, but I have not found any utility class that does the
> conversion. Correct me if I am wrong.
>
> Lan
>
>
>
> On Aug 24, 2015, at 3:14 PM, Sean Busbey <busbey@cloudera.com> wrote:
>
> Hiya Lan!
>
> You need to use a container file instead of just writing via the datum
> writer yourself.
>
> Take a look at the "Getting Started (Java)" section on serialization[1].
> The example there uses the GenericDatumWriter, but you ought to be able to
> switch it out for your ProtobufDatumWriter.
>
>
>
>
> [1]:
> http://avro.apache.org/docs/1.7.7/gettingstartedjava.html#Serializing-N101DE
>
> On Mon, Aug 24, 2015 at 12:54 PM, Lan Jiang <ljiang2@gmail.com> wrote:
>
>> Hi, there
>>
>> I am trying to convert a protobuf object to Avro. I am using
>>
>> //myProto object is deserialized using google protobuf API
>> ProtobufDatumWriter<MyProto> pbWriter = new
>> ProtobufDatumWriter<MyProto>(MyProto.class);
>> FileOutputStream fo = new FileOutputStream(args[0]);
>> Encoder e = EncoderFactory.get().binaryEncoder(fo, null);
>> pbWriter.write(myProto, e);
>> fo.flush();
>>
>> The avro file was created successfully. If I cat the file, I can see the
>> data in the file. However, when I tried to use avro-tools to get schema or
>> meta info about the saved avro file, it says
>>
>> Exception in thread "main" java.io.IOException: Not a data file.
>> at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:105)
>> at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
>> at org.apache.avro.tool.DataFileGetSchemaTool.run(DataFileGetSchemaTool.java:47)
>>
>> Looking at the Avro source code, the error means that the first 4 bytes
>> of the file do not match the 4 MAGIC bytes of a container file. I am
>> trying to see whether I have done anything wrong.
>>
>> Appreciate any help you can give me.
>>
>> Lan
>>
>
>
>
> --
> Sean
>
>
>
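
The "Not a data file" error in the original message is raised when a file
does not begin with the Avro container magic bytes, which are 'O', 'b', 'j'
followed by the byte 1. A small standalone check using only the JDK can
confirm whether a given output file has that header. The names
AvroMagicCheck and looksLikeAvroContainer are made up for this sketch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of Avro.
public class AvroMagicCheck {

    // Avro object container files start with these four bytes.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    // Returns true if the file starts with the Avro container magic bytes.
    public static boolean looksLikeAvroContainer(Path file) throws IOException {
        byte[] header = new byte[MAGIC.length];
        try (InputStream in = Files.newInputStream(file)) {
            int off = 0;
            while (off < header.length) {
                int n = in.read(header, off, header.length - off);
                if (n < 0) {
                    return false; // file is shorter than the magic header
                }
                off += n;
            }
        }
        return Arrays.equals(header, MAGIC);
    }
}
```

A file written by calling a datum writer directly with a binary encoder
contains only the encoded record, so this check would be expected to return
false for it.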
