avro-user mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: Serializing / Deserializing Java Objects
Date Wed, 16 Jun 2010 02:44:56 GMT
This iteration of the Specific API simply has public fields that are intended to be set directly.
 

The current best practice is to use wrapper classes or static helpers to interact with the
generated objects so that most of your code is abstracted from the implementation details.
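
A minimal sketch of the wrapper idea, under loud assumptions: GeneratedQueueItem is a hypothetical stand-in for a generated class (it uses List<String> rather than GenericArray<Utf8> so that it compiles without the Avro jar), and QueueItemWrapper is an invented name, not something Avro provides.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a generated class: one public field plus
// index-based get/put. This is NOT the real Avro-generated code.
class GeneratedQueueItem {
    public List<String> Columns;

    public Object get(int field) {
        switch (field) {
            case 0: return Columns;
            default: throw new IllegalArgumentException("Bad index");
        }
    }

    @SuppressWarnings("unchecked")
    public void put(int field, Object value) {
        switch (field) {
            case 0: Columns = (List<String>) value; break;
            default: throw new IllegalArgumentException("Bad index");
        }
    }
}

// Wrapper that keeps the public field and the field index out of calling code.
final class QueueItemWrapper {
    private final GeneratedQueueItem record = new GeneratedQueueItem();

    public void setColumns(List<String> columns) {
        // Copy so later changes to the caller's list do not leak into the record.
        record.Columns = new ArrayList<>(columns);
    }

    public List<String> getColumns() {
        return record.Columns == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(record.Columns);
    }

    // The raw record stays reachable for whatever layer does the serialization.
    GeneratedQueueItem unwrap() {
        return record;
    }
}
```

Application code would then call setColumns/getColumns only, so a later change to the generated class (for example, generated getters/setters) would be absorbed in one place.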
 

put(field, value) is there more for other internal Avro code than for users -- specifically,
it allows a ResolvingDecoder to automatically figure out where the data goes if the reader's
and writer's schemas differ.
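
To illustrate why a positional put() is convenient for that kind of resolution, here is a simplified sketch. It is not how ResolvingDecoder is implemented; ReaderRecord, PositionResolver, and the "id"/"name" fields are all hypothetical, and the writer's data is modelled as a plain ordered map rather than encoded bytes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical reader-side record whose fields are addressed by position.
class ReaderRecord {
    // Field order in the reader's (hypothetical) schema.
    static final List<String> FIELD_NAMES = Arrays.asList("id", "name");

    long id;
    String name;

    void put(int field, Object value) {
        switch (field) {
            case 0: id = (Long) value; break;
            case 1: name = (String) value; break;
            default: throw new IllegalArgumentException("Bad index: " + field);
        }
    }
}

final class PositionResolver {
    // Copies values that arrive in the writer's field order into the reader's
    // record. Writer fields that the reader does not declare are skipped.
    static ReaderRecord resolve(Map<String, Object> writerFieldsInOrder) {
        ReaderRecord record = new ReaderRecord();
        for (Map.Entry<String, Object> e : writerFieldsInOrder.entrySet()) {
            int pos = ReaderRecord.FIELD_NAMES.indexOf(e.getKey());
            if (pos >= 0) {
                record.put(pos, e.getValue());
            }
        }
        return record;
    }
}
```

The point of the sketch is that the name-to-position lookup happens inside library-style code, so user code never needs to hard-code the index itself.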

Definitely do NOT depend on the 'magic number' in your code. We should document that better.
There is some discussion about the future of the Specific API so that it can generate getters/setters
and provide user-controlled features -- potentially something like whether to use String[],
List<String>, or Utf8[], etc. to represent data in memory. More suggestions on how
to improve the API are welcome.

-Scott

On Jun 15, 2010, at 7:32 PM, Bradford Stephens wrote:

> Another thing to help me understand the Avro philosophy...
> 
> When doing, public void put(int field$, java.lang.Object value$)
> 
> Why is field an integer?
> 
> For instance, I have a String[] Column in my object. In protobuf, it
> would generate java methods like .putColumn(String[] item). Is there a
> reason avro can't do that? Or did I run the code generator in
> avro-tools wrong?
> 
> If that doesn't work, could we generate an enum of field names to pass
> in, instead? I don't like having to know "Magic Numbers" :)
> 
> Cheers,
> B
> 
> 
> 
> 
> On Tue, Jun 15, 2010 at 7:26 PM, Bradford Stephens
> <bradfordstephens@gmail.com> wrote:
>> That's.... erm, kinda bizarre.
>> 
>> But hey, it works! Thanks!
>> 
>> 
>> 
>> On Tue, Jun 15, 2010 at 6:56 PM, Scott Carey <scott@richrelevance.com> wrote:
>>> Use GenericArray.  The schema given to the generic array is not the schema of
>>> the member elements, but the actual array schema (yes it is confusing).
>>> 
>>> new GenericData.Array<Utf8>(size, Schema.createArray(Schema.create(Type.STRING)));
>>> 
>>> It would be useful to be able to simply use Utf8[] or List<Utf8> for the
>>> Specific API, but at this time it leverages GenericData.
>>> 
>>> 
>>> On Jun 15, 2010, at 6:25 PM, Bradford Stephens wrote:
>>> 
>>>> That makes sense -- I'm getting errors during serialization, though.
>>>> 
>>>> I convert my List<String> to Utf8[].
>>>> 
>>>> I then do a QueueItem.put() and get "Exception in thread "main"
>>>> java.lang.ClassCastException: [Lorg.apache.avro.util.Utf8; cannot be
>>>> cast to org.apache.avro.generic.GenericArray"
>>>> 
>>>> How do I handle this Java->Avro interop? It seems to me that it should
>>>> be a lot simpler...
>>>> 
>>>> If I try to create a GenericArray<Utf8> and add items to that, it
>>>> complains that my schema doesn't look right...so that doesn't feel
>>>> like the right path.
>>>> 
>>>> My generated class looks like this:
>>>> 
>>>> @SuppressWarnings("all")
>>>> public class QueueItem extends
>>>> org.apache.avro.specific.SpecificRecordBase implements
>>>> org.apache.avro.specific.SpecificRecord {
>>>>  public static final org.apache.avro.Schema SCHEMA$ =
>>>> org.apache.avro.Schema.parse("{\"type\":\"record\",\"name\":\"QueueItem\",\"namespace\":\"com.dts\",\"fields\":[{\"name\":\"Columns\",\"type\":[\"null\",{\"type\":\"array\",\"items\":\"string\"}]}]}");
>>>> 
>>>>  public org.apache.avro.generic.GenericArray<org.apache.avro.util.Utf8>
>>>> Columns;
>>>>  public org.apache.avro.Schema getSchema() { return SCHEMA$; }
>>>>  public java.lang.Object get(int field$) {
>>>> 
>>>> 
>>>>    switch (field$) {
>>>>    case 0: return Columns;
>>>>    default: throw new org.apache.avro.AvroRuntimeException("Bad index");
>>>>    }
>>>>  }
>>>>  @SuppressWarnings(value="unchecked")
>>>>  public void put(int field$, java.lang.Object value$) {
>>>>    switch (field$) {
>>>>    case 0: Columns =
>>>> (org.apache.avro.generic.GenericArray<org.apache.avro.util.Utf8>)value$;
>>>> break;
>>>>    default: throw new org.apache.avro.AvroRuntimeException("Bad index");
>>>>    }
>>>>  }
>>>> }
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Tue, Jun 15, 2010 at 8:57 AM, Philip Zeyliger <philip@cloudera.com> wrote:
>>>>> Hi Bradford,
>>>>> I believe you use a SpecificDatumReader.  Something like:
>>>>> 
>>>>>   final static SpecificDatumReader<QueueItem> QUEUE_ITEM_READER =
>>>>>       new SpecificDatumReader<QueueItem>(QueueItem.class);
>>>>>   QueueItem q = QUEUE_ITEM_READER.read(null, decoder);
>>>>> There doesn't seem to be a test that exercises this code path in an
>>>>> explanatory way, but java/src/java/org/apache/avro/ipc/Requestor.java uses
>>>>> something quite similar.
>>>>> -- Philip
>>>>> 
>>>>> On Mon, Jun 14, 2010 at 9:20 PM, Bradford Stephens
>>>>> <bradfordstephens@gmail.com> wrote:
>>>>>> 
>>>>>> Greetings,
>>>>>> 
>>>>>> I've poked around for examples of this, but I can't find any. I
>>>>>> imagine it's a fairly common use case.
>>>>>> 
>>>>>> I'm serializing some simple objects into bytes for placement onto
>>>>>> RabbitMQ. My java class is pretty simple (but it'll grow in complexity
>>>>>> in time):
>>>>>> 
>>>>>> {
>>>>>> String[] Columns;
>>>>>> }
>>>>>> 
>>>>>> 
>>>>>> So, I made a .json schema containing this:
>>>>>> {
>>>>>>      "namespace": "com.dts",
>>>>>>      "name": "QueueItem",
>>>>>>      "type": "record",
>>>>>>      "fields": [
>>>>>>          {"name": "Columns", "type": ["null", {"type": "array",
>>>>>> "items":"string"}]}
>>>>>>      ]
>>>>>> }
>>>>>> 
>>>>>> 
>>>>>> And generated a java class ...
>>>>>> 
>>>>>> Now, I'm writing a test to serialize and deserialize some items. I can
>>>>>> figure out the serialization, but not deserialization back to objects.
>>>>>> 
>>>>>>        Schema s = Schema.parse(new File("queuetype.json"));
>>>>>> 
>>>>>>            ByteArrayOutputStream bao = new ByteArrayOutputStream();
>>>>>>            GenericDatumWriter w = new GenericDatumWriter(s);
>>>>>>            Encoder e = new BinaryEncoder(bao);
>>>>>>            e.init (bao);
>>>>>> 
>>>>>> 
>>>>>>            QueueItem r = new QueueItem();
>>>>>>            r.put(0, items);
>>>>>>            w.write(r, e);
>>>>>>            e.flush();
>>>>>> 
>>>>>>            ByteArrayInputStream is = new
>>>>>> ByteArrayInputStream(bao.toByteArray());
>>>>>>            DecoderFactory df = new DecoderFactory();
>>>>>>            Decoder d = df.createBinaryDecoder(is, null);
>>>>>> 
>>>>>>            QueueItem itemout = (QueueItem)  .....
>>>>>> 
>>>>>> 
>>>>>> And that's what I can't figure out -- how do I use a decoder method to
>>>>>> create QueueItems?
>>>>>> 
>>>>>> Cheers,
>>>>>> B
>>>>>> 
>>>>>> Bradford Stephens,
>>>>>> Founder, Drawn to Scale
>>>>>> drawntoscalehq.com
>>>>>> 727.697.7528
>>>>>> 
>>>>>> http://www.drawntoscalehq.com --  The intuitive, cloud-scale data
>>>>>> solution. Process, store, query, search, and serve all your data.
>>>>>> 
>>>>>> http://www.roadtofailure.com -- The Fringes of Scalability, Social
>>>>>> Media, and Computer Science
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
>> 
>> 
> 
> 
> 

