avro-user mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: Serializing / Deserializing Java Objects
Date Wed, 16 Jun 2010 03:10:36 GMT

QueueItem myItem = new QueueItem();
GenericArray<Utf8> cols = new GenericData.Array<Utf8>( ... ) ...

Since the Columns field is public, instead of:

myItem.put(index, cols);

do:
myItem.Columns = cols;
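
A minimal sketch of the static-helper approach mentioned in the quoted thread below, so that only one place in your code touches the generated class. QueueItemStandIn, QueueItems, fromColumns and columnsOf are hypothetical names used for illustration; the stand-in holds a plain List<String> only so the sketch compiles without Avro on the classpath, whereas the real generated field is a GenericArray<Utf8>.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the generated com.dts.QueueItem class. The real
// class exposes a public GenericArray<Utf8> field named Columns; a plain
// List<String> is assumed here purely to keep the sketch Avro-free.
class QueueItemStandIn {
    public List<String> Columns;
}

// Hypothetical static helpers: the only code that knows about the public
// field, so callers never need put(int, Object) or its index.
final class QueueItems {
    private QueueItems() {}

    // Build an item from application data by setting the field directly.
    // null stays null, matching the ["null", array] union in the schema.
    static QueueItemStandIn fromColumns(List<String> columns) {
        QueueItemStandIn item = new QueueItemStandIn();
        item.Columns = (columns == null) ? null : new ArrayList<String>(columns);
        return item;
    }

    // Copy the columns back out; a null field is returned as an empty list.
    static List<String> columnsOf(QueueItemStandIn item) {
        return (item.Columns == null)
                ? new ArrayList<String>()
                : new ArrayList<String>(item.Columns);
    }
}

class QueueItemsDemo {
    public static void main(String[] args) {
        QueueItemStandIn item = QueueItems.fromColumns(Arrays.asList("a", "b"));
        System.out.println(QueueItems.columnsOf(item)); // prints [a, b]
    }
}
```

With the real generated class, the helper would presumably build a GenericData.Array<Utf8> from the strings and assign it to Columns; the rest of the code would stay the same.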



On Jun 15, 2010, at 7:54 PM, Bradford Stephens wrote:

> Ah, interesting.
> 
> Then, is there a way to avoid manually making the .put(int, object)
> call that relies on the magic number?
> 
> Or rather, what is the best practice for getting my Java object data
> into a generated Avro class so that it can be written?
> 
> -B
> 
> 
> 
> On Tue, Jun 15, 2010 at 7:44 PM, Scott Carey <scott@richrelevance.com> wrote:
>> This iteration of the Specific API simply has public fields that are
>> intended to be set directly.
>> 
>> The current best practice is to use wrapper classes or static helpers to
>> interact with the generated objects so that most of your code is abstracted
>> from the implementation details.
>> 
>> put(field, value) is there more for other internal Avro code than for
>> users -- specifically, it allows a ResolvingDecoder to automatically figure
>> out where the data goes if the reader's and writer's schemas differ.
>> 
>> Definitely do NOT depend on the 'magic number' in your code. We should
>> document that better. There is some discussion about the future of the
>> Specific API so that it can generate getters/setters and provide
>> user-controlled features -- potentially something like whether to use
>> String[] or List<String> or Utf8[], etc. to represent data in memory.
>> More suggestions on how to improve the API are welcome.
>> 
>> -Scott
>> 
>> On Jun 15, 2010, at 7:32 PM, Bradford Stephens wrote:
>> 
>>> Another thing to help me understand the Avro philosophy...
>>> 
>>> When doing, public void put(int field$, java.lang.Object value$)
>>> 
>>> Why is field an integer?
>>> 
>>> For instance, I have a String[] Column in my object. In protobuf, it
>>> would generate java methods like .putColumn(String[] item). Is there a
>>> reason avro can't do that? Or did I run the code generator in
>>> avro-tools wrong?
>>> 
>>> If that doesn't work, could we generate an enum of field names to pass
>>> in, instead? I don't like having to know "Magic Numbers" :)
>>> 
>>> Cheers,
>>> B
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Jun 15, 2010 at 7:26 PM, Bradford Stephens
>>> <bradfordstephens@gmail.com> wrote:
>>>> That's.... erm, kinda bizarre.
>>>> 
>>>> But hey, it works! Thanks!
>>>> 
>>>> 
>>>> 
>>>> On Tue, Jun 15, 2010 at 6:56 PM, Scott Carey <scott@richrelevance.com> wrote:
>>>>> Use GenericArray. The schema given to the generic array is not the
>>>>> schema of the member elements, but the actual array schema (yes, it is
>>>>> confusing).
>>>>> 
>>>>> new GenericData.Array<Utf8>(size, Schema.createArray(Schema.create(Type.STRING)));
>>>>> 
>>>>> It would be useful to be able to simply use Utf8[] or List<Utf8> for
>>>>> the Specific API, but at this time it leverages GenericData.
>>>>> 
>>>>> 
>>>>> On Jun 15, 2010, at 6:25 PM, Bradford Stephens wrote:
>>>>> 
>>>>>> That makes sense -- I'm getting errors during serialization, though.
>>>>>> 
>>>>>> I convert my List<String> to Utf8[].
>>>>>> 
>>>>>> I then do a QueueItem.put() and get "Exception in thread "main"
>>>>>> java.lang.ClassCastException: [Lorg.apache.avro.util.Utf8; cannot be
>>>>>> cast to org.apache.avro.generic.GenericArray"
>>>>>> 
>>>>>> How do I handle this Java->Avro interop? It seems to me that it should
>>>>>> be a lot simpler...
>>>>>> 
>>>>>> If I try to create a GenericArray<Utf8> and add items to that, it
>>>>>> complains that my schema doesn't look right...so that doesn't feel
>>>>>> like the right path.
>>>>>> 
>>>>>> My generated class looks like this:
>>>>>> 
>>>>>> @SuppressWarnings("all")
>>>>>> public class QueueItem extends
>>>>>> org.apache.avro.specific.SpecificRecordBase implements
>>>>>> org.apache.avro.specific.SpecificRecord {
>>>>>> public static final org.apache.avro.Schema SCHEMA$ =
>>>>>> org.apache.avro.Schema.parse("{\"type\":\"record\",\"name\":\"QueueItem\",\"namespace\":\"com.dts\",\"fields\":[{\"name\":\"Columns\",\"type\":[\"null\",{\"type\":\"array\",\"items\":\"string\"}]}]}");
>>>>>> 
>>>>>> public org.apache.avro.generic.GenericArray<org.apache.avro.util.Utf8>
>>>>>> Columns;
>>>>>> public org.apache.avro.Schema getSchema() { return SCHEMA$; }
>>>>>> public java.lang.Object get(int field$) {
>>>>>> 
>>>>>> 
>>>>>>   switch (field$) {
>>>>>>   case 0: return Columns;
>>>>>>   default: throw new org.apache.avro.AvroRuntimeException("Bad index");
>>>>>>   }
>>>>>> }
>>>>>> @SuppressWarnings(value="unchecked")
>>>>>> public void put(int field$, java.lang.Object value$) {
>>>>>>   switch (field$) {
>>>>>>   case 0: Columns =
>>>>>> (org.apache.avro.generic.GenericArray<org.apache.avro.util.Utf8>)value$;
>>>>>> break;
>>>>>>   default: throw new org.apache.avro.AvroRuntimeException("Bad index");
>>>>>>   }
>>>>>> }
>>>>>> }
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Tue, Jun 15, 2010 at 8:57 AM, Philip Zeyliger <philip@cloudera.com> wrote:
>>>>>>> Hi Bradford,
>>>>>>> I believe you use a SpecificDatumReader.  Something like:
>>>>>>> 
>>>>>>>  final static SpecificDatumReader<QueueItem> QUEUE_ITEM_READER = new
>>>>>>> SpecificDatumReader<QueueItem>(QueueItem.class);
>>>>>>>  QueueItem q = QUEUE_ITEM_READER.read(null, decoder);
>>>>>>> There doesn't seem to be a test that exercises this code path in an
>>>>>>> explanatory way, but java/src/java/org/apache/avro/ipc/Requestor.java
>>>>>>> uses something quite similar.
>>>>>>> -- Philip
>>>>>>> 
>>>>>>> On Mon, Jun 14, 2010 at 9:20 PM, Bradford Stephens
>>>>>>> <bradfordstephens@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Greetings,
>>>>>>>> 
>>>>>>>> I've poked around for examples of this, but I can't find any. I
>>>>>>>> imagine it's a fairly common use case.
>>>>>>>> 
>>>>>>>> I'm serializing some simple objects into bytes for placement onto
>>>>>>>> RabbitMQ. My Java class is pretty simple (but it'll grow in complexity
>>>>>>>> in time):
>>>>>>>> 
>>>>>>>> {
>>>>>>>> String[] Columns;
>>>>>>>> }
>>>>>>>> 
>>>>>>>> 
>>>>>>>> So, I made a .json schema containing this:
>>>>>>>> {
>>>>>>>>     "namespace": "com.dts",
>>>>>>>>     "name": "QueueItem",
>>>>>>>>     "type": "record",
>>>>>>>>     "fields": [
>>>>>>>>         {"name": "Columns", "type": ["null", {"type": "array",
>>>>>>>> "items":"string"}]}
>>>>>>>>     ]
>>>>>>>> }
>>>>>>>> 
>>>>>>>> 
>>>>>>>> And generated a java class ...
>>>>>>>> 
>>>>>>>> Now, I'm writing a test to serialize and deserialize some items. I can
>>>>>>>> figure out the serialization, but not deserialization back to objects.
>>>>>>>> 
>>>>>>>>       Schema s = Schema.parse(new File("queuetype.json"));
>>>>>>>> 
>>>>>>>>           ByteArrayOutputStream bao = new ByteArrayOutputStream();
>>>>>>>>           GenericDatumWriter w = new GenericDatumWriter(s);
>>>>>>>>           Encoder e = new BinaryEncoder(bao);
>>>>>>>>           e.init (bao);
>>>>>>>> 
>>>>>>>> 
>>>>>>>>           QueueItem r = new QueueItem();
>>>>>>>>           r.put(0, items);
>>>>>>>>           w.write(r, e);
>>>>>>>>           e.flush();
>>>>>>>> 
>>>>>>>>           ByteArrayInputStream is = new
>>>>>>>> ByteArrayInputStream(bao.toByteArray());
>>>>>>>>           DecoderFactory df = new DecoderFactory();
>>>>>>>>           Decoder d = df.createBinaryDecoder(is, null);
>>>>>>>> 
>>>>>>>>           QueueItem itemout = (QueueItem)  .....
>>>>>>>> 
>>>>>>>> 
>>>>>>>> And that's what I can't figure out -- how do I use a decoder method to
>>>>>>>> create QueueItems?
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> B
>>>>>>>> 
>>>>>>>> Bradford Stephens,
>>>>>>>> Founder, Drawn to Scale
>>>>>>>> drawntoscalehq.com
>>>>>>>> 727.697.7528
>>>>>>>> 
>>>>>>>> http://www.drawntoscalehq.com --  The intuitive, cloud-scale data
>>>>>>>> solution. Process, store, query, search, and serve all your data.
>>>>>>>> 
>>>>>>>> http://www.roadtofailure.com -- The Fringes of Scalability, Social
>>>>>>>> Media, and Computer Science
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>> 
>> 
> 
> 
> 

